We are processing the data Marathon gathers from test executions and want to visualize it to analyze how we are performing.
For test duration this works great, but I'm struggling to understand how flakiness is calculated. If I'm not mistaken, it is only possible to calculate the failure rate over a given period of time, and there is no way to distinguish between flakiness and (expected) failures, right? For instance, if I change the production code but do not adapt the test, the test is not flaky: it is expected to fail. However, for a given state (a commit SHA, for instance), multiple different results do mean flakiness. As far as I can tell this statistic can't be obtained, since Marathon does not associate git-related information with the test execution.
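To illustrate the distinction I mean: if results were keyed by commit SHA, mixed outcomes at the same SHA would indicate flakiness, while consistent failures would indicate an expected failure. A minimal sketch of that classification, assuming hypothetical `(test, sha, passed)` records exported from a result store (Marathon itself does not emit the SHA):

```python
from collections import defaultdict

# Hypothetical records: (test_name, sha, passed), joined with git metadata
# externally, since Marathon does not associate VCS info with executions.
results = [
    ("LoginTest", "abc123", True),
    ("LoginTest", "abc123", False),     # pass + fail at the same sha -> flaky
    ("CheckoutTest", "abc123", False),
    ("CheckoutTest", "abc123", False),  # fails consistently -> expected failure
]

def classify(results):
    """Group outcomes per (test, sha); mixed outcomes indicate flakiness."""
    outcomes = defaultdict(set)
    for test, sha, passed in results:
        outcomes[(test, sha)].add(passed)
    return {
        key: "flaky" if len(seen) > 1
        else ("passing" if True in seen else "failing")
        for key, seen in outcomes.items()
    }

print(classify(results))
# {('LoginTest', 'abc123'): 'flaky', ('CheckoutTest', 'abc123'): 'failing'}
```

A plain failure rate over a time window cannot make this distinction, because it mixes results from different SHAs together.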
This also makes me question how the flakiness strategy is implemented: does it mitigate flakiness based on the failure rate rather than on actual flakiness?
Maybe my conclusions are wrong, so any extra information I may have missed is more than welcome.
Your assumption is correct: Marathon doesn't integrate with any VCS, including git, so it doesn't associate git-related metadata out of the box.
To understand how the flakiness strategy is implemented, I think it's best to look at the code, since it's the single source of truth. Happy to answer any specific questions, though.
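For readers following along: the general idea behind failure-rate-based mitigation is to budget extra runs up front from a test's historical per-run success rate, so that the probability of at least one pass reaches a desired threshold. A minimal sketch of that idea (names, thresholds, and the cap are hypothetical, not Marathon's actual implementation):

```python
def runs_needed(success_rate: float, desired: float = 0.99, cap: int = 5) -> int:
    """Smallest number of scheduled attempts such that the probability of
    at least one pass, 1 - (1 - success_rate) ** n, reaches `desired`.
    `cap` bounds the retry budget for very unreliable tests."""
    failure = 1.0 - success_rate
    runs = 1
    while 1.0 - failure ** runs < desired and runs < cap:
        runs += 1
    return runs

# A test passing 70% of the time needs 4 attempts to reach 99% confidence:
# 1 - 0.3 ** 4 = 0.9919
print(runs_needed(0.7))   # 4
print(runs_needed(0.99))  # 1
```

This also shows why such a strategy keys off the failure rate rather than "actual" flakiness: without per-SHA data, the historical rate is the only signal available.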