Feature Proposal: Where possible, make auto-rerun tests run on a different machine #5764
Comments
For context, this issue came out of a discussion in the retrospective, where I suggested that we could grab the hostname of the machine where the initial run occurred and set ADDITIONAL_LABEL=!hostname. Before making such a change, though, we should look at some of the auto_rerun metrics in TRSS (related: #5121) to see whether the change is actually needed: many of the auto_reruns may already land on different machines naturally, or the intermittent failures we see at the project may be unlikely to have machine-related causes.
I've written a program to provide some numbers for or against this proposal. It looks at the last 10 pipelines per LTS version, identifies all rerun tests per build, and compares the host names of the initial run and the rerun (it also logs whether the rerun passed or failed). The program is taking a while to run, but I can see the progress it's making and will update this issue with the results in a minute.
Here is the source:
And here is the output:
Tests which failed and reran on the same host: 84
So the percentages seem to indicate that a different host is best.
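The comparison described in this comment can be sketched roughly as below. This is not the attached Groovy program; it is a minimal Python illustration, and the `Rerun` record, its field names, and the sample data are all hypothetical stand-ins for whatever the real program pulls from the build results.

```python
from collections import Counter
from typing import NamedTuple


class Rerun(NamedTuple):
    """Hypothetical record for one auto-rerun of a failed test target."""
    test_name: str
    first_host: str    # machine where the initial (failed) run happened
    rerun_host: str    # machine where the rerun happened
    rerun_passed: bool


def tally_reruns(reruns):
    """Count reruns and repeat failures, split by same vs. different host."""
    counts = Counter()
    for r in reruns:
        bucket = "same_host" if r.first_host == r.rerun_host else "different_host"
        counts[bucket + "_total"] += 1
        if not r.rerun_passed:
            counts[bucket + "_failed"] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    Rerun("test_a", "host-1", "host-1", False),
    Rerun("test_b", "host-1", "host-2", True),
    Rerun("test_c", "host-3", "host-2", False),
]
counts = tally_reruns(sample)
print(dict(counts))
```

Keeping the totals and the repeat failures in separate counters makes it straightforward to turn the tallies into per-bucket failure rates afterwards.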
Please add me to this task as an assignee, and change the project to the Q4 one. Also, I'll be training tomorrow, so others can feel free to add their names too for further discussion and/or PR creation. Ta very much. :)
Great :) One other dimension to this is that we have now enabled taking 'problem machines' offline if a certain type of failure occurs, so such a machine would not be available to send the rerun job to. It'd be good to look at the reruns that were sent to the same machine and failed again, to see the nature of those failures (would they now trigger taking those machines offline?).
Summary
Proposal for the auto-rerun test feature to mandate (where possible) that the reruns happen on a different host.
Details
If a test failure is caused by something specific to a particular host, rerunning the test on a different host avoids that cause.
Statistics Summary
Tests which failed and reran on the same host: 84
...of which this many failed: 78
Tests which failed and reran on a different host: 173
...of which this many failed: 124
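To make the comparison between the two groups explicit, the counts above can be turned into repeat-failure rates. The snippet below is a small illustrative calculation using only the four numbers listed in this summary; the variable and function names are my own.

```python
# Counts copied from the Statistics Summary above.
same_host_reruns = 84
same_host_failures = 78
diff_host_reruns = 173
diff_host_failures = 124


def failure_rate(failed, total):
    """Return the percentage of reruns that failed again."""
    return 100.0 * failed / total


same_rate = failure_rate(same_host_failures, same_host_reruns)
diff_rate = failure_rate(diff_host_failures, diff_host_reruns)

print(f"Same host:      {same_rate:.1f}% of reruns failed again")
print(f"Different host: {diff_rate:.1f}% of reruns failed again")
```

With these counts, roughly 92.9% of same-host reruns failed again, versus roughly 71.7% of reruns that landed on a different host.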
Statistics Source
RerunStatsByHost.groovy.txt