Feature Proposal: Where possible, make auto-rerun tests run on a different machine #5764

adamfarley · 2024-11-20T15:03:27Z

Summary
Proposal for the auto-rerun test feature to mandate (where possible) that the reruns happen on a different host.

Details
If a specific unit test failure is caused by something specific to a particular host, this change allows us to avoid that problem.

Statistics Summary
Tests which failed and reran on the same host: 84
...of which this many failed: 78
Tests which failed and reran on a different host: 173
...of which this many failed: 124

Statistics Source

RerunStatsByHost.groovy.txt

smlambert · 2024-11-20T21:09:38Z

For context, this issue came out of a discussion in the retrospective, where I suggested that we could grab the hostname of the machine where the initial run occurred and set ADDITIONAL_LABEL=!hostname. Before making such a change, though, we could look at some of the metrics around auto_reruns in TRSS (related: #5121) to see whether the change is actually needed, whether many of the auto_reruns already land on different machines naturally, and whether the intermittent failures we see at the project are less likely to have machine-related causes.
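
As a rough illustration of the ADDITIONAL_LABEL=!hostname idea, here is a minimal Python sketch. It is not how the pipeline is actually implemented: the function name `rerun_parameters`, the parameter dictionary, and the example host and label names are hypothetical, and joining an existing label with `&&` assumes Jenkins-style label expressions.

```python
from typing import Dict, Optional


def rerun_parameters(initial_host: Optional[str],
                     base_params: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of base_params whose ADDITIONAL_LABEL excludes initial_host.

    If the initial host is unknown, the parameters are returned unchanged,
    so the rerun can still be scheduled on any machine.
    """
    params = dict(base_params)
    if initial_host:
        exclusion = f"!{initial_host}"
        existing = params.get("ADDITIONAL_LABEL", "").strip()
        # Assumption: an existing label is combined with the exclusion via '&&'.
        params["ADDITIONAL_LABEL"] = (
            f"{existing}&&{exclusion}" if existing else exclusion
        )
    return params


# Hypothetical host name, for illustration only.
print(rerun_parameters("example-host-1", {}))
```

Whether the scheduler falls back to the original machine when no other host matches is a separate question that this sketch does not address.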

adamfarley · 2024-11-21T15:22:58Z

I've written a program to provide some numbers for/against this proposal.

It looks at the last 10 pipelines per LTS version, identifies all rerun tests per build, and compares the host names (and also logs the pass/fail of the rerun).

The program is taking a while to run, but I can see the progress it's making and will update this issue with the results in a minute.

Here is the source:
RerunStatsByHost.groovy.txt
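
The attached Groovy script is not reproduced here. As a hedged illustration of the comparison it is described as doing (same host versus different host, plus the pass/fail of the rerun), the tallying step could look roughly like the Python sketch below. The `Rerun` record, its field names, and the `tally` function are assumptions made for this example, not the script's actual structure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Rerun:
    initial_host: str   # host where the first (failing) run happened
    rerun_host: str     # host the auto-rerun landed on
    rerun_passed: bool  # outcome of the rerun


def tally(reruns: Iterable[Rerun]) -> Dict[str, Dict[str, int]]:
    """Count reruns and failed reruns, split by same vs. different host."""
    counts = {
        "same": {"total": 0, "failed": 0},
        "different": {"total": 0, "failed": 0},
    }
    for r in reruns:
        bucket = "same" if r.initial_host == r.rerun_host else "different"
        counts[bucket]["total"] += 1
        if not r.rerun_passed:
            counts[bucket]["failed"] += 1
    return counts


# Made-up records, for illustration only.
sample = [
    Rerun("host-a", "host-a", False),
    Rerun("host-a", "host-b", True),
    Rerun("host-c", "host-b", False),
]
print(tally(sample))
```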

And here is the output:

Tests which failed and reran on the same host: 84
...of which this many failed: 78
Tests which failed and reran on a different host: 173
...of which this many failed: 124

So the percentages (roughly 93% of same-host reruns failed, versus roughly 72% of different-host reruns) seem to indicate that a different host is best.
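
For reference, the failure rates behind that comparison can be worked out from the four counts above. This is plain arithmetic in Python. The helper name `failure_rate` is just for this example, and the calculation does not control for which tests or platforms were involved.

```python
def failure_rate(failed: int, total: int) -> float:
    """Fraction of reruns that failed."""
    return failed / total


same_host = failure_rate(78, 84)        # reran on the same host
different_host = failure_rate(124, 173) # reran on a different host

print(f"same host:      {same_host:.1%}")       # 92.9%
print(f"different host: {different_host:.1%}")  # 71.7%
print(f"gap: {(same_host - different_host) * 100:.1f} percentage points")  # 21.2
```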

adamfarley · 2024-11-21T16:51:30Z

Please add me to this task as an assignee, and change the project to the Q4 one.

Also, I'll be in training tomorrow, so others can feel free to add their names too for further discussion and/or PR creation.

Ta very much. :)

smlambert · 2024-11-21T22:05:18Z

Great :)

One other dimension to this is that we have now enabled taking 'problem machines' offline if a certain type of failure occurs, so such a machine would not be available to send the rerun job to. It'd be good to look at the reruns that were sent to the same machine and failed, to see the nature of those failures (would they now trigger taking those machines offline?).

github-project-automation bot added this to Adoptium Backlog Nov 20, 2024

github-project-automation bot moved this to Todo in Adoptium Backlog Nov 20, 2024

adamfarley changed the title ~~Feature Proposal: Where possible, make auto-rerun tests run on a different machine~~ WIP: Feature Proposal: Where possible, make auto-rerun tests run on a different machine Nov 20, 2024

adamfarley changed the title ~~WIP: Feature Proposal: Where possible, make auto-rerun tests run on a different machine~~ Feature Proposal: Where possible, make auto-rerun tests run on a different machine Nov 21, 2024

adamfarley moved this from Todo to In Progress in Adoptium Backlog Nov 21, 2024

smlambert assigned adamfarley Nov 21, 2024

smlambert added this to 2024 4Q Adoptium Plan Nov 21, 2024

smlambert moved this to In Progress in 2024 4Q Adoptium Plan Nov 21, 2024

smlambert removed this from Adoptium Backlog Nov 21, 2024
