azure: split slow linux builders #61370
cbaa6fb to 58025d1
Fixed the build failure.
I'm somewhat tempted to hold off on this if we can; it seems like it's going to be hard to manage what's split and what isn't over time. Additionally, in terms of resources, this is wasteful in the sense that the split builds have the same first halves.
I think we should probably first explore disabling assertions where possible and see if we can recover enough time that way. Distcheck, for example, has no need to turn on assertions, and the same goes for builders like cargo. I was hoping to look more into this today, but it's looking like I may not have enough time.
x86_64-gnu-aux:
  IMAGE: x86_64-gnu-aux
  SCRIPT: make check-aux EXCLUDE_CARGO=1

x86_64-gnu-cargo:
Did we forget to include this from before?
No, on Travis x86_64-gnu-cargo is included inside x86_64-gnu-aux. Instead on AppVeyor, to speed things up, cargo is tested inside its own job. I replicated the same thing we do on AppVeyor in Azure Linux.
Ok, let's wait on the results of disabling assertions. @rustbot modify labels: +S-blocked -S-waiting-on-review
Ok so to say again that I haven't forgotten about this:
I'll circle back tomorrow with timings after that all runs overnight.
OK scratch that. More spurious failures basically invalidated that run, so there's...
Ok, got some good builds. Although there are some outliers, we've got a good set of data: https://gist.github.com/alexcrichton/4350118b27fa2ae356cfde3d21f6e45d The build with assertions had to rebuild some containers (namely the huge Linux ones), so a few 6-hour builds are outliers. Relative times are relative to the first Travis/AppVeyor column. Some things:
As for builders affected by this PR specifically:
It looks like assertions make up the lion's share of time differences for these builders, except maybe the x86_64-gnu-nopt one, which may benefit from splitting. The conclusion I'm drawing here is that we should disable LLVM/debug assertions by default on CI, since we can't afford them and get such huge wins without them. We should still have a builder or two with them enabled, but perhaps not run the full test suite there and just a few things here and there (also across platforms, to cover OSX/Linux/MSVC at least). Additionally, we probably don't want to start the splitting as proposed in this PR just yet. I think disabling assertions gets us almost all the way there. The last major hurdle will be getting the dist builders under control, which I'll be taking a look into more shortly.
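As a rough illustration of the "disable assertions by default" idea, a matrix entry could opt out per builder along these lines. This is only a sketch: it reuses the x86_64-gnu-aux entry quoted in the review, and the two variable names (NO_LLVM_ASSERTIONS, NO_DEBUG_ASSERTIONS) are assumptions about what the CI scripts would honor, not something confirmed in this thread.

```yaml
# Sketch only. IMAGE and SCRIPT are copied from the entry quoted in the review;
# the two NO_*_ASSERTIONS variables are assumed names for the opt-out switches.
x86_64-gnu-aux:
  IMAGE: x86_64-gnu-aux
  SCRIPT: make check-aux EXCLUDE_CARGO=1
  NO_LLVM_ASSERTIONS: 1   # assumed: skip enabling LLVM assertions
  NO_DEBUG_ASSERTIONS: 1  # assumed: skip enabling Rust debug assertions
```

A builder or two would leave both variables unset so that assertions stay covered somewhere on CI.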
☔ The latest upstream changes (presumably #61772) made this pull request unmergeable. Please resolve the merge conflicts.
58025d1 to 199616f
Rebased, and also disabled Windows builders on auto and Windows and macOS builders on try.
I talked with @pietroalbini on Discord briefly about this, but I think we may not want to land this just yet. In our transition plan to Azure, I think we're just going to eat a regression in compile times until the 4-core machines come online, and we're transitioning everything to Azure, so speeding up some of the builders here won't translate to build time wins as long as at least one builder is lagging behind. I think the
Yeah, closing this.
This PR splits in two the 6 Linux builders that regularly approach or go over the 3-hour mark on Azure, using the make ci-subset-{1,2} already used by other builders on Windows.

r? @alexcrichton
cc #61185
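To illustrate the split, one builder could be divided into two matrix entries roughly as follows. This is a sketch under assumptions: the job names ending in -1 and -2 are hypothetical, x86_64-gnu-nopt is used only because the discussion names it as a candidate for splitting, and the entry layout mirrors the x86_64-gnu-aux entry quoted in the review rather than the actual diff.

```yaml
# Hypothetical sketch: the -1/-2 job names are assumptions, not taken from the PR diff.
x86_64-gnu-nopt-1:
  IMAGE: x86_64-gnu-nopt
  SCRIPT: make ci-subset-1   # first half of the test suite

x86_64-gnu-nopt-2:
  IMAGE: x86_64-gnu-nopt
  SCRIPT: make ci-subset-2   # second half of the test suite
```

Both jobs would use the same image and build the same compiler before running their half of the tests, which is the duplicated "first half" raised in the review.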