Move all of our pipelines to use the global-build-job to build CoreCLR, Mono, and Libraries assets #99179
…y properties from various AzDO scripts
…t our GCC leg uses the global build pipeline and our CI images automatically select the correct compiler)
…e and remove parameters that were only used in these scenarios
…llow variables here.
…obal build today are set in both places.
…(one product-build/test-build/test-run set of jobs) runtime test pipelines to use the global build approach.
…job + extra steps instead of using separate jobs.
…t. Unify all python usages to use the same python variable template (and all use venv as well)
…tes with helix execution in-job
@DrewScoggins the failures in the perf legs look like restore failures in the benchmarks, not infrastructure issues. Can you take a look?
I presume you are going to revert your change to jit.h once your testing is done? (It looks like that's also causing a formatting job failure.)
It's basically impossible to review something this massive by sight, but I scanned through various changes, so LGTM.
Yep! Do you want me to keep the JIT-EE GUID change or should I revert that one before I merge this in?
You should revert it. It was just to avoid trashing our collections when you were testing.
/ba-g Timeouts in interpreter tests are known
LGTM I think. There might be more work to optimize things, but I think that can come later.
@jkoritzinsky thank you so much for doing this work! 🎉
The most recent commit in #99841 didn't run any lanes. Is that potentially related to these yml changes?
@kg it looks like it's running on AzDO but didn't post back to GitHub: https://dev.azure.com/dnceng-public/public/_build/results?buildId=605726&view=results
@jkoritzinsky Looks like libraries-jitstress is broken: https://dev.azure.com/dnceng-public/public/_build/results?buildId=605612&view=results
@BruceForstall fix for that family of pipelines at #99868
Looks like https://dev.azure.com/dnceng-public/public/_build?definitionId=108&_a=summary
I've fixed that one as well in #99868, as I already had other fixes to make there.
This seems to be a side effect of dotnet#99179. Arcade checks whether `disableComponentGovernance` is the empty string to apply the default skipping logic: https://github.com/dotnet/arcade/blob/ace00d8719b8d1fdfd0cc05f71bb9af216338d27/eng/common/templates/job/job.yml#L168-L174 Changed our templates to make sure we pass that correctly.
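The empty-string check described above can be modelled with a short Python sketch. It is only an illustration: the function name and the `default_skip` argument are hypothetical, and Arcade's actual default conditions are not reproduced here. It shows why a template that forwards any non-empty value ends up bypassing the default logic.

```python
def skip_component_governance(disable_component_governance: str,
                              default_skip: bool) -> bool:
    """Hypothetical model: decide whether component governance is skipped.

    An empty string means "the caller expressed no preference", so the
    default skipping logic (passed in here as default_skip) applies.
    Any non-empty value is treated as an explicit choice.
    """
    if disable_component_governance == "":
        return default_skip
    return disable_component_governance.strip().lower() == "true"


# Passing "" keeps the default; passing "false" silently overrides it.
keeps_default = skip_component_governance("", default_skip=True)
overrides_default = skip_component_governance("false", default_skip=True)
```

Under this model, a wrapper template that substitutes its own non-empty value for an unset parameter would change behavior even though no caller asked for it, which is consistent with the kind of side effect described above.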
@jkoritzinsky The
This is a pipeline that depends on both the release and checked jitrollingbuild. I remember there was some special YML to do that; maybe it got lost?
dotnet#99179 made some changes and switched the template the workloads build was using. In the process, workloadArtifactsPath and workloadPackagesPath got dropped, which meant the workloads build did not pick up any manifests to process. As a result, the build failed.
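A minimal sketch of a guard against this kind of silent drop, assuming the parameters reach the build as a simple name-to-value mapping. Only the two parameter names come from the description above; the helper, the mapping shape, and the example paths are hypothetical.

```python
# Parameter names taken from the description above; everything else is illustrative.
REQUIRED_WORKLOAD_PARAMETERS = ("workloadArtifactsPath", "workloadPackagesPath")


def missing_workload_parameters(parameters: dict) -> list:
    """Return the required workload parameters that are absent or empty."""
    return [name for name in REQUIRED_WORKLOAD_PARAMETERS
            if not parameters.get(name)]


# A template switch that drops both parameters is reported immediately
# instead of surfacing later as "no manifests to process".
dropped = missing_workload_parameters({})
complete = missing_workload_parameters({
    "workloadArtifactsPath": "artifacts/workloads",   # hypothetical path
    "workloadPackagesPath": "artifacts/packages",     # hypothetical path
})
```

Checking for the inputs up front turns a confusing downstream failure into an explicit message naming the parameter that was lost.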
When we first consolidated dotnet/coreclr, dotnet/corefx, dotnet/core-setup, and mono/mono in .NET 5, we brought in separate product build, test build, and test run pipeline jobs, along with a complex matrix of how they interacted. Before consolidation, we had been seeing increased build throughput and reliability from having separate jobs, which ensured we built each asset exactly once.
Over time, we've removed a few of these jobs and allowed double-building to reduce dependencies:
Libraries removed their separate "build test" job.
The CoreCLR/Mono "build test" job now builds the assets it needs directly instead of depending on other jobs.
The Libraries job now builds System.Private.CoreLib in-job (and can now even build just the ref-assembly for CoreLib).
Additionally, we've moved many jobs to build the product (and sometimes test) in one job using the "Global build" template:
All Mono devices jobs use the "global build" to build the product (and build runtime tests if needed)
All NativeAOT jobs use the "global build" template
Nearly all community-supported legs build using the "global build" template.
A few outerloop pipelines use the "global build" template for simplicity.
Today, our pipelines are split between using the "Global build" template that directly invokes the root build script to build the product in one command and the separate "build job" templates for building specific subsets (CoreCLR, Libraries, Mono, Installer) separately.
This has added significant complexity in many of our YAML scripts to support both cases and confusion about when to use one scenario or the other. For example, some jobs in the "runtime" pipeline use the global build job template to build the product and run tests whereas others don't.
Recently, we moved the official build to use the "global build" template for every job as a) we needed to move in this direction for the VMR and b) we believed that due to job queue time and upload/download time, we'd save on build execution time.
We saw decent improvements, as well as a significant reliability boost by merging these jobs together in the official build.
This PR takes us the next step: Merge all CoreCLR, Mono, and Libraries product build steps (and installer build steps when we aren't running tests) into single jobs instead of being split into 2 or 3 jobs (as all of the Mono runs also required some components of the corresponding CoreCLR build).
In addition, to avoid ending up in a split situation where many of our outerloop legs are still on the older system while PR and official builds are all on the new system, I have moved every pipeline to the "global build" model, not just the PR pipeline.
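To make the effect of merging jobs concrete, here is a small Python sketch that compares the longest dependency chain before and after. The job names and edges are assumptions chosen for illustration, not the actual pipeline definitions; the point is that every hop in a chain adds queue time plus artifact upload and download.

```python
def chain_depth(graph: dict) -> int:
    """Length, in jobs, of the longest dependency chain in the graph.

    graph maps each job name to the list of jobs it depends on.
    """
    memo = {}

    def depth(job: str) -> int:
        if job not in memo:
            memo[job] = 1 + max((depth(dep) for dep in graph[job]), default=0)
        return memo[job]

    return max((depth(job) for job in graph), default=0)


# Hypothetical "split" layout: the installer waits on two product builds.
split_jobs = {
    "coreclr_product_build": [],
    "libraries_product_build": [],
    "installer_build": ["coreclr_product_build", "libraries_product_build"],
}

# Hypothetical "global build" layout: one job builds everything.
global_build = {"global_build": []}

split_depth = chain_depth(split_jobs)
merged_depth = chain_depth(global_build)
```

In this toy model the split layout needs two jobs in sequence while the merged layout needs one, at the cost of some assets being built more than once across different global build jobs.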
Job Dependency Graph Changes
CoreCLR + Libraries + Installer (w/o Installer Tests)
CoreCLR + Libraries + Installer w/ Tests
Mono Runtime Tests
YAML Template Flow Changes
In addition to the job simplifications above, there is also some YAML template simplification. Besides removing the `build-coreclr-and-libraries-job` template, I consolidated the general flow of the JIT outerloop pipelines for CoreCLR tests into a single pipeline template, which makes them simpler to define. Additionally, I moved all outerloop pipelines that run libraries tests to use the global build job with the "Send to Helix" extra step instead of shuffling files between jobs.
Q&A
For validation, I'm happy to run any requested pipelines on this PR. I obviously can't run them all, but I'm happy to run a good spread to validate that I didn't break anything (I've been running various pipelines throughout working on this to validate the experience as I went).