Use pipeline caching feature of AzDo in CI #59598

Open
kunalspathak opened this issue Sep 25, 2021 · 7 comments
Labels
area-Infrastructure design-discussion Ongoing discussion about design without consensus
Comments

@kunalspathak
Member

Pipeline caching sounds like a powerful feature: it lets us cache built artifacts and reuse them in other runs, provided certain criteria match. I can think of at least a few uses for it:

  • If we trigger multiple pipeline runs (e.g. jitstress, jitstressregs, runtime) on the same commit, one pipeline can perform all the needed builds and cache them against the commit. The other pipelines can then simply download the already-built artifacts, saving a lot of machine time and freeing up resources that would otherwise repeat the same build over and over.
  • We could also cache the NuGet packages we download against some logical key (index, library hash, NuGet package versions, dotnet version, etc.) and then use that cache during NuGet restore. IMO the download will be faster because the contents will come from internal Azure storage rather than the internet.
  • Cache the libraries built against the commit that last touched the libraries folder. Someone working on CoreCLR changes who needs a CI run won't have to build the libraries, because we already have them cached; they can simply be downloaded and used. Vice versa for caching CoreCLR builds.
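
As a rough illustration of the "logical key" idea in the second bullet, here is a minimal Python sketch that derives a restore cache key from the contents of the files that pin package versions plus the SDK version. The function name, the key layout, and the example file names are assumptions made for illustration; nothing here is taken from the repository's actual pipeline definitions.

```python
import hashlib


def restore_cache_key(pinned_files, sdk_version, prefix="nuget"):
    """Return a key that changes whenever any restore input changes.

    pinned_files maps a repo-relative path to that file's bytes
    (hypothetical inputs, e.g. the files that pin package versions).
    """
    digest = hashlib.sha256()
    # Sort by path so the key does not depend on dict ordering.
    for path in sorted(pinned_files):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(pinned_files[path])
        digest.update(b"\0")
    # A short hash prefix keeps the key readable in pipeline logs.
    return f"{prefix}|{sdk_version}|{digest.hexdigest()[:16]}"


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    files = {
        "eng/Versions.props": b"<Project />",
        "NuGet.config": b"<configuration />",
    }
    print(restore_cache_key(files, "6.0.100"))
```

Two runs share a cache entry only when every pinned file and the SDK version are byte-for-byte identical, which trades some cache hits for never restoring from a stale package set.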
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Sep 25, 2021
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@kunalspathak
Member Author

@jkotas

@ghost

ghost commented Sep 25, 2021

Tagging subscribers to this area: @dotnet/runtime-infrastructure
See info in area-owners.md if you want to be subscribed.


@hoyosjs
Member

hoyosjs commented Sep 25, 2021

This is an interesting idea overall.

  • Cache the libraries built against the commit that last touched the libraries folder. Someone working on CoreCLR changes who needs a CI run won't have to build the libraries, because we already have them cached; they can simply be downloaded and used. Vice versa for caching CoreCLR builds.

This one is hard to get right. I'd say use it for PRs at most: let CI be a sanity test, and official builds should never touch the cache. For the cache key you'd need to consider the eng folder (easier than tracking exactly versions.props and eng/common), the targets/props files in the root, and global.json. Also, some portions of SPC (System.Private.CoreLib) live in coreclr. I don't think anything else ties them together (I don't know about mono).
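
The inputs listed above can be sketched as a path filter that decides whether a set of changed files still allows reusing a cached libraries build. This is only a sketch under assumptions: the directory prefixes (including the location of the System.Private.CoreLib sources under coreclr) and both function names are illustrative, not taken from the actual build scripts.

```python
from pathlib import PurePosixPath

# Assumed inputs to a libraries build, following the comment above.
LIBRARIES_INPUT_DIRS = (
    "src/libraries/",
    "eng/",
    "src/coreclr/System.Private.CoreLib/",  # assumed location of the SPC parts in coreclr
)
ROOT_INPUT_SUFFIXES = (".props", ".targets")


def invalidates_libraries_cache(changed_path):
    """Return True if a change to this repo-relative path should drop the cache."""
    path = PurePosixPath(changed_path)
    posix = path.as_posix()
    if posix == "global.json":
        return True
    if any(posix.startswith(prefix) for prefix in LIBRARIES_INPUT_DIRS):
        return True
    # Only props/targets files directly in the repo root count as inputs.
    return len(path.parts) == 1 and path.suffix in ROOT_INPUT_SUFFIXES


def can_reuse_libraries_build(changed_paths):
    """A cached libraries build is reusable only if no input path changed."""
    return not any(invalidates_libraries_cache(p) for p in changed_paths)
```

A PR that only touches JIT sources would pass this filter, while one that also edits global.json would not. Treating the whole eng folder as an input is deliberately conservative, in line with the suggestion that this is easier than tracking individual files.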

Also, maybe this is something we should consider as an experiment: I wonder whether this will have a big effect outside a PR's builds, since the commit rate to each folder on main is non-trivial and would often invalidate the proposed cache policies. Within a PR, however, this could bring big savings: it avoids a lot of build overlap between pipelines and repeated restores across iterations of a given PR. That's why I think bullets 1 and 2 might represent big wins.

@steveisok
Member

/cc @akoeplinger @directhex

@safern
Member

safern commented Sep 27, 2021

Also, maybe this is something we should consider as an experiment: I wonder whether this will have a big effect outside a PR's builds, since the commit rate to each folder on main is non-trivial and would often invalidate the proposed cache policies. Within a PR, however, this could bring big savings: it avoids a lot of build overlap between pipelines and repeated restores across iterations of a given PR. That's why I think bullets 1 and 2 might represent big wins.

Maybe we could start by caching the dotnet SDK we use, since it mostly changes only once a month. It wouldn't save much time, but I think it would help reliability, as we hit a lot of errors when downloading the SDK.
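
One way to picture an SDK cache that is invalidated only when the pinned SDK changes is to key it on the version recorded in global.json. The sketch below assumes the version lives under the standard `sdk.version` field; the function name and the key layout are hypothetical, and the repository may pin its SDK in a different field.

```python
import json


def sdk_cache_key(global_json_text, os_name, arch):
    """Build an SDK cache key from the pinned version plus the target platform."""
    config = json.loads(global_json_text)
    # Assumption: the pinned SDK version is stored under "sdk" -> "version".
    version = config["sdk"]["version"]
    return f"dotnet-sdk|{os_name}|{arch}|{version}"
```

Because the key includes the OS and architecture, each agent flavor gets its own entry, and all entries roll over together on the roughly monthly SDK bump.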

@PathogenDavid
Contributor

  • Cache the libraries built against the commit that last touched the libraries folder. Someone working on CoreCLR changes who needs a CI run won't have to build the libraries, because we already have them cached; they can simply be downloaded and used. Vice versa for caching CoreCLR builds.

As @hoyosjs noted, this one is hard to get right. A nice compromise I've had a lot of success with is using Mozilla sccache for my builds and caching its database between runs. It took my CI builds of libclang down from 2.5-3 hours to 15-20 minutes when there are few or no actual changes.

(sccache is pretty simple to use, but here's my workflow if you want to see it used in practice. It's GitHub Actions instead of Azure DevOps, but in my experience they're pretty similar since one's a fork of the other. build-native.cmd/sh is what configures its use with CMake.)

@ericstj ericstj added design-discussion Ongoing discussion about design without consensus and removed untriaged New issue has not been triaged by the area owner labels Sep 29, 2021
@ericstj ericstj added this to the 7.0.0 milestone Sep 29, 2021
@hoyosjs hoyosjs modified the milestones: 7.0.0, 8.0.0 Aug 10, 2022
@agocke agocke modified the milestones: 8.0.0, Future Sep 5, 2023
7 participants