Rework pacman pkg cache and enable by default #51
Conversation
Overall this cache doesn't save much time, since installation and initial setup
take the most time, but this should save a lot of traffic for our main
repo server.
I believe this was the real motivation for caching since the beginning. The time reduction is not very meaningful for users, but the external traffic difference is significant.
And also this removes the cache option which was confusing some users because it
only caches packages and not everything.
I am concerned about how to handle this in the release. I wouldn't like to remove the cache option now (which would be a breaking change) if we expect to add it again in the future.
Apart from that, if I understand correctly, this caching procedure is similar to the one we have now (only packages are cached, not any installation or home directory). That is, users should find no difference between a clean install and this automated cache; the resulting packages are the same. What would happen if a job with update: true saves a cache that is later retrieved by a job with update: false? Is that an unsupported use case?
👍
I've left the option in the test workflows which just leads to
Exactly.
Good point. Assuming "foobar-1.1" gets installed and then cached, the second job would restore "foobar-1.1", install "foobar-1.0" (which gets downloaded and added to the cache), and then pruning would remove "foobar-1.0" again because it is older. So in a sense caching would be broken then. This could be worked around by keeping two versions of each package around when pruning, since only two versions are reachable (the one without update and the newest one), but I'm not sure that's worth it.
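For illustration, the pruning behaviour described above can be sketched in a few lines of shell. This is a minimal sketch over a mock cache directory; the `foobar` package and the keep-newest-only rule are illustrative, not the action's actual implementation:

```shell
set -e
# Mock cache directory with two versions of a hypothetical package,
# named in the usual pacman style: <name>-<ver>-<rel>-<arch>.pkg.tar.zst
cache=$(mktemp -d)
touch "$cache/foobar-1.0-1-x86_64.pkg.tar.zst" \
      "$cache/foobar-1.1-1-x86_64.pkg.tar.zst"

# Prune: version-sort each package's files and delete all but the newest.
# This is the step that removes "foobar-1.0" again in the scenario above.
ls "$cache"/foobar-*.pkg.tar.zst | sort -V | head -n -1 | xargs -r rm

ls "$cache"
```

The workaround mentioned above would amount to keeping the two newest files instead (`head -n -2`), similar in spirit to `paccache -rk2` from pacman-contrib.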
If you are unsure about the option being useful in the future, we can remove it, release v2.0.0 today and commit ourselves not to touch caching again in a while (unless some very obvious/drastic change happens in the ecosystem).
I would also be ok with removing the option.
I don't see this change breaking any existing uses, but I also don't mind a v2.0.0 release if you'd prefer one (also keeping the previous shell changes in mind).
(Short interjection: I could also just add "update" to the cache restore fallback key so it doesn't get reused in a non-update job; I didn't think of this.) I'll fix this tomorrow.
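A minimal sketch of that idea, assuming a hypothetical key format (the real action derives its key differently): fold the `update` input into the cache key so an update job's cache is never restored as a fallback by a non-update job.

```shell
set -e
# Hypothetical inputs; names and key layout are illustrative only
update=false
msystem=MINGW64
install="git base-devel"

# Hash the configuration and put the update flag in the key prefix, so
# update:true and update:false jobs occupy disjoint key namespaces.
hash=$(printf '%s %s' "$msystem" "$install" | sha256sum | cut -d' ' -f1)
key="msys2-pkgs-update:${update}-${hash}"
echo "$key"
```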
Hmmm, tricky. From the POV of a user, I would want stability despite it being rolling, so if the repo is in a good state I'd want no update. If things in the repo happened to be broken when the installer was updated, I'd want to update. (Also, I wouldn't want long updates for the system MSYS2 version.) I kind of tend towards what we have now and keeping things as is.
Yes, I was thinking that we have several non-breaking but significant changes. But you are correct that we can also bump to v1.2.0 if we are going to keep it.
Nice!
Gotcha.
Three main improvements:
* Create a separate cache for each input configuration: assuming a build matrix which builds for different arches and has different dependencies, we create a cache for each of them, while loading any of them.
* Prune caches before saving them: every time a package has a new version we create a new cache which only includes the new packages. This makes sure that the cache/restore size stays as small as possible over time.
* Avoid cache race conditions: if a run doesn't change the cache we won't save it, and if saving fails because another job created it in the meantime, we catch the error and ignore it.
Overall this cache doesn't save much time, since installation and initial setup take the most time, but it should save a lot of traffic for our main repo server. This also removes the cache option, which was confusing some users because it only caches packages and not everything.
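The "avoid cache race conditions" point can be illustrated with a small sketch (the `save_cache` function below is a stand-in, not the action's real code): a save failure caused by a concurrent job having already reserved the key is swallowed rather than failing the run.

```shell
# Stand-in for the real cache upload: here it always reports that
# another job already reserved the same key in the meantime.
save_cache() {
    echo "cache key already reserved by another job" >&2
    return 1
}

# Catch the failure and ignore it instead of failing the whole run.
if save_cache; then
    result="saved"
else
    result="skipped"
fi
echo "$result"
```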
Added this now. In theory we could now run all CI jobs in parallel. Edit: added that as a second commit, feedback welcome.
Start with one which installs some packages so we have some cache reuse later on. Then start all the rest.
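That ordering can be expressed with GitHub Actions' `needs:` keyword. A minimal sketch with hypothetical job names (not this repo's actual workflow):

```yaml
jobs:
  seed:    # hypothetical first job: installs packages, seeding the cache
    runs-on: windows-latest
    steps:
      - uses: msys2/setup-msys2@v2
        with:
          install: git
  build:   # the rest only start once the seeded cache exists
    needs: seed
    runs-on: windows-latest
    steps:
      - uses: msys2/setup-msys2@v2
```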
Execution time of the workflow was reduced from 20min to 7min. That's nice! I hope we won't have any conflicts unless GHA is used intensively in other repos too.