feat(pkg): Enable cache on fetch actions #10850
Conversation
Force-pushed from b435efa to 56a8b12
Force-pushed from 56a8b12 to dc495fa
Force-pushed from da2366e to 8045539
src/dune_rules/fetch_rules.ml (Outdated)
- | Some (_, checksum) -> `Checksum checksum
- | None -> `Url url
+ | Some (_, checksum) -> `Checksum checksum, Some true
+ | None -> `Url url, None
Why do we rely on the default in this case? Either this is useful to cache or it's not.
I would say it's probably useful to cache since this step will de-duplicate the files being stored among all the workspaces. If you think this will consume too much space, then just set it to false.
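The shape of the change under discussion can be sketched as follows; this is an illustrative simplification, not the actual code in fetch_rules.ml, and the function and constructor names are hypothetical:

```ocaml
(* A fetch keyed by a checksum is content-addressed, so it is presumably
   safe to share in the cache; a fetch keyed only by its URL gets no
   explicit flag and falls back to the default behavior. *)
type key = Checksum of string | Url of string

let target_and_cache ~url ~checksum =
  match checksum with
  | Some checksum -> (Checksum checksum, Some true) (* content-addressed: cache *)
  | None -> (Url url, None)                         (* rely on the default *)
```

The `Some true`/`None` distinction is what the review comment questions: the `None` branch defers to whatever the engine's default cache policy is.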
I am concerned about caching it via just the URL because there's nothing preventing the content behind the URL from changing, so caching a source that keeps changing seems potentially incorrect. I wanted to let people override that with DUNE_CACHE_RULES=enabled, which enables all these potentially unsafe behaviors, instead of introducing another config option.
For package management it doesn't matter since just about all packages have checksums.
If you think this is alright to cache in all cases, I will gladly switch it to always `true`.
> I am concerned about caching it via just the URL because there's nothing else preventing the URL from changing, so caching a source that keeps changing seems potentially incorrect
Yeah, that's an issue. Why is this related to the shared cache though? Is it because you want `dune clean` to re-download?
> I wanted to let people override that with DUNE_CACHE_RULES=enabled which enables all these potentially unsafe behaviors instead of introducing another config option.
The `DUNE_CACHE_RULES` is mostly there for backwards compat. I don't think we should rely on it for future rules. When rules are useful and safe to cache we should do it; otherwise, we disable it.
> If you think this is alright to cache in all cases, I will gladly switch it to always true.
As mentioned earlier, I think it has the potential to save users a bunch of space if they are expanding the same source that comes from a git revision in multiple workspaces.
I think using either `true` or `false` is fine, but I don't like letting the user fiddle with `DUNE_CACHE_RULES`.
> Yeah, that's an issue. Why is this related to the shared cache though?
My understanding is that if the URL changes upstream, we would continue to use the old data from the shared cache, with no easy way to even invalidate that particular URL. This happens quite a lot when people depend on GitHub tarballs, which get regenerated in certain cases.
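The staleness concern can be made concrete with a small sketch of how a cache key might be derived; the names here are illustrative and do not correspond to dune's actual cache-key computation:

```ocaml
(* When a checksum is available, new upstream content implies a new
   checksum and therefore a new cache key. When the key is derived from
   the URL alone, it stays the same even if the server starts returning
   different bytes, so the cache would keep serving the old data. *)
let cache_key ~url ~checksum =
  match checksum with
  | Some c -> Digest.string ("checksum:" ^ c)  (* new content -> new key *)
  | None -> Digest.string ("url:" ^ url)       (* same key forever *)
```

This is why URL-only fetches are the risky case: regenerated GitHub tarballs change the bytes without changing the key.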
> The `DUNE_CACHE_RULES` is mostly there for backwards compat. I don't think we should rely on it for future rules. When rules are useful and safe to cache we should do it; otherwise, we disable it.
I see, yes. That makes sense.
This enables caching of downloaded files as part of the fetch action, if the file to be retrieved is retrieved via a cryptographic hash (and thus presumably the exact same as the first time it was downloaded). Signed-off-by: Marek Kubica <marek@tarides.com>
Force-pushed from 5b24798 to 59ed4a7
> This enables caching of downloaded files as part of the fetch action, if the file to be retrieved is retrieved via a cryptographic hash (and thus presumably the exact same as the first time it was downloaded).
@rgrinberg I've added a new function to the engine to force-set the cache flag, as `add_can_go_in_shared_cache` can only be flipped to `false` but never back to `true`. It's not used at the moment, so another option would be to remove that `&&`, but just in case I made it a separate function.
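The distinction being described can be sketched as follows; the record and function names here mirror the comment but are a hypothetical simplification, not dune's engine API:

```ocaml
(* [add_can_go_in_shared_cache] combines the new value with (&&), so the
   flag is monotonic: once false, it can never become true again. The
   new force-set function overwrites it unconditionally. *)
type rule = { mutable can_go_in_shared_cache : bool }

let add_can_go_in_shared_cache rule flag =
  (* monotonic: false && _ = false *)
  rule.can_go_in_shared_cache <- rule.can_go_in_shared_cache && flag

let set_can_go_in_shared_cache rule flag =
  (* unconditional overwrite: can flip false back to true *)
  rule.can_go_in_shared_cache <- flag
```

Dropping the `&&` would make the first function behave like the second, which is why keeping them separate preserves the monotonic behavior for existing callers.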