Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support if cache has a mixture of depot and non-depot includes #52346

Merged

Conversation

IanButterworth
Copy link
Member

AWS.jl 's @service macro is designed to be used within a package like

using AWS: @service
@service S3

which adds includes to files within another package.

If the package that macro is called in is dev-ed and not in the @depot during code load after precompilation it will falsely report that it's not a valid cache causing repeated precompilation. This fixes that issue.

@KristofferC
Copy link
Member

What's the concrete effect of this; does this turn off the relocatability of that AWS included file?

@IanButterworth
Copy link
Member Author

IanButterworth commented Nov 29, 2023

IIUC, no that file will be handled properly, it just allows a mixture of includes inside of @depot that are handled for relocatabilty properly, and outside of @depot that are not touched (and no debug warning)

Copy link
Member

@fatteneder fatteneder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Just cross-posting my comments from slack here:

The insertion of @depot during serialization is done 'blindly' on the C side. I.e. for each file we encounter we try to insert a @depot tag without worrying about where the other files from that package are located.
I chose to do the substitution during serialization in this way, because I was not comfortable to through errors on the C side in case we realize that not all files are from the same depot.
If the package does not ensure that all its files are all in one and the same depot, then atm this package is not relocatable, and so the debug message you see is formally correct.
So it would be the AWS.@service macro to be blamed here for whatever it does.The any and all checks that are 'inverted' in the 52346 PR were put in place in order to cure the problem that we do the serialization 'blindly' and we want to make sure that everything comes from the same place when loading.
However, back then I did not know understand the pkg slug. But if I understand correctly, then they actually make sure that one does not accidentally find files of a different version for a given cache for versioned pkgs.
(also confirmed by staticfloat's comment: #52161 (comment) think the change in 52346 PR is ok.
52346 PR just uses the depot for the first file for which a match is found.
Whether all files resolve to a file on disk is also checked by stalecache, so this PR actually removes a redundant computation.

@IanButterworth IanButterworth added the merge me PR is reviewed. Merge when all tests are passing label Dec 6, 2023
@IanButterworth IanButterworth merged commit 46ad1c1 into JuliaLang:master Dec 7, 2023
8 checks passed
@IanButterworth IanButterworth deleted the ib/relocate_outside_includes branch December 7, 2023 18:17
@giordano giordano removed the merge me PR is reviewed. Merge when all tests are passing label Dec 8, 2023
staticfloat pushed a commit that referenced this pull request Jan 8, 2024
…erent depots (#52750)

Fixes #52161

#52161 is an awkward use of the relocation feature in the sense that it
attempts to load the `include()` and `include_dependency` files of a pkg
from two separate depots.
The problem there is that the value with which we replace the `@depot`
tag for both `include()` and `include_dependency()` files is determined
by trying to relocate only the `include()` files. We then end up not
finding the `include_dependency()` files.

Solution:
@staticfloat noted that the pkg slugs in depot paths like
`@depot/packages/Foo/1a2b3c/src/Foo.jl` are already enough to (weakly)
content-address a depot.
This means that we should be able to load any `include()` file of a pkg
from any `depot` that contains a precompile cache, provided the hashes
match.
The same logic can be extended to `include_dependency()` files, which
this PR does.

Note that we continue to use only one file from the `include()` files to
determine the depot which we use to relocate all `include()` files.
[this behavior is kept from master]
But for `include_dependency()` files we allow each file to resolve to a
different depot. This way the MWE given in #52161 should be extendable
to two README files being located in two different pkgs that lie in two
different depots.

---

Side note: #49866 started with explicitly verifying that all `include()`
files come from the same depot. In #52346 this was already relaxed to
pick the first depot for which any `include()` file can be resolved to.
This works, because if any other file might be missing from that depot
then this is caught in `stalecache()`.
fatteneder added a commit that referenced this pull request Apr 2, 2024
…files (#53905)

Fixes
#53859 (comment),
which was actually fixed before in
#52346, but
#52750 undid that fix.

This PR does not directly address #53859, because I can not reproduce it
atm.

---

The `@depot` resolution logic for `include()` files is adjusted as
follows:

1. (new behavior) If the cache is not relocatable because of an absolute
path, we ignore that path for the depot search. Recompilation will be
triggered by `stale_cachefile()` if that absolute path does not exist.
Previously this caused any `@depot` tags to be not resolved and so
trigger recompilation.
2. (new behavior) If we can't find a depot for a relocatable path, we
still replace it with the depot we found from other files. Recompilation
will be triggered by `stale_cachefile()` because the resolved path does
not exist.
3. (this behavior is kept) We require that relocatable paths all resolve
to the same depot.
4. (new behavior) We no longer use the first matching depot for
replacement, but instead we explicitly check that all resolve to the
same depot. This has two reasons:
- We want to scan all source files anyways in order to provide logs for
1. and 2. above, so the check is free.
- It is possible that a depot might be missing source files. Assume that
we have two depots on `DEPOT_PATH`, `depot_complete` and
`depot_incomplete`.
If `DEPOT_PATH=["depot_complete","depot_incomplete"]` then no
recompilation shall happen, because `depot_complete` will be picked.
If `DEPOT_PATH=["depot_incomplete","depot_complete"]` we trigger
recompilation and hopefully a meaningful error about missing files is
thrown. If we were to just select the first depot we find, then whether
recompilation happens would depend on whether the first relocatable file
resolves to `depot_complete` or `depot_incomplete`.
KristofferC pushed a commit that referenced this pull request Apr 9, 2024
…files (#53905)

Fixes
#53859 (comment),
which was actually fixed before in
#52346, but
#52750 undid that fix.

This PR does not directly address #53859, because I can not reproduce it
atm.

---

The `@depot` resolution logic for `include()` files is adjusted as
follows:

1. (new behavior) If the cache is not relocatable because of an absolute
path, we ignore that path for the depot search. Recompilation will be
triggered by `stale_cachefile()` if that absolute path does not exist.
Previously this caused any `@depot` tags to be not resolved and so
trigger recompilation.
2. (new behavior) If we can't find a depot for a relocatable path, we
still replace it with the depot we found from other files. Recompilation
will be triggered by `stale_cachefile()` because the resolved path does
not exist.
3. (this behavior is kept) We require that relocatable paths all resolve
to the same depot.
4. (new behavior) We no longer use the first matching depot for
replacement, but instead we explicitly check that all resolve to the
same depot. This has two reasons:
- We want to scan all source files anyways in order to provide logs for
1. and 2. above, so the check is free.
- It is possible that a depot might be missing source files. Assume that
we have two depots on `DEPOT_PATH`, `depot_complete` and
`depot_incomplete`.
If `DEPOT_PATH=["depot_complete","depot_incomplete"]` then no
recompilation shall happen, because `depot_complete` will be picked.
If `DEPOT_PATH=["depot_incomplete","depot_complete"]` we trigger
recompilation and hopefully a meaningful error about missing files is
thrown. If we were to just select the first depot we find, then whether
recompilation happens would depend on whether the first relocatable file
resolves to `depot_complete` or `depot_incomplete`.

(cherry picked from commit d8d3842)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants