-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: refactor the code loading of packages #46690
Open
KristofferC
wants to merge
7
commits into
master
Choose a base branch
from
kc/codeloading2.0
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This will replace the https://github.com/JuliaLang/julia/compare/jn/loading-plus branch (8ca3f3b), right? |
Yes, that's the idea. |
KristofferC
force-pushed
the
kc/codeloading2.0
branch
2 times, most recently
from
September 9, 2022 14:16
d09bea7
to
5c17075
Compare
Awesome. I will delete that then now |
DilumAluthge
added
the
needs pkgeval
Tests for all registered packages should be run with this change
label
Sep 9, 2022
KristofferC
force-pushed
the
kc/codeloading2.0
branch
4 times, most recently
from
September 12, 2022 13:47
8ab209a
to
2eebacf
Compare
KristofferC
force-pushed
the
kc/codeloading2.0
branch
from
September 15, 2022 11:11
4f4c65a
to
6e0b2da
Compare
KristofferC
force-pushed
the
kc/codeloading2.0
branch
2 times, most recently
from
September 15, 2022 15:48
5a9dc91
to
318c891
Compare
KristofferC
force-pushed
the
kc/codeloading2.0
branch
from
November 9, 2022 15:00
318c891
to
47c3edf
Compare
IanButterworth
added a commit
that referenced
this pull request
Mar 5, 2024
Parallel precompilation is more or less now required in order to use somewhat large packages unless you want to wait an obscene long time for it to complete. Right now, we even start a parallel precompilation on a package load if we notice that the package you are loading is not precompiled. This functionally has typically been implemented in Pkg but with Pkg not being in the sysimage it becomes a bit awkward because we then need to load Pkg from Base. The only real reason this functionality has been implemented in Pkg is that Pkg has some useful features for parsing environments. Moving precompilation to Base has typically been stalled on such an environment parser not existing in Base. However, in #46690 I started implemented code loading on top of a more up front environment parser (instead of the "incremental" one that currently exists in `loading.jl`) and we can retro fit this to be used as the basis of parallel precompilation. At some later point code loading could be implemented on top of it but that is for now considered future work. This PR thus adds the environment parser from the codeloading PR and implementes the parallel precompilation feature from Pkg on top of it (instead of on top of the `EnvCache` in Pkg). Some points to bring up here: - This copy pastes the progress bar implementation in Pkg into here. It is probably a bit excessive to use so we can simplify that significantly. - Parallel precompilation uses the `FileWatching` module to avoid different processes trying to precompile the same package concurrently. Right now, I used grab this from `Base.loaded_modules` relying on it being in the sysimage. - This removes the "suspended" functionality from the Pkg precompilation which does not try to precompile packages if they have "recently" failed which is unclear how useful it is in practice. This also requires the Serialization stdlib and uses data structures defined in Pkg so it is hard to keep when moving this to Base. --------- Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com>
KristofferC
added a commit
that referenced
this pull request
Mar 6, 2024
Parallel precompilation is more or less now required in order to use somewhat large packages unless you want to wait an obscene long time for it to complete. Right now, we even start a parallel precompilation on a package load if we notice that the package you are loading is not precompiled. This functionally has typically been implemented in Pkg but with Pkg not being in the sysimage it becomes a bit awkward because we then need to load Pkg from Base. The only real reason this functionality has been implemented in Pkg is that Pkg has some useful features for parsing environments. Moving precompilation to Base has typically been stalled on such an environment parser not existing in Base. However, in #46690 I started implemented code loading on top of a more up front environment parser (instead of the "incremental" one that currently exists in `loading.jl`) and we can retro fit this to be used as the basis of parallel precompilation. At some later point code loading could be implemented on top of it but that is for now considered future work. This PR thus adds the environment parser from the codeloading PR and implementes the parallel precompilation feature from Pkg on top of it (instead of on top of the `EnvCache` in Pkg). Some points to bring up here: - This copy pastes the progress bar implementation in Pkg into here. It is probably a bit excessive to use so we can simplify that significantly. - Parallel precompilation uses the `FileWatching` module to avoid different processes trying to precompile the same package concurrently. Right now, I used grab this from `Base.loaded_modules` relying on it being in the sysimage. - This removes the "suspended" functionality from the Pkg precompilation which does not try to precompile packages if they have "recently" failed which is unclear how useful it is in practice. This also requires the Serialization stdlib and uses data structures defined in Pkg so it is hard to keep when moving this to Base. --------- Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com> (cherry picked from commit 6745160)
mkitti
pushed a commit
to mkitti/julia
that referenced
this pull request
Apr 13, 2024
Parallel precompilation is more or less now required in order to use somewhat large packages unless you want to wait an obscene long time for it to complete. Right now, we even start a parallel precompilation on a package load if we notice that the package you are loading is not precompiled. This functionally has typically been implemented in Pkg but with Pkg not being in the sysimage it becomes a bit awkward because we then need to load Pkg from Base. The only real reason this functionality has been implemented in Pkg is that Pkg has some useful features for parsing environments. Moving precompilation to Base has typically been stalled on such an environment parser not existing in Base. However, in JuliaLang#46690 I started implemented code loading on top of a more up front environment parser (instead of the "incremental" one that currently exists in `loading.jl`) and we can retro fit this to be used as the basis of parallel precompilation. At some later point code loading could be implemented on top of it but that is for now considered future work. This PR thus adds the environment parser from the codeloading PR and implementes the parallel precompilation feature from Pkg on top of it (instead of on top of the `EnvCache` in Pkg). Some points to bring up here: - This copy pastes the progress bar implementation in Pkg into here. It is probably a bit excessive to use so we can simplify that significantly. - Parallel precompilation uses the `FileWatching` module to avoid different processes trying to precompile the same package concurrently. Right now, I used grab this from `Base.loaded_modules` relying on it being in the sysimage. - This removes the "suspended" functionality from the Pkg precompilation which does not try to precompile packages if they have "recently" failed which is unclear how useful it is in practice. This also requires the Serialization stdlib and uses data structures defined in Pkg so it is hard to keep when moving this to Base. --------- Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
needs pkgeval
Tests for all registered packages should be run with this change
packages
Package management and loading
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The current code for code loading is quite stateless (although there is currently a cache for parsed TOML files). Every time information about a package is looked up, the code loading system starts from scratch by searching through the load path, reading through manifests looking for the piece of information it needs. Touching disk a lot was a performance problem that is now mostly worked around in #40890 but the solution in there is quite ugly and ad hoc.
This PR explores another approach where the full information of an environment is parsed and stored in an object (which then makes lookups trivial). This is a bigger initial cost but the idea is that environments change rarely enough that this can be cached over multiple calls into code loading (as well as getting passed to precompile workers) which will save work in the long run.
This PR so far implements the new lookup strategy but it still needs the caching part where the environment stack is persisted over several calls to
require
as well as getting passed to precompile workers. This will also need a PkgEval run since there might be quite a few packages using the internals of code loading (maybe Revise for example).For ease of reading, the new stuff is in a separate file called
codeloading2.jl
but this will be folded intoloading.jl
when this is done.