Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid loading pkg until needed #1

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Conversation

IanButterworth
Copy link
Member

@IanButterworth IanButterworth commented Sep 10, 2024

Fixes JuliaLang/julia#55725

This PR

julia> @time using MicroMamba
  0.195992 seconds (343.78 k allocations: 18.925 MiB, 19.03% gc time, 78.26% compilation time: 99% of which was recompilation)
julia> @time @time_imports using MicroMamba
      0.6 ms  Printf
     15.5 ms  Dates
      0.4 ms  Scratch
     10.9 ms  LazyArtifacts
      0.6 ms  TOML
      7.8 ms  Preferences
      0.4 ms  JLLWrappers
               ┌ 120.3 ms micromamba_jll.__init__() 99.79% compilation time (98% recompilation)
    222.6 ms  micromamba_jll 99.59% compilation time (99% recompilation)
      0.4 ms  MicroMamba
  0.271243 seconds (808.59 k allocations: 43.782 MiB, 14.78% gc time, 82.66% compilation time: 99% of which was recompilation)

julia master

julia> @time using MicroMamba
  0.355640 seconds (610.25 k allocations: 39.037 MiB, 13.60% gc time, 14.79% compilation time: 100% of which was recompilation)
julia> @time @time_imports using MicroMamba
      0.5 ms  Printf
     11.7 ms  Dates
      0.3 ms  Scratch
      0.3 ms  TOML
               ┌ 0.0 ms NetworkOptions.__init__()
      1.0 ms  NetworkOptions
               ┌ 0.8 ms MbedTLS_jll.__init__()
      2.6 ms  MbedTLS_jll
               ┌ 0.3 ms LibSSH2_jll.__init__()
      2.0 ms  LibSSH2_jll
               ┌ 0.6 ms LibGit2_jll.__init__()
      2.1 ms  LibGit2_jll
      6.9 ms  LibGit2
      5.5 ms  ArgTools
               ┌ 0.3 ms nghttp2_jll.__init__()
      1.8 ms  nghttp2_jll
               ┌ 0.6 ms LibCURL_jll.__init__()
      2.3 ms  LibCURL_jll
               ┌ 0.0 ms MozillaCACerts_jll.__init__()
      1.6 ms  MozillaCACerts_jll
               ┌ 0.0 ms LibCURL.__init__()
      0.9 ms  LibCURL
               ┌ 2.9 ms Downloads.Curl.__init__()
     15.4 ms  Downloads
      0.7 ms  Tar
               ┌ 0.0 ms p7zip_jll.__init__()
      2.1 ms  p7zip_jll
      0.3 ms  UUIDs
      0.2 ms  Logging
               ┌ 0.0 ms Pkg.__init__()
    221.1 ms  Pkg
      0.5 ms  LazyArtifacts
      7.3 ms  Preferences
      0.5 ms  JLLWrappers
               ┌ 50.3 ms micromamba_jll.__init__() 99.61% compilation time (100% recompilation)
    126.1 ms  micromamba_jll 99.23% compilation time (75% recompilation)
      0.5 ms  MicroMamba
  0.447403 seconds (1.07 M allocations: 63.718 MiB, 15.97% gc time, 28.90% compilation time: 76% of which was recompilation)

@giordano
Copy link
Member

Should delete

Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
too

@IanButterworth
Copy link
Member Author

I'm not sure about that. I think we need the Project dependency even if we're not always loading.

@giordano
Copy link
Member

Ah, is that because of

Pkg = Base.require_stdlib(Base.PkgId(Base.UUID("44cfe95a-1eb2-52ea-b672-e2afdf69b78f"), "Pkg"));
?

if !artifact_exists(hash)
# loading Pkg is a bit slow, so we only do it if we need to
# and do it in a subprocess to avoid precompilation complexity
cmd = run(pipeline(`$(Base.julia_cmd()) -E '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of subbing out to another process vs. doing something like Base.inference_barrier?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vtjnash would that work? It'd be nice to not have to manage the download progress bar behavior coming from the sub process, while being able to read back the resulting path at the end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you cannot unload it without a process, so it is illegal to load during precompile

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see; yes, if this is run during precompilation, loading another module will screw up the precompilation output, so that makes sense.

src/LazyArtifacts.jl Outdated Show resolved Hide resolved
@IanButterworth
Copy link
Member Author

I build julia with this locally and confirmed that the progress bar shows properly with overprinting etc.

julia> using PythonCall
  Downloaded artifact: micromamba

julia>

@IanButterworth

This comment was marked as resolved.

@IanButterworth IanButterworth changed the title avoid loading pkg until we have to avoid loading pkg until needed Sep 11, 2024
Comment on lines 57 to 67
Pkg = Base.require_stdlib(Base.PkgId(Base.UUID("44cfe95a-1eb2-52ea-b672-e2afdf69b78f"), "Pkg"));
Pkg.Artifacts.try_artifact_download_sources(
$(repr(name)),
Base.$(repr(hash)),
$(repr(meta)),
$(repr(artifacts_toml));
platform = Base.BinaryPlatforms.$(repr(platform)),
verbose = $(repr(verbose)),
quiet_download = $(repr(quiet_download)),
io = stderr
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, unfortunately running a sub-process with dynamic code like this is very incompatible with --trim

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1 (comment) for some discussion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to do this only during pre-compilation? That behavior on its own is fine, but doing it at runtime is problematic for --trim

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to do this only during pre-compilation?

If we revert JuliaPackaging/JLLWrappers.jl#66 yes 😛

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really agree that lazy artifacts are incompatible with AOT compilation - but they are definitely incompatible with the idea of providing a self-contained executable/package

My request was for this to be branched on Base.generating_output() because the true branch is not inferable/compilable for juliac, but we have ways to let juliac know that only the false branch is ever taken at run-time.

In that case, you're still running an AOT-compiled executable and the download works - it just happens lazily.

Copy link
Member Author

@IanButterworth IanButterworth Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Didn't see Cody's reply before sending this)
I think the question here is more about legality, rather than preference that something orchestrating --trim would have for artifact downloads a la PackageCompiler.

As it is now seems more generally legal.

Copy link

@topolarity topolarity Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I fear that Base.require_stdlib is still going to be effectively impossible for us to infer

We're not even allowed to load the stdlib for inference on its own functions, because that would break the same pre-compilation assumptions we're skirting in the other branch.

I'm a bit worried that this change just makes LazyArtifacts incompatible with --trim unfortunately

We might have to find a way to load Pkg eagerly for --trim alone (which unfortunately means that AOT-compiled applications don't get this speed up)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if you can infer it directly, but you should be able to pre-emulate it, since it is just a shim for finding the right .ji file to load, and that can be pre-computed and run directly (esp. for --trim)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right but then I need to infer Pkg.Artifacts.try_artifact_download_sources and also ensure that its code is available in the loaded pkgimage (or our own sysimage)

@topolarity
Copy link

How did we get into this situation where LazyArtifacts depends on Pkg for this implementation, instead of vice-versa?

I'm probably missing some historical context..

But it feels like this would maybe better be solved by getting the code out of Pkg?

@IanButterworth
Copy link
Member Author

Others here may be able to correct/expand.. but:

Because Pkg used to be in the sysimage so was fast, and downloading artifacts is handled by Pkg at Pkg.add etc. time, including hooking into the pkgserver setup because artifacts hosted by pkgserver-ed packages are attempted to be downloaded from the pkgserver before any declared url.

i.e.
The Artifacts stdlib doesn't do downloading and has no deps, just points to things that were installed by Pkg.
Pkg.Artifacts does downloading during Pkg.add etc.
LazyArtifacts doesn't do downloading, but needs to orchestrate it, so depends on Pkg.


I think if we simply wanted to merge Pkg.Artifacts into Artifacts it would have to take on Downloads ,Tar, TOML, LibGit2 and also take with it the Pkg.GitTools module.

Artifacts currently has no deps.

@staticfloat
Copy link
Member

I'm probably missing some historical context..

Originally, everything having to do with packages was in Pkg. There was no Artifacts module, and both lazy and non-lazy artifacts were all just functionality in Pkg. Eventually, we split things out into separate modules, and we maintained the functionality within Pkg for backwards-compatibility.

@IanButterworth
Copy link
Member Author

Before merge I should add a test for the fork done during precompilation.

@topolarity
Copy link

I did some experimenting - Ripping out the relevant bits doesn't look too bad: JuliaLang/julia#55755

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

136x regression for LazyArtifacts
6 participants