-
-
Notifications
You must be signed in to change notification settings - Fork 269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry artifact rename if it fails #4001
Retry artifact rename if it fails #4001
Conversation
if retry < max_num_retries && err ∈ (Base.UV_EACCES, Base.UV_EPERM, Base.UV_EBUSY) | ||
sleep(sleep_amount) | ||
sleep_amount = min(sleep_amount*2.0, max_sleep_amount) | ||
retry += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this can take up to 60s, I think we should print something out to the user. Something like:
# Warn once, and only if we've slept a non-trivial amount of time already
if retry == 8
@warn("Artifact installation failed, retrying for up to 60s. Check to see if an anti-virus may be interfering.")
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is hit during the new parallel artifact download process, then outputting a log will mess with the multiprogress printing (sorry). We'll need some way to keep that tidy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the location works but in the default happy path I wouldn't show any further detail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So only show a status message if an artifact is stuck at a stage for over 1 second?
To do the actual printing, can I add a status::String
field to
Pkg.jl/src/MiniProgressBars.jl
Line 19 in 8b2c0f3
Base.@kwdef mutable struct MiniProgressBar |
that gets appended to progress_text
in?
Pkg.jl/src/MiniProgressBars.jl
Line 63 in 8b2c0f3
progress_text = if p.mode == :percentage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good but I'd do it such that if status is non empty just show the status string after the name and not the progress bar or ending ratio. Just so that we're more likely to fit in the terminal width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to backport this to 1.10 & 1.11, but when we do we'll want to add a log message about it taking a long time, as suggested above, given the parallel download thing isn't going to be backported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compared to the time it already takes to download and unpack artifacts on Windows with an aggressive anti-virus I don't think one minute extra is something to create a warning about, as long as the artifact eventually gets installed.
As this is currently, the status that will print will explain the step that's taking a long time, but won't mention that it could be the antivirus causing it. Should that be changed? Might be annoying to see the same long explanation multiple times as I assume the issue will affect multiple parallel install tasks? |
I was able to test this PR using julia> Pkg.pkg"add x265_jll"
Resolving package versions...
Installing artifacts ╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/1
ERROR: Failed to install some artifacts:
Unable to automatically download/install artifact 'x265' from sources listed in '/home/nathan/.julia/packages/x265_jll/70izA/Artifacts.toml'.
Sources attempted:
- https://pkg.julialang.org/artifact/00023f2f62a04af7531278ce1a689ff93800ffa1
Error: IOError: rename of "/home/nathan/.julia/artifacts/jl_Aly3ZD" to "/home/nathan/.julia/artifacts/00023f2f62a04af7531278ce1a689ff93800ffa1": resource busy or locked (EBUSY)
- https://github.com/JuliaBinaryWrappers/x265_jll.jl/releases/download/x265-v3.6.0+0/x265.v3.6.0.x86_64-linux-gnu.tar.gz
Error: IOError: rename of "/home/nathan/.julia/artifacts/jl_p180ux" to "/home/nathan/.julia/artifacts/00023f2f62a04af7531278ce1a689ff93800ffa1": resource busy or locked (EBUSY)
Stacktrace:
[1] pkgerror(msg::String)
@ Pkg.Types ~/github/Pkg.jl/src/Types.jl:68
[2] download_artifacts(ctx::Pkg.Types.Context; platform::Base.BinaryPlatforms.Platform, julia_version::VersionNumber, verbose::Bool)
@ Pkg.Operations ~/github/Pkg.jl/src/Operations.jl:978
[3] download_artifacts
@ ~/github/Pkg.jl/src/Operations.jl:827 [inlined]
[4] add(ctx::Pkg.Types.Context, pkgs::Vector{…}, new_git::Set{…}; allow_autoprecomp::Bool, preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, target::Symbol)
@ Pkg.Operations ~/github/Pkg.jl/src/Operations.jl:1656
[5] add
@ ~/github/Pkg.jl/src/Operations.jl:1621 [inlined]
[6] add(ctx::Pkg.Types.Context, pkgs::Vector{…}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, target::Symbol, allow_autoprecomp::Bool, kwargs::@Kwargs{…})
@ Pkg.API ~/github/Pkg.jl/src/API.jl:306
[7] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::IOContext{IO}, kwargs::@Kwargs{})
@ Pkg.API ~/github/Pkg.jl/src/API.jl:159
[8] add(pkgs::Vector{Pkg.Types.PackageSpec})
@ Pkg.API ~/github/Pkg.jl/src/API.jl:148
[9] do_cmd(command::Pkg.REPLMode.Command, io::IOContext{IO})
@ Pkg.REPLMode ~/github/Pkg.jl/src/REPLMode/REPLMode.jl:407
[10] do_cmds(commands::Vector{Pkg.REPLMode.Command}, io::IOContext{IO})
@ Pkg.REPLMode ~/github/Pkg.jl/src/REPLMode/REPLMode.jl:393
[11] do_cmds(input::String, io::IOContext{IO})
@ Pkg.REPLMode ~/github/Pkg.jl/src/REPLMode/REPLMode.jl:383
[12] pkgstr(str::String)
@ Pkg.REPLMode ~/github/Pkg.jl/src/REPLMode/REPLMode.jl:450
[13] top-level scope
@ REPL[3]:1
Some type information was truncated. Use `show(err)` to see complete types. |
Great. Seems like a big improvement. A thought for a follow on PR would be adding something to the docs about why that could happen. |
(cherry picked from commit bc51db7)
(cherry picked from commit bc51db7)
Hopefully fixes #3822
This uses a strategy from https://github.com/isaacs/node-graceful-fs/blob/234379906b7d2f4c9cfeb412d2516f42b0fb4953/polyfills.js#L87-L95 to retry the rename for about a minute to work around anti-virus software temporarily locking a directory with newly created files. This strategy is also used in vscode, Ref: microsoft/vscode#188899
I am not restricting this fix to only Windows, because similar issues can happen on Linux, Ref: isaacs/node-graceful-fs#133