[0.7/1.0] Failed to catch SIGINT during stress.jl test #28580
Comments
I've discovered several errors related to global state in the test cases; they happen because multiple tests are run in the same worker. The most reliable way to see them is to run all the tests serially in a single worker. This happens automatically if you are building in a network-less environment (such as the sandboxed build environment in NixOS). If this isn't your case, I expect you can force it in your test environment by changing this block:

    cd(@__DIR__) do
        n = 1
        if net_on
            n = min(Sys.CPU_THREADS, length(tests))
            n > 1 && addprocs_with_testenv(n)
            LinearAlgebra.BLAS.set_num_threads(1)
        end

I believe this particular error is caused by the fact that the spawn test of running an invalid command breaks the stress test for receiving SIGINT. You can directly test that these two are incompatible by running them back-to-back in a test.jl file:

    using Test;
    # spawn.jl test of running an invalid command
    @test_throws Base.IOError run(`foo_is_not_a_valid_command`)
    # stress.jl test for receiving a SIGINT
    ccall(:jl_exit_on_sigint, Cvoid, (Cint,), 0)
    @test_throws InterruptException begin
        ccall(:kill, Cvoid, (Cint, Cint,), getpid(), 2)
        for i in 1:10
            Libc.systemsleep(0.1)
            ccall(:jl_gc_safepoint, Cvoid, ()) # wait for SIGINT to arrive
        end
    end
    ccall(:jl_exit_on_sigint, Cvoid, (Cint,), 1)

and then running it:

    julia test.jl
I should add that I tried this using Julia 1.3.0 under both NixOS and Gentoo. CCing @JeffBezanson as well, since you seem to be involved in the rest of the issues that running the tests serially in a single worker has revealed.
Some more data points. Under 1.3.0 it seems the SIGINT test just periodically fails on its own as well. That is, I see occasional failures when I put the following in test.jl:

    using Test;
    ccall(:jl_exit_on_sigint, Cvoid, (Cint,), 0)
    @test_throws InterruptException begin
        ccall(:kill, Cvoid, (Cint, Cint,), getpid(), 2)
        for i in 1:10
            Libc.systemsleep(0.1)
            ccall(:jl_gc_safepoint, Cvoid, ()) # wait for SIGINT to arrive
        end
    end
    ccall(:jl_exit_on_sigint, Cvoid, (Cint,), 1)

and run it in a loop:

    ((n=0))
    for ((i=0; i<1000; i++)); do
        julia test.jl || ((n++))
    done
    echo "Failures: $n"

A friend tried 1.3.1 under Arch Linux and was not able to reproduce this, so it may be limited to 1.3.0. The original test, where you precede it by trying to run an invalid command, fails around 90-99% of the time under both 1.3.0 and 1.3.1 though:

    using Test;
    @test_throws Base.IOError run(`foo_is_not_a_valid_command`)
    ccall(:jl_exit_on_sigint, Cvoid, (Cint,), 0)
    @test_throws InterruptException begin
        ccall(:kill, Cvoid, (Cint, Cint,), getpid(), 2)
        for i in 1:10
            Libc.systemsleep(0.1)
            ccall(:jl_gc_safepoint, Cvoid, ()) # wait for SIGINT to arrive
        end
    end
    ccall(:jl_exit_on_sigint, Cvoid, (Cint,), 1)

run with the same loop:

    ((n=0))
    for ((i=0; i<1000; i++)); do
        julia test.jl || ((n++))
    done
    echo "Failures: $n"

Hopefully having this (mostly) reproducible case will make it easier to track down what is going on.
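
For anyone repeating the measurement, the counting loop used in this comment could be wrapped in a small reusable function. This is only a sketch: the name count_failures and its argument order are made up for illustration and are not part of Julia or its test suite. Unlike the loop in the comment, it discards the command's output so that only the summary line is printed.

```shell
# Hypothetical helper (not from the Julia repository): run a command
# a given number of times and report how many runs exited non-zero.
count_failures() {
    local runs=$1      # first argument: number of repetitions
    shift              # remaining arguments: the command to run
    local failures=0
    local i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "Failures: $failures/$runs"
}
```

With this defined, `count_failures 1000 julia test.jl` would mirror the loop in this comment and print the failure count next to the number of runs.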
Should be fixed by #32599 on master
http://debomatic-amd64.debian.net/distribution#unstable/julia/0.7.0-2/autopkgtest
Another build log:
https://buildd.debian.org/status/fetch.php?pkg=julia&arch=all&ver=0.7.0-1&stamp=1533838647&raw=0
Related issue #17706