Segfault in 1.10 while loading a package, possibly related to @recompile_invalidations #52660

Closed
orebas opened this issue Dec 29, 2023 · 8 comments · Fixed by #52752
Labels
bug Indicates an unexpected problem or unintended behavior

@orebas

orebas commented Dec 29, 2023

The development version of ParameterEstimation.jl causes a segfault in Julia 1.10, but not 1.9.4.
MWE:

using Pkg
Pkg.develop("ParameterEstimation")
Pkg.resolve()
Pkg.precompile()
using SIAN, TaylorDiff, StructuralIdentifiability
using Oscar,ArbNumerics,ModelingToolkit, Polymake
using Hecke, LinearSolve, HomotopyContinuation, BaryRational
using DifferentialEquations,Groebner,TaylorSeries, Suppressor
using ParameterEstimation

The final line causes a segfault. Here is the output:

[13139] signal (11.1): Segmentation fault
in expression starting at REPL[9]:1
jl_to_typeof at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/julia.h:766 [inlined]
must_be_new_dt at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/staticdata_utils.c:18
must_be_new_dt at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/staticdata_utils.c:65
jl_restore_system_image_from_stream_ at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/staticdata.c:3070
jl_restore_package_image_from_stream at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/staticdata.c:3418
jl_restore_incremental_from_buf at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/staticdata.c:3465
ijl_restore_package_image_from_file at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/staticdata.c:3549
_include_from_serialized at ./loading.jl:1052
_require_search_from_serialized at ./loading.jl:1575
_require at ./loading.jl:1932
__require_prelocked at ./loading.jl:1806
jfptr___require_prelocked_80742.1 at /home/orebas/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
jl_f__call_in_world at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/builtins.c:831
#invoke_in_world#3 at ./essentials.jl:921 [inlined]
invoke_in_world at ./essentials.jl:918 [inlined]
_require_prelocked at ./loading.jl:1797
macro expansion at ./loading.jl:1784 [inlined]
macro expansion at ./lock.jl:267 [inlined]
__require at ./loading.jl:1747
jfptr___require_80707.1 at /home/orebas/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
jl_f__call_in_world at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/builtins.c:831
#invoke_in_world#3 at ./essentials.jl:921 [inlined]
invoke_in_world at ./essentials.jl:918 [inlined]
require at ./loading.jl:1740
jfptr_require_80704.1 at /home/orebas/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
call_require at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/toplevel.c:481 [inlined]
eval_import_path at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/toplevel.c:518
jl_toplevel_eval_flex at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/toplevel.c:752
jl_toplevel_eval_flex at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/toplevel.c:877
jl_toplevel_eval_flex at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/toplevel.c:877
jl_toplevel_eval_flex at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/toplevel.c:877
ijl_toplevel_eval_in at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/toplevel.c:985
eval at ./boot.jl:385 [inlined]
eval_user_input at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:150
repl_backend_loop at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:246
#start_repl_backend#46 at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:231
start_repl_backend at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:228
_jl_invoke at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:3076
#run_repl#59 at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:389
run_repl at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:375
jfptr_run_repl_91689.1 at /home/orebas/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:3076
#1013 at ./client.jl:432
jfptr_YY.1013_82677.1 at /home/orebas/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
jl_f__call_latest at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/builtins.c:812
#invokelatest#2 at ./essentials.jl:887 [inlined]
invokelatest at ./essentials.jl:884 [inlined]
run_main_repl at ./client.jl:416
exec_options at ./client.jl:333
_start at ./client.jl:552
jfptr__start_82703.1 at /home/orebas/.julia/juliaup/julia-1.10.0+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:2894 [inlined]
ijl_apply_generic at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
true_main at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/jlapi.c:582
jl_repl_entrypoint at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/src/jlapi.c:731
main at /cache/build/builder-amdci4-6/julialang/julia-release-1-dot-10/cli/loader_exe.c:58
unknown function (ip: 0x7f315f3e3d8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 42708717 (Pool: 42639886; Big: 68831); GC: 36
Segmentation fault

I'm on WSL, using Ubuntu 22 on Windows 11. versioninfo() returns the output below. Julia was installed via juliaup.

Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen 7 2700 Eight-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver1)
  Threads: 1 on 16 virtual cores
@orebas
Author

orebas commented Dec 29, 2023

Benjamin Lorenz started looking into this in the #helpdesk channel on Slack and made the following comments:
Benjamin Lorenz
4 days ago
From the backtrace, it looks like this might be related to something like #46214.
Moving "using Oscar" out of the @recompile_invalidations block avoids the crash. I don't know much about all this PrecompileTools stuff, but this might be trying to serialize some pointers that happen to be invalid when the package is loaded again.

Benjamin Lorenz
4 days ago
It would be interesting to try this with Oscar master, but unfortunately there are some version incompatibilities.

Benjamin Lorenz
3 days ago
I tried with Oscar#master (after a few small adjustments), but it gave the same crash. Interestingly, it does seem to work with Julia nightly.
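
The workaround described in these comments can be sketched roughly as follows. This is a hypothetical layout, not the actual ParameterEstimation.jl source: the module body and the choice of which packages stay inside the block are assumptions made for illustration, and the snippet only loads if the listed packages are installed.

```julia
# Hypothetical sketch of a package's top-level file; not the real source.
module ParameterEstimation

using PrecompileTools

# Before (reported in this thread to crash on Julia 1.10.0 when the cache is loaded):
# @recompile_invalidations begin
#     using Oscar
#     using ModelingToolkit
# end

# Workaround from the comments above: load Oscar outside the block.
using Oscar

@recompile_invalidations begin
    using ModelingToolkit  # assumed here to be safe to keep inside the block
end

end # module
```

The trade-off is that invalidations triggered by loading Oscar are no longer recompiled and cached by the block, which may cost some load-time latency.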

@vchuravy
Member

One way of debugging is to build Julia 1.10 with assertions and see if one triggers.

@benlorenz
Contributor

benlorenz commented Dec 29, 2023

I got this assertion failure with a fresh Julia 1.10 build:

staticdata.c:910: jl_queue_for_serialization_: Assertion `*bp != (void*)(uintptr_t)-2' failed.

Full output:

Failed to precompile ParameterEstimation [b4cd1eb8-1e24-11e8-3319-93036a3eb9f3] to "~/.julia/compiled/v1.10/ParameterEstimation/jl_kRRdc6".
[ Info: Preproccessing `ModelingToolkit.ODESystem` object
[ Info: Solving the problem
[ Info: Constructing the maximal system
[ Info: Truncating
[ Info: Assessing local identifiability
[ Info: Found Pivots: []
[ Info: Locally identifiable parameters: [a, b, c, d, x4, x3, x2, x1]
[ Info: Not identifiable parameters:     []
[ Info: Randomizing
[ Info: Gröbner basis computation
[ Info: Remainder computation
[ Info: === Summary ===
[ Info: Globally identifiable parameters:                 [b, x4, x3, c, a, x2, d, x1]
[ Info: Locally but not globally identifiable parameters: []
[ Info: Not identifiable parameters:                      []
[ Info: ===============
[ Info: Estimating via the interpolators: ["FHD3", "AAA"]
Final Results:
julia: ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:910: jl_queue_for_serialization_: Assertion `*bp != (void*)(uintptr_t)-2' failed.

[23682] signal (6.-6): Aborted
in expression starting at none:0
unknown function (ip: 0x7f34096aaadc)
gsignal at /lib64/libc.so.6 (unknown line)
abort at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x7f34096473d4)
__assert_fail at /lib64/libc.so.6 (unknown line)
jl_queue_for_serialization_ at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:910 [inlined]
jl_queue_for_serialization_ at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:881
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:815
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:738
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:736
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:815
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:738
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:815
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:738
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:754
jl_insert_into_serialization_queue at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:825
jl_serialize_reachable at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:942
jl_save_system_image_to_stream at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:2505
ijl_create_system_image at ~/software/julia/julia/julia-1.10.0-assert/src/staticdata.c:2767
ijl_write_compiler_output at ~/software/julia/julia/julia-1.10.0-assert/src/precompile.c:121
ijl_atexit_hook at ~/software/julia/julia/julia-1.10.0-assert/src/init.c:251
jl_repl_entrypoint at ~/software/julia/julia/julia-1.10.0-assert/src/jlapi.c:732
main at ~/software/julia/julia/julia-1.10.0-assert/cli/loader_exe.c:58
unknown function (ip: 0x7f34096489c9)
__libc_start_main at /lib64/libc.so.6 (unknown line)
_start at ~/software/julia/julia/julia-1.10.0-assert/usr/bin/julia (unknown line)
Allocations: 395716360 (Pool: 395254451; Big: 461909); GC: 9

Edit:
This happens with ParameterEstimation master and the following dependency versions:

(ParameterEstimation) pkg> st
Project ParameterEstimation v0.3.0
Status `~/.julia/dev/ParameterEstimation/Project.toml`
  [7e558dbc] ArbNumerics v1.3.3
  [91aaffc3] BaryRational v1.1.0
  [0c46a032] DifferentialEquations v7.12.0
  [f6369f11] ForwardDiff v0.10.36
  [0b43b601] Groebner v0.5.1
  [f213a82b] HomotopyContinuation v2.9.2
  [7ed4a6bd] LinearSolve v2.22.0
  [961ee093] ModelingToolkit v8.75.0
  [bac558e1] OrderedCollections v1.6.3
  [f1435218] Oscar v0.13.0
  [3e851597] ParamPunPam v0.2.3
  [aea7be01] PrecompileTools v1.2.0
  [92933f4c] ProgressMeter v1.9.0
  [cf7bdac0] SIAN v1.4.2
  [220ca800] StructuralIdentifiability v0.5.1
  [fd094767] Suppressor v0.2.6
  [0c5d862f] Symbolics v5.14.0
  [b36ab563] TaylorDiff v0.2.1
  [6aa5eb33] TaylorSeries v0.15.4
  [98d24dd4] TestSetExtensions v2.0.0
⌃ [3eaa8342] libcxxwrap_julia_jll v0.11.1+0 ⚲
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [de0858da] Printf
  [8dfed614] Test

I needed to pin libcxxwrap_julia_jll to v0.11.1 to avoid a different (known) issue.

@vchuravy vchuravy added the bug Indicates an unexpected problem or unintended behavior label Dec 29, 2023
@orebas
Author

orebas commented Jan 2, 2024

Benjamin, do you think I should file this as an issue in PrecompileTools? Your workaround (taking all of the using statements out of the @recompile_invalidations block) makes it seem like an issue there.

@vtjnash
Sponsor Member

vtjnash commented Jan 2, 2024

This sounded a bit like a duplicate of #52435, but applying the fix for that wasn't sufficient for me:

diff --git a/src/staticdata.c b/src/staticdata.c
index f08b59828d..29e5d0eea3 100644
--- a/src/staticdata.c
+++ b/src/staticdata.c
@@ -643,7 +643,7 @@ static int jl_needs_serialization(jl_serializer_state *s, jl_value_t *v) JL_NOTS
     else if (jl_typetagis(v, jl_uint8_tag << 4)) {
         return 0;
     }
-    else if (jl_typetagis(v, jl_task_tag << 4)) {
+    else if (v == (jl_value_t*)s->ptls->root_task) {
         return 0;
     }
 

@vtjnash
Sponsor Member

vtjnash commented Jan 2, 2024

MWE:

diff --git a/test/precompile.jl b/test/precompile.jl
index e10d896da7..10841717ac 100644
--- a/test/precompile.jl
+++ b/test/precompile.jl
@@ -115,6 +115,8 @@ precompile_test_harness(false) do dir
                   d = den(a)
                   return h
               end
+              abstract type AbstractAlgebraMap{A} end
+              struct GAPGroupHomomorphism{A} <: AbstractAlgebraMap{GAPGroupHomomorphism{A}} end
           end
           """)
     write(Foo2_file,
@@ -130,7 +132,7 @@ precompile_test_harness(false) do dir
     write(Foo_file,
           """
           module $Foo_module
-              import $FooBase_module, $FooBase_module.typeA
+              import $FooBase_module, $FooBase_module.typeA, $FooBase_module.GAPGroupHomomorphism
               import $Foo2_module: $Foo2_module, override, overridenc
               import $FooBase_module.hash
               import Test
@@ -211,6 +213,7 @@ precompile_test_harness(false) do dir
               Base.convert(::Type{Some{Value18343}}, ::Value18343{Some}) = 2
               Base.convert(::Type{Ref}, ::Value18343{T}) where {T} = 3
 
+              const GAPType = GAPGroupHomomorphism{Nothing}
 
               # issue #28297
               mutable struct Result

Apparently we forgot to exercise this case: I thought we had written the code to handle it, but apparently I hadn't.
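
Stripped of the test harness, the type pattern that the diff adds can be tried on its own. This is a minimal sketch reusing the type names from the diff. Defining the types in a single session does not go through package-image serialization, so it is not expected to reproduce the crash. It only shows the self-reference, which is presumably the kind of cycle in the datatype super fields that the fix addresses.

```julia
# Type names are taken from the diff above.
abstract type AbstractAlgebraMap{A} end
struct GAPGroupHomomorphism{A} <: AbstractAlgebraMap{GAPGroupHomomorphism{A}} end

const GAPType = GAPGroupHomomorphism{Nothing}

# The supertype is parameterized by the subtype itself,
# so the two types reference each other.
S = supertype(GAPType)
@assert S === AbstractAlgebraMap{GAPGroupHomomorphism{Nothing}}
@assert S.parameters[1] === GAPType
```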

@orebas
Author

orebas commented Jan 3, 2024

@vtjnash: two questions

  1. Do you know if it is bad to put "using" statements inside a @recompile_invalidations block? My intention was to fully precompile and cache all the recompilations, especially in Oscar and related packages. But I don't really know what I am doing.
  2. I can work around this bug by moving those "using" statements out. I would like to push the worked-around version to the general registry and the dev repo. Will that hamper efforts to fix this bug? (I.e., is it OK if I just do that?)

@vtjnash
Sponsor Member

vtjnash commented Jan 4, 2024

I have a standalone MWE, so make any changes you want now and it won't mess that up

vtjnash added a commit that referenced this issue Jan 5, 2024
Handle any sort of cycle encountered in the datatype super fields by
always deferring that field until later and setting a deferred mechanism
for updating the field only after the supertype is ready.

Fix #52660
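
The idea in the commit message, deferring a field that may point into a cycle and patching it once its target exists, can be illustrated with a toy two-pass restore. This is a sketch of the general technique only, not the code in src/staticdata.c; `TypeRecord`, `restore`, and the entry layout are hypothetical names invented for this example.

```julia
# Toy model only: all names here are hypothetical, not Julia internals.
mutable struct TypeRecord
    name::String
    super::Union{TypeRecord,Nothing}
    params::Vector{TypeRecord}
end

# `entries` describes types by name; references are stored as names.
function restore(entries)
    restored = Dict{String,TypeRecord}()
    deferred = Tuple{TypeRecord,String}[]

    # Pass 1: create every record, leaving `super` unset and noting it for later.
    for e in entries
        rec = TypeRecord(e.name, nothing, TypeRecord[])
        restored[e.name] = rec
        e.super === nothing || push!(deferred, (rec, e.super))
    end

    # Parameters can be resolved now, because every record already exists.
    for e in entries, p in e.params
        push!(restored[e.name].params, restored[p])
    end

    # Pass 2: patch the deferred `super` fields.
    for (rec, supername) in deferred
        rec.super = restored[supername]
    end
    return restored
end

# T's supertype S has T as a parameter, so T and S reference each other.
entries = [
    (name = "T", super = "S", params = String[]),
    (name = "S", super = nothing, params = ["T"]),
]
records = restore(entries)
@assert records["T"].super === records["S"]
@assert records["S"].params[1] === records["T"]
```

Because no `super` link is followed until every record exists, the order of the entries does not matter, which is the property a single-pass restore lacks when it meets a cycle.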
KristofferC pushed a commit that referenced this issue Jan 24, 2024
(cherry picked from commit c94b1a3)
Drvi pushed a commit to RelationalAI/julia that referenced this issue Jun 7, 2024
(cherry picked from commit c94b1a3)