SharedArray access in @parallel for causes Segmentation fault #14764
Comments
I've reproduced it a few times on 0.4.2 but not on master. Can you give 0.4.3 and the nightly binary a try and see if that makes any difference? |
In versions 0.4.3 and 0.4.4-pre+2 the bug still appears.
In the latest version (versioninfo below) I can't reproduce the bug.
|
I also cannot reproduce this on a newish version.

```
julia> versioninfo()
Julia Version 0.5.0-dev+749
Commit 83eac1e* (2015-10-13 16:00 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT NO_AFFINITY NEHALEM)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
```
|
FWIW, the same result with 3 runs on OS X and 0.4.3-pre+6
|
I also have a similar issue when using a SharedArray with remotecall_wait (with 10 workers). The code runs for a while but then crashes (the timing of the crash is random). I got the following message:
The Julia version is:
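For context, the pattern described above can be sketched as follows. This is a hypothetical reconstruction (the original code was not preserved in this thread), so all names and sizes here are made up; it only illustrates workers filling a SharedArray via remotecall_wait, using the 0.4-era argument order.

```julia
# Hypothetical sketch of the reported pattern: start with `julia -p 10`.
@everywhere function fill_chunk!(S, w)
    for i in localindexes(S)   # indices owned by the calling worker
        S[i] = w * i
    end
end

S = SharedArray(Float64, 10_000)          # 0.4-era constructor syntax
@sync for (w, p) in enumerate(workers())
    @async remotecall_wait(p, fill_chunk!, S, w)
end
```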
Is this issue still open? |
Yes. |
On master it now hangs after some time, and a Ctrl-C results in:
The master process is in a busy loop state at the time of interruption. |
Hi everyone, I think I have a similar issue using the parallelism tools. First, I found an issue in my code arising from the use of the object. versioninfo():
a) when the code is stuck, at some point I lose patience and hit Ctrl+C; and with another version I must change the code a little, but basically it is the same. To me, it looks like the program is in an infinite loop (not exactly). So, from here, I started mining information from the forums and ended up on this issue.
I see the following behavior, which is really similar to what @pgawron shows in his post:
- if N=10 and nprocs=1, at some point there is a memory problem (gc?);
- if I set N=1000, even with all the RAM available, I cannot reach the end of the run;
- in multi-process mode (julia -p 7), the code with N=10 crashes after a while.
So, my question is the following: is there any way to avoid these issues?
Thx, Matthew |
Probably the same underlying cause as #15923 . |
Indeed, it looks really similar to #15923 . I tried to disable the gc, but I ran out of memory really quickly. |
Have exactly this issue, and am experiencing all three types of failures described by @matthewozon:
Version info is:
A bit about the code: I'm passing vectors of SharedArrays to workers via calls that look like

```julia
@sync for i = 1:n_chunks
    @async remotecall_wait(procs[i], func, vector_of_shared_arrays, i)
end
```
|
On 0.4.6, with the test script of #14764 (comment), I at times see the first error pasted here: #14764 (comment). But I also saw this:
which killed one worker (of 4). And sometimes this somewhat different segfault:
|
@yuyichao I think this looks like the array/finalizer issue you've dissected? |
Hard to tell from the backtrace but likely. |
Just browsing WeakKeyDict issues: could this be related to #3002? |
On julia-0.7-beta2.0, neither the original example nor #14764 (comment) errors for me (I ran each case twice). Maybe someone else can check as well? |
I can confirm that they both work on |
The following code always causes a segmentation fault when run in parallel with julia -p 20. When run on a single process it does not crash. Unfortunately, the moment of the crash may vary.
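The original snippet did not survive this page. A minimal sketch of the reported pattern, with all names and sizes hypothetical, would exercise concurrent SharedArray writes from a @parallel for loop:

```julia
# Hypothetical minimal reproduction sketch (the original code is not
# preserved here). Run with: julia -p 20
S = SharedArray(Float64, 1_000_000)   # 0.4-era constructor syntax
@sync @parallel for i = 1:length(S)
    S[i] = sqrt(i)                    # concurrent writes from all workers
end
```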
The stack trace I get is the following: