Slow compilation and segfault with recursion on highly nested struct #30744

tkf · 2019-01-17T18:38:51Z

@JeffBezanson Thanks a lot for looking into this. Actually, it turned out the script eventually segfaults if I wait long enough (3 to 5 minutes, at least in some machines; I'll try to find the difference in setup). This should reproduce the issue:

]add Transducers#long-compilation
using Transducers
@time include(joinpath(dirname(dirname(pathof(Transducers))), "examples", "tutorial_missings.jl"))

If I run julia with --compile=min or more precisely with julia --startup-file=no --compiled-modules=no --compile=min -e 'using Transducers; include(joinpath(dirname(dirname(pathof(Transducers))), "examples", "tutorial_missings.jl"))' the output is something like

 input: (1, 10)
output: 10
 input: (2, 12)
output: 24
 input: (3, 14)
output: 42
  0.095141 seconds (100.05 k allocations: 6.278 MiB)
ans = (2, 3.0)
  0.226999 seconds (128.36 k allocations: 8.364 MiB, 21.15% gc time)
ans = ((3, -1.0), (2, 3.0))
  0.131456 seconds (91.80 k allocations: 6.101 MiB)
ans = (3, 2)

but without --compile=min the script should hang after printing ans = (2, 3.0).

Only the last section of the script is needed to demonstrate the slow compilation. So a smaller example than tutorial_missings.jl would be:

using Transducers
argext_step(should_update) =
    ((oldindex, oldvalue), (index, value)) ->
        should_update(oldvalue, value) ? (index, value) : (oldindex, oldvalue)
init_helper(::typeof(>), ::Type{Tuple{F, S}}) where {F, S} = (zero(F), typemax(S))
init_helper(::typeof(<), ::Type{Tuple{F, S}}) where {F, S} = (zero(F), typemin(S))
argext_init(should_update) = Initializer(TT -> init_helper(should_update, TT))
xf_scanext(should_update) = Scan(argext_step(should_update),
                                 argext_init(should_update))
xf_fullextrema(xf_filter = NotA(Missing)) =
    Zip(Count(), xf_filter) |> Zip(xf_scanext(>), xf_scanext(<))
mapfoldl(xf_fullextrema(), right, [1.0, 3.0, -1.0, missing, 2.0])  # slow
mapfoldl(xf_fullextrema(), right, [1.0, 3.0, -1.0, 2.0])  # slower

When it segfaults, the output is something like:

julia> using Transducers

julia> @time include(joinpath(dirname(dirname(pathof(Transducers))), "examples", "tutorial_missings.jl"))
 input: (1, 10)
output: 10
 input: (2, 12)
output: 24
 input: (3, 14)
output: 42
  1.499609 seconds (3.35 M allocations: 152.190 MiB, 6.37% gc time)
ans = (2, 3.0)

signal (11): Segmentation fault
in expression starting at /home/takafumi/.julia/dev/Transducers/examples/tutorial_missings.jl:324
unknown function (ip: 0x7f99d197be50)
unknown function (ip: 0x7f99d1b6d5c2)
unknown function (ip: 0x7f99d1b6d73f)
unknown function (ip: 0x7f99d1b6ef1a)
unknown function (ip: 0x7f99d1b78cd1)
unknown function (ip: 0x7f99d1b7ad2e)
unknown function (ip: 0x7f99d1b7e67c)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /home/takafumi/repos/watch/julia/usr/bin/../lib/libLLVM-6.0.so (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /home/takafumi/repos/watch/julia/usr/bin/../lib/libLLVM-6.0.so (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /home/takafumi/repos/watch/julia/usr/bin/../lib/libLLVM-6.0.so (unknown line)
unknown function (ip: 0x7f99d1b4fc9d)
unknown function (ip: 0x7f99d1b53779)
unknown function (ip: 0x7f99d1b55b95)
unknown function (ip: 0x7f99d1af6248)
unknown function (ip: 0x7f99d1b48fe6)
jl_fptr_trampoline at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
jl_apply_generic at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
transduce at /home/takafumi/.julia/dev/Transducers/src/processes.jl:227 [inlined]
#transduce#45 at /home/takafumi/.julia/dev/Transducers/src/processes.jl:204
unknown function (ip: 0x7f99ac0ef739)
jl_fptr_trampoline at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
jl_apply_generic at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
#transduce at ./none:0 [inlined]
#mapfoldl#46 at /home/takafumi/.julia/dev/Transducers/src/processes.jl:233
unknown function (ip: 0x7f99ac0ef4e9)
jl_fptr_trampoline at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
jl_apply_generic at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
mapfoldl at /home/takafumi/.julia/dev/Transducers/src/processes.jl:233
unknown function (ip: 0x7f99ac0ef31d)
jl_fptr_trampoline at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
jl_apply_generic at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f99d1bea716)
unknown function (ip: 0x7f99d1bea460)
unknown function (ip: 0x7f99d1beb0de)
unknown function (ip: 0x7f99d1beb7b2)
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f99c4761bdf)
unknown function (ip: 0x3)
unknown function (ip: 0x7f99d1bebc7b)
unknown function (ip: 0x7f99d1ab4e72)
unknown function (ip: 0x7f99d1a8e62d)
jl_load at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f99c69c6f30)
unknown function (ip: 0x7f99c69d25cb)
jl_apply_generic at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
include at ./client.jl:418
jl_apply_generic at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f99d1bea716)
unknown function (ip: 0x7f99d1bea460)
unknown function (ip: 0x7f99d1beb369)
unknown function (ip: 0x7f99d1beb7b2)
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f99c315f93f)
unknown function (ip: 0x5)
unknown function (ip: 0x7f99d1bebc7b)
unknown function (ip: 0x7f99d1ab4e72)
unknown function (ip: 0x7f99d1ab518e)
jl_toplevel_eval_in at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f99c69d2a21)
jl_apply_generic at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f99c6b6c77f)
run_backend at /home/takafumi/.julia/packages/Revise/yp5KG/src/Revise.jl:769
#58 at ./task.jl:259
jl_fptr_trampoline at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
jl_apply_generic at /home/takafumi/repos/watch/julia/usr/bin/../lib/libjulia.so.1 (unknown line)
unknown function (ip: 0x7f99d1a9c510)
unknown function (ip: 0xffffffffffffffff)
Allocations: 306876904 (Pool: 306871095; Big: 5809); GC: 364

The text was updated successfully, but these errors were encountered:

JeffBezanson · 2019-01-17T19:08:15Z

Wow, this seems to involve a type so large we basically can't (or shouldn't) generate specialized code for it. Is this supposed to be a pathological case, or is it fairly normal transducer code?

With --inline=no compilation is still a bit slow, but the code finishes in less than 2 minutes. Maybe we just need an inlining limit based on function size.

tkf · 2019-01-17T20:11:55Z

This is a useful example that I hope I can support. But indeed I'm a bit lazy in Transducers.jl internal not trying hard enough to make the struct shallower 😅 (I'm planning on doing that). I tried using a flat tuple instead of a "linked list" style nested struct but it seemed that inference gave up determining a concrete type well when I constantly re-creating tuples.

Actually, I have a question: Is it better to use "flat" style like Nodes{Tuple{Node1{T1}, Node2{T2}, Node3{T3}}} instead of "linked list" style like Node1{T1, Node2{T2, Node3{T3, Nothing}}}? Or what matter is simply the depth of recursion call (so that the representation of the data structure does not matter)? I'd like to make it compiler-friendly so that Transducers.jl can be inferred and inlined as much as possible.

tkf · 2019-01-17T20:22:49Z

Just to clarify a bit more, a call chain (top to bottom) of the flat style would be:

next(::Nodes{Tuple{Node1{T1}, Node2{T2}, Node3{T3}}}, ...)
next(::Nodes{Tuple{Node2{T2}, Node3{T3}}}, ...)
next(::Nodes{Tuple{Node3{T3}}}, ...)

i.e., each next has to pop off the first node, reconstruct the tuple, and then bundle them in a struct Nodes.

On the other hand, a call chain of the linked list style would be:

next(::Node1{T1, Node2{T2, Node3{T3, Nothing}}}, ...)
next(::Node2{T2, Node3{T3, Nothing}}, ...)
next(::Node3{T3, Nothing}, ...)

It's nicer since I don't have to re-construct the structs. But I worry if it is not compiler-friendly.

JeffBezanson · 2019-01-17T21:07:07Z

The flat style is probably better. We have moved to that in Base.Iterators, for example, for Zip and Product.

tkf · 2019-01-18T03:02:20Z

Thanks. Hmm.... So I guess I have to try flat style again at some point.

But it looked like to me that the inference in linked list style was much better than the flat one. Or maybe rather the compiler was "forced" to continue inference because there is no limit? Or maybe simply my code had performance bugs...

Anyway, I suppose that the plan for this issue is that keeping it open until there is some inference/inline limit. (So I'm not touching the close button ATM.)

JeffBezanson · 2019-01-18T03:05:03Z

It's quite possible there are things we can improve in inference to make the "flat" approach work better. As always, the answer is to file more issues :)

tkf · 2019-01-18T03:32:42Z

the answer is to file more issues

I can do that 👍

(But, to be honest, I personally prefer if julia handle recursive data structure as good as tuples.)

tkf mentioned this issue Jan 17, 2019

<: vs === speed for types #30125

Open

tkf changed the title ~~Slow compilation with recursion on highly nested struct~~ Slow compilation and segfault with recursion on highly nested struct Jan 17, 2019

brenhinkeller added the performance Must go faster label Nov 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow compilation and segfault with recursion on highly nested struct #30744

Slow compilation and segfault with recursion on highly nested struct #30744

tkf commented Jan 17, 2019

JeffBezanson commented Jan 17, 2019

tkf commented Jan 17, 2019

tkf commented Jan 17, 2019

JeffBezanson commented Jan 17, 2019

tkf commented Jan 18, 2019

JeffBezanson commented Jan 18, 2019

tkf commented Jan 18, 2019

Slow compilation and segfault with recursion on highly nested struct #30744

Slow compilation and segfault with recursion on highly nested struct #30744

Comments

tkf commented Jan 17, 2019

JeffBezanson commented Jan 17, 2019

tkf commented Jan 17, 2019

tkf commented Jan 17, 2019

JeffBezanson commented Jan 17, 2019

tkf commented Jan 18, 2019

JeffBezanson commented Jan 18, 2019

tkf commented Jan 18, 2019