You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Because an isbits union is not isbits, a tuple or struct containing an isbits union is not isbits. This means that those tuples or structs will currently be heap allocated. This shouldn't be necessary though because there are no heap references involved.
In the original isbits union discussion there are comments on the problems with making isbits unions true isbits types (and at that point it was decided to introduce a separate category isbitsunion). I'm not sure what those complications are and if they particularly relate to stack allocation. Perhaps stack allocation is still possible with making isbits union true isbitstypes?
Examples illustrating the problem:
size = 10000000
# Array of isbits union. No allocation issue
fn_union(x::Int) = x > 10 ? 1 : 2.0
fn_union_loop() = [fn_union(e) for e in 1:size]
# Array of tuples with isbits union. Will heap allocate tuples
fn_union_tuple(x::Int) = (x, fn_union(x))
fn_union_tuple_loop() = [fn_union_tuple(e) for e in 1:size]
# Array of structs with manual tag. No allocations
struct Foo1
tag :: Bool
x :: Int64
y :: Float64
Foo1(x::Int64) = new(true, x, 0.0)
Foo1(x::Float64) = new(false, 0, x)
end
fn_union_struct1(x::Int) = Foo1(if iseven(x) x else Float64(x) end)
fn_union_struct1_loop() = [fn_union_struct1(e) for e in 1:size]
# Array of structs with isbits union. Will heap allocate the structs.
struct Foo2
x :: Union{Int64, Float64}
y :: Int64
end
fn_union_struct2(x) = Foo2(if iseven(x) x else Float64(x) end, x)
fn_union_struct2_loop() = [fn_union_struct2(e) for e in 1:size]
println("Array of isbits union:" )
for i in 1:3
@time fn_union_loop()
end
println("Array of tuples containing an isbits union:" )
for i in 1:3
@time fn_union_tuple_loop()
end
println("Array of structs with manual tagging of a union:" )
for i in 1:3
@time fn_union_struct1_loop()
end
println("Array of structs containing an isbits union:" )
for i in 1:3
@time fn_union_struct2_loop()
end
Output:
Array of isbits union:
0.171180 seconds (128.47 k allocations: 158.968 MiB, 2.82% gc time)
0.118740 seconds (18 allocations: 152.588 MiB, 42.04% gc time)
0.107816 seconds (18 allocations: 152.588 MiB, 36.16% gc time)
Array of tuples containing an isbits union:
2.295431 seconds (20.24 M allocations: 698.719 MiB, 20.41% gc time)
1.987941 seconds (20.00 M allocations: 686.639 MiB, 14.47% gc time)
1.988189 seconds (20.00 M allocations: 686.639 MiB, 14.47% gc time)
Array of structs with manual tagging of a union:
0.108185 seconds (66.01 k allocations: 232.131 MiB, 1.50% gc time)
0.101095 seconds (4 allocations: 228.882 MiB, 23.12% gc time)
0.114195 seconds (4 allocations: 228.882 MiB, 34.39% gc time)
Array of structs containing an isbits union:
0.486259 seconds (10.07 M allocations: 384.890 MiB, 52.27% gc time)
0.332681 seconds (10.00 M allocations: 381.470 MiB, 38.94% gc time)
0.522927 seconds (10.00 M allocations: 381.470 MiB, 61.60% gc time)
Issue is reproducible on both 0.7 and nightly from today:
Julia Version 1.1.0-DEV.299
Commit 6354405908 (2018-09-20 13:59 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
It is peculiar that the performance difference for tuples is significantly worse than for structs.
Possibly this is because of the isbits union field optimization not being applied to tuples? Possibly that should be a separate feature request?
The text was updated successfully, but these errors were encountered:
Thanks. The array of struct-with-union layout issue should be fairly tractable. The tricky part is that code probably exists that assumes isbits is synonymous with being allocated inline and having a C-compatible layout. We will have to separate those concepts (remarked on briefly here: #23567 (comment)).
Also note that fn_union_loop returns a Vector{Real} and fn_union_tuple_loop returns a Vector{Tuple{Int,Real}}. We don't merge Int and Float64 into a Union type by default. So this depends a bit on what types you have.
Tuples are tricky because of covariance --- unlike a struct with a Union-typed field, Tuple{Union{A,B}} is an abstract type and so does not have a fixed layout. I'm not sure what to do about that. One solution could be to normalize such types to unions of tuples when used as element types.
Because an isbits union is not isbits, a tuple or struct containing an isbits union is not isbits. This means that those tuples or structs will currently be heap allocated. This shouldn't be necessary though because there are no heap references involved.
In the original isbits union discussion there are comments on the problems with making isbits unions true isbits types (and at that point it was decided to introduce a separate category
isbitsunion
). I'm not sure what those complications are and if they particularly relate to stack allocation. Perhaps stack allocation is still possible with making isbits union true isbitstypes?Examples illustrating the problem:
Output:
Issue is reproducible on both 0.7 and nightly from today:
It is peculiar that the performance difference for tuples is significantly worse than for structs.
Possibly this is because of the isbits union field optimization not being applied to tuples? Possibly that should be a separate feature request?
The text was updated successfully, but these errors were encountered: