Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling nested structs #85

Open
tk3369 opened this issue Jul 10, 2019 · 12 comments
Open

Handling nested structs #85

tk3369 opened this issue Jul 10, 2019 · 12 comments

Comments

@tk3369
Copy link

tk3369 commented Jul 10, 2019

Hi Pietro. Great package!

Any thought about how to support nested structs? e.g.

julia> struct Point
           x::Int
           y::Int
       end

julia> struct Foo
           id::Int
           name::String
           pt::Point
       end

julia> x = [Foo(i, "foo$i", Point(2i, 3i)) for i in 1:3]
3-element Array{Foo,1}:
 Foo(1, "foo1", Point(2, 3))
 Foo(2, "foo2", Point(4, 6))
 Foo(3, "foo3", Point(6, 9))

I can do it like this:

julia> x2 = StructArray(
           id = [f.id for f in x],
           name = [f.name for f in x],
           pt = StructArray(f.pt for f in x))
3-element StructArray(::Array{Int64,1}, ::Array{String,1}, StructArray(::Array{Int64,1}, ::Array{Int64,1})) with eltype NamedTuple{(:id, :name, :pt),Tuple{Int64,String,Point}}:
 (id = 1, name = "foo1", pt = Point(2, 3))
 (id = 2, name = "foo2", pt = Point(4, 6))
 (id = 3, name = "foo3", pt = Point(6, 9))

It's a little bit unsatisfactory since I have to enumerate all non-struct fields (id and name in the above) manually...

@piever
Copy link
Collaborator

piever commented Jul 10, 2019

Thanks for the nice words!

Nested structures are supported but unfortunately undocumented (I'm keeping the issue open to remember to add docs)... When calling StructArray you can specify with unwrap the types that you want to apply the array of structs to struct of arrays transform again. For example:

julia> x = [Foo(i, "foo$i", Point(2i, 3i)) for i in 1:3]
3-element Array{Foo,1}:
 Foo(1, "foo1", Point(2, 3))
 Foo(2, "foo2", Point(4, 6))
 Foo(3, "foo3", Point(6, 9))

julia> s = StructArray(x, unwrap = t -> t <: Point)
3-element StructArray(::Array{Int64,1}, ::Array{String,1}, StructArray(::Array{Int64,1}, ::Array{Int64,1})) with eltype Foo:
 Foo(1, "foo1", Point(2, 3))
 Foo(2, "foo2", Point(4, 6))
 Foo(3, "foo3", Point(6, 9))

julia> s.pt
3-element StructArray(::Array{Int64,1}, ::Array{Int64,1}) with eltype Point:
 Point(2, 3)
 Point(4, 6)
 Point(6, 9)

@tk3369
Copy link
Author

tk3369 commented Jul 11, 2019

BTW, I'm writing a design patterns book about Julia, and I'm going to feature your package in the "Struct of Arrays" pattern.

@cscherrer
Copy link

Hi @tk3369 , I have a start on this here:
#160

I think the approach will work, just a question of integrating into what's there. Possible it could need a fair amount of refactoring

@cscherrer
Copy link

Oops, I had missed the solution above. I started working on this after reading #126. @piever could you please close that if it's no longer the case?

I'm not sure whether there would be a benefit to the refactoring I was doing, I'll need to have a closer look

@piever
Copy link
Collaborator

piever commented Dec 10, 2020

Yes, I'm closing #126 because nested structs are not broken and the approach suggested there work.

Just ping me when you PR is ready for review (and of course if it turns out that there is a benefit). OTOH, if the current API allows to do what you need but is underdocumented, a doc PR would be great!

@cscherrer
Copy link

Thanks @piever ,

In the example above, an array is allocated, and then (I assume) garbage collected after it's used to construct the more efficient StructArray. There's a fair amount of over head in this initial allocation, which I'd like to avoid.

Is there a good way to do this? I was trying something like this, which (obviously) doesn't work:

julia> using StructArrays

julia> x = (a=[1,2],b=(c=[3,4],d=([5,6],[7,8])))
(a = [1, 2], b = (c = [3, 4], d = ([5, 6], [7, 8])))

julia> StructArray(x; unwrap = T -> T <: Union{Tuple, NamedTuple})
ERROR: BoundsError: attempt to access 0-element Array{Any,1} at index [1:1]
Stacktrace:
 [1] copyto!(::Array{Any,1}, ::Int64, ::StructArray{Array{Int64,1},1,NamedTuple{(),Tuple{}},Int64}, ::Int64, ::Int64) at ./abstractarray.jl:905
 [2] _widenarray at /home/chad/git/StructArrays.jl/src/collect.jl:126 [inlined]
 [3] _widenstructarray at /home/chad/git/StructArrays.jl/src/collect.jl:114 [inlined]
 [4] widen_from_type at /home/chad/git/StructArrays.jl/src/collect.jl:109 [inlined]
 [5] widen_from_instance(::StructArray{Array{Int64,1},1,NamedTuple{(),Tuple{}},Int64}, ::Int64, ::NamedTuple{(:c, :d),Tuple{Array{Int64,1},Tuple{Array{Int64,1},Array{Int64,1}}}}) at /home/chad/git/StructArrays.jl/src/collect.jl:105
 [6] collect_to_structarray!(::StructArray{Array{Int64,1},1,NamedTuple{(),Tuple{}},Int64}, ::NamedTuple{(:a, :b),Tuple{Array{Int64,1},NamedTuple{(:c, :d),Tuple{Array{Int64,1},Tuple{Array{Int64,1},Array{Int64,1}}}}}}, ::Int64, ::Int64) at /home/chad/git/StructArrays.jl/src/collect.jl:77
 [7] _collect_structarray! at /home/chad/git/StructArrays.jl/src/collect.jl:59 [inlined]
 [8] #_collect_structarray#101 at /home/chad/git/StructArrays.jl/src/collect.jl:54 [inlined]
 [9] collect_structarray(::NamedTuple{(:a, :b),Tuple{Array{Int64,1},NamedTuple{(:c, :d),Tuple{Array{Int64,1},Tuple{Array{Int64,1},Array{Int64,1}}}}}}; initializer::StructArrays.StructArrayInitializer{var"#7#8",typeof(StructArrays.default_array)}) at /home/chad/git/StructArrays.jl/src/collect.jl:40
 [10] StructArray(::NamedTuple{(:a, :b),Tuple{Array{Int64,1},NamedTuple{(:c, :d),Tuple{Array{Int64,1},Tuple{Array{Int64,1},Array{Int64,1}}}}}}; unwrap::Function) at /home/chad/git/StructArrays.jl/src/structarray.jl:81

@piever
Copy link
Collaborator

piever commented Dec 13, 2020

If storage is already column-based, there should be no allocations at all. In your case, I think the easiest is to define a custom helper function to construct things recursively, for example

julia> using StructArrays

julia> to_array(s::AbstractArray) = s
to_array (generic function with 1 method)

julia> to_array(nt::Union{Tuple, NamedTuple}) = StructArray(map(to_array, nt))
to_array (generic function with 2 methods)

julia> x = (a=[1,2],b=(c=[3,4],d=([5,6],[7,8])));

julia> to_array(x)
2-element StructArray(::Array{Int64,1}, StructArray(::Array{Int64,1}, StructArray(::Array{Int64,1}, ::Array{Int64,1}))) with eltype NamedTuple{(:a, :b),Tuple{Int64,NamedTuple{(:c, :d),Tuple{Int64,Tuple{Int64,Int64}}}}}:
 (a = 1, b = (c = 3, d = (5, 7)))
 (a = 2, b = (c = 4, d = (6, 8)))

julia> using BenchmarkTools

julia> @btime to_array($x);
  3.978 ns (0 allocations: 0 bytes)

I am not sure this should happen by default. IMO, if the user wants to do things recursively, they should be explicit about it: it's just one extra line of code. If this is not intuitive, maybe one of these examples could appear in the README (it seems like it has tripped up people before).

@cscherrer
Copy link

Great, thank you!

For a while it seemed StructArrays might not have good support for this, so I started working on https://github.com/cscherrer/NestedTuples.jl.

My use case requires deep nesting, so that was my focus. It's very rough compared to StructArrays, but for deeply nested structures it still seems to have an advantage. Say we have

using StructArrays
using NestedTuples
using Accessors
using BenchmarkTools

to_structarray(s::AbstractArray) = s

to_structarray(nt::Union{Tuple, NamedTuple}) = StructArray(map(to_structarray, nt))

r() = randn(100)

x = (a = r(), b = (c = r(), d= (e = r(), f = (g = r(), h = (i = r(), j = (k = r(), l = (m = r(), n = (o = r(), p = (q = r(), r = (s = r(), t = (u = r(), v = (w = r(), x = (y = r(), z = r())))))))))))));

Then

julia> sa = to_structarray(x);

julia> ta = TupleArray(x);

julia> @btime sa = to_structarray($x);
  26.352 ns (0 allocations: 0 bytes)

julia> @btime ta = TupleArray($x);
  18.108 ns (0 allocations: 0 bytes)

julia> @btime @inbounds $sa[3];
  15.360 ns (0 allocations: 0 bytes)

julia> @btime @inbounds $ta[3];
  9.406 ns (0 allocations: 0 bytes)

julia> @btime @inbounds $sa[2] = $sa[3];
  29.271 ns (0 allocations: 0 bytes)

julia> @btime @inbounds $ta[2] = $ta[3];
  20.179 ns (0 allocations: 0 bytes)

It's not clear to me where to go form here. Operations like this will be in inner loops, so I need as much speed as I can get. But it seems silly to have two packages with such similar functionality. Would it make sense for some of this to be incorporated into StructArrays?

@piever
Copy link
Collaborator

piever commented Dec 14, 2020

Performance improvements are always welcome, so I guess it'd be good to see where the timing difference comes from, and see if the same trick can be used here. These things are a bit tricky though, because when the named tuples get too large, compile times can be an issue.

Are you benchmarking the released version of StructArrays or the last commit? There have been some unreleased performance improvements for the nested case (in #141), so master branch should be faster than the released version.

@cscherrer
Copy link

Good point. Some extra compile time is ok if you can make it up in performance of the compiled code, but there's certainly a tradeoff to keep in mind.

I ran benchmarks are on the master branch. I think I may be getting some nice speedups from my leaf_setter function. Say you have a TupleArray with this schema:

julia> schema(ta)
(a = Array{Float64,1}, b = (c = (Array{Float64,1}, Array{Float64,1}), d = Array{Float64,1}))

Then leaf_setter gives you a function to reconstruct this from the leaves:

julia> NestedTuples.leaf_setter(getfield(ta, :data))
function = (##256, ##257, ##258, ##259;) -> begin
    begin
        (a = var"##256", b = (c = (var"##257", var"##258"), d = var"##259"))
    end
end

So you can do

julia> NestedTuples.leaf_setter(getfield(ta, :data))(1,2,3,4)
(a = 1, b = (c = (2, 3), d = 4))

I use this as a foundation for most other functions.

@piever
Copy link
Collaborator

piever commented Dec 15, 2020

The constructor performance may have been fixed by #162 (just in time :) ). I guess it'd be useful to see what happens when you use a StructArray of tuples, because it looks like you are comparing with a TupleArray, so basically you'd do (r(), (r(), r()), ...) rather than (a = r(), b = (c = r(), d = r()), ...).

I suspect that setindex has been optimized for the nested case (it "unrolls" the recursion doing the "code generation" part of the generated function), but getindex could maybe do better. For now it uses this simple recursive building block here.

setindex relies on foreachfield: it be great if it can be further optimized as many functions use it, but it should already be quite good.

Unfortunately, this is a "low-level" package, so it really should not take extra dependencies, but if there is a smart way to improve performance without adding deps, that would be great. My best guess is that get_ith can use the same technique as foreachfield and "unroll the recursion". (The extra complication is that whatever new method is used, it should be compatible with CUDA kernels, where StructArrays can at the moment be used. Unfortunately, there are no automated tests for that, but when there is a concrete proposal we can manually check with the code snippet from #114 (comment).)

@tpapp
Copy link

tpapp commented Apr 12, 2024

Reviving this issue, it is my impression from some simple benchmarks that this feature works fine and is performant and non-allocating for views, which is already great. Eg

julia> using StructArrays, BenchmarkTools

julia> sa = StructArray([(x = i, yz = (y = 2*i, z = 3*i)) for i in 1:5000];
       unwrap = t -> t <: NamedTuple);

julia> f(sa) = sum(sa.yz.z)
f (generic function with 1 method)

julia> @benchmark f($sa)
BenchmarkTools.Trial: 10000 samples with 234 evaluations.
 Range (min  max):  318.333 ns  431.923 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     324.372 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   325.627 ns ±   7.200 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

        ▂▆▃▁▅█▄ ▁▃                                               
  ▁▁▄▆▆▆███████████▅▆▇▅▄▄▄▃▃▃▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  318 ns           Histogram: frequency by time          349 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

So my understanding is that the only thing that needs to be done to resolve the issue is documentation. It is in the docstring, but should be mentioned in the text of the manual. (If this is the case, I am happy to do a simple docs PR to this great package).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants