[WIP] resurrecting in place ntoh
#112
Conversation
I know it probably doesn't get much better, but this is very messy :(
Actually:

```
# before
(14304 allocations: 1.82 GiB)
# after
(16241 allocations: 1.26 GiB)
```

I think this means we shouldn't do this; the increase in the number of allocations probably caused the slowdown. And the difference isn't as big as the 2x in the
I agree it’s not pretty. I’ll keep this an open WIP for now since at least it works. If we get inspiration for making it better and truly in place (?) that would be great :)
I also think that this needs a bit more time.
Brainstorming a bit more. What about a thin wrapper type? Instead of returning a `Ref{UInt8}` everywhere and storing it in the `LazyBranch` to prevent the underlying data from being garbage-collected:

```julia
struct Wrapper{T} <: AbstractVector{T}
    x::Vector{T}
    ref::Ref{Vector{UInt8}}
end

Base.@propagate_inbounds Base.getindex(w::Wrapper{T}, ind::Int) where T = w.x[ind]
Base.size(w::Wrapper) = size(w.x)
```

And then with minimal changes, we eliminate the materialization associated with `reinterpret`:

```diff
 function LazyBranch(f::ROOTFile, b::Union{TBranch,TBranchElement})
     T, J = auto_T_JaggT(f, b; customstructs=f.customstructs)
     T = (T === Vector{Bool} ? BitVector : T)
-    _buffer = T[]
+    _buffer = Wrapper{eltype(T)}([], Ref(UInt8[]))
     if J != Nojagg
```

```diff
 function interped_data(rawdata, rawoffsets, ::Type{T}, ::Type{J}) where {T, J<:JaggType}
     if J === Nojagg
-        return ntoh.(reinterpret(T, rawdata))
+        p = convert(Ptr{eltype(T)}, pointer(rawdata))
+        w = unsafe_wrap(Array, p, length(rawdata) ÷ sizeof(eltype(T)))
+        w .= ntoh.(w)
+        return Wrapper(w, Ref(rawdata))
```

```
julia> @btime sum(tf.nMuon) # before
  306.325 ms (1794 allocations: 469.67 MiB)

julia> @btime sum(tf.nMuon) # after
  206.946 ms (1716 allocations: 234.91 MiB)
```

I think it's not too hard to generalize to VoV since it can wrap any `AbstractVector`. I.e.,

```
julia> using ArraysOfArrays

julia> w = Wrapper([1,2,3,4], Ref(UInt8[]));

julia> VectorOfVectors(w, [1,2,5])
2-element VectorOfVectors{Int64, UnROOT.Wrapper{Int64}, Vector{Int64}, Vector{Tuple{}}}:
 [1]
 [2, 3, 4]
```
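To make the idea above runnable end to end, here is a self-contained sketch of the same `Wrapper` pattern; the synthetic `rawdata` byte buffer is made up for the demo and stands in for a decompressed ROOT basket:

```julia
# Minimal demo of the thin-wrapper idea: reuse a raw byte buffer's memory
# as a typed array, byte-swap it in place, and keep the buffer alive via a Ref.
struct Wrapper{T} <: AbstractVector{T}
    x::Vector{T}
    ref::Ref{Vector{UInt8}}   # keeps the raw byte buffer from being GC'd
end
Base.@propagate_inbounds Base.getindex(w::Wrapper, ind::Int) = w.x[ind]
Base.size(w::Wrapper) = size(w.x)

# Fake "raw" big-endian bytes encoding the UInt32 values 1:4
rawdata = collect(reinterpret(UInt8, hton.(UInt32.(1:4))))

# Wrap the same memory as a UInt32 array and byte-swap in place (no copy)
p = convert(Ptr{UInt32}, pointer(rawdata))
w = unsafe_wrap(Array, p, length(rawdata) ÷ sizeof(UInt32))
w .= ntoh.(w)

out = Wrapper(w, Ref(rawdata))
@assert collect(out) == UInt32.(1:4)
```

Note the `@assert` at the end: because `Wrapper <: AbstractVector`, generic code like `collect`, `sum`, or broadcasting works on it without further methods.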
I wonder if @oschulz has any opinion. In fact, it would be nice to have a variation of
What happens if we make our own
I'm not sure I've understood all the implications here - are you looking for something like a
The problem is we have an original byte array
But why can't we use
Because
I wonder why `ReinterpretArray` is performing that much worse compared to our custom wrapper. Well, probably can't be helped right now. In that case I would recommend using a
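For background on the comparison being discussed: `reinterpret` on a `Vector` returns a lazy `Base.ReinterpretArray` view (zero-copy), and broadcasting `ntoh.(...)` over it materializes a fresh `Vector` each time. A quick generic sketch of the types involved (not UnROOT code):

```julia
# 32 raw bytes encoding the UInt32 values 1:8 in native byte order
raw = collect(reinterpret(UInt8, UInt32.(1:8)))

r = reinterpret(UInt32, raw)        # lazy, zero-copy view over raw
@assert r isa Base.ReinterpretArray
@assert length(r) == 8

c = ntoh.(r)                        # broadcast materializes a new Vector
@assert c isa Vector{UInt32}
@assert ntoh.(c) == r               # byte-swapping twice round-trips
```

Scalar reads through the lazy view reassemble each element from bytes, which is one plausible source of the slowdown relative to a plain `Vector`-backed wrapper.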
JuliaLang/julia#42227 (comment) on Julia master, |
It might be that zlib decompression + the calculation is washing out the true benchmark.
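One way to sidestep that is to benchmark only the byte-swap step on already-decompressed bytes. This is a sketch assuming BenchmarkTools is available; `raw` is synthetic stand-in data, not a real basket:

```julia
using BenchmarkTools  # assumed available; provides @btime

# Pre-"decompressed" bytes, so zlib cost can't wash out the comparison
raw = rand(UInt8, 4 * 10^6)

# Copying path: reinterpret + broadcast allocates a fresh Vector
@btime ntoh.(reinterpret(UInt32, $raw));

# In-place path: wrap the same memory and swap bytes in place
@btime begin
    w = unsafe_wrap(Array, convert(Ptr{UInt32}, pointer($raw)), length($raw) ÷ 4)
    w .= ntoh.(w)   # note: mutates raw; fine for timing purposes only
end;
```

Interpolating `$raw` keeps BenchmarkTools from treating the global as an untyped variable, so the timings reflect the swap itself.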
Resurrection of #101 (which means some copy-pasting from @Moelf ;) ). Basically, we add an `inplace::Bool` to `interped_data` which toggles between doing `ntoh.(reinterpret(...))` (copy) or an in-place variant, both returning the same type, which means it's still type stable. But with a small modification, `basketarray()`'s return type depends on the boolean. Probably this can be fixed with a barrier function to discard the `Ref{UInt8[]}` if called with `inplace=false` (the default). Unit tests all pass.
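The barrier-function idea could look roughly like this; `basketarray_sketch` and `_strip` are hypothetical stand-ins for illustration, not the actual PR code:

```julia
# The in-place path carries (data, Ref) to keep the raw bytes alive; the
# copy path returns a plain Vector. `_strip` is the barrier: dispatch
# resolves the difference before the value escapes, so callers see a
# Vector{Int32} either way and stay type stable.
_strip(x::Vector{Int32}) = x
_strip(x::Tuple{Vector{Int32},Base.RefValue{Vector{UInt8}}}) = x[1]

function basketarray_sketch(rawdata::Vector{UInt8}; inplace::Bool=false)
    if inplace
        p = convert(Ptr{Int32}, pointer(rawdata))
        w = unsafe_wrap(Array, p, length(rawdata) ÷ sizeof(Int32))
        w .= ntoh.(w)                                   # swap in place
        return _strip((w, Ref(rawdata)))
    else
        return _strip(ntoh.(reinterpret(Int32, rawdata)))  # copying path
    end
end
```

In the real code the `Ref` would of course need to survive alongside the data (e.g. inside the `Wrapper` discussed above) rather than be discarded on the in-place path; the sketch only shows how the barrier makes the return type independent of the boolean.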
before
after
Performance is pretty much the same but with less total allocated memory. If I profile `bar(t)`, I see there's a materialization. I thought this should be in place??