Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify array creation when not Offsetjagg #90

Merged
merged 4 commits into from
Sep 10, 2021

Conversation

aminnj
Copy link
Member

@aminnj aminnj commented Sep 10, 2021

When there's no 10 byte offset, VectorOfVectors can use the raw offsets almost directly. Basically all that's needed is

offset = convert(Vector{Int64}, rawoffsets)
offset .÷= _size
offset .+= 1

I got a 0-10% speedup on the uncompressed DoubleMu file with this change. If you know of faster replacement for those 3 lines, we could go even faster! 🚀 :)

@aminnj
Copy link
Member Author

aminnj commented Sep 10, 2021

@Moelf I should be able to get rid of offset = convert(Vector{Int64}, rawoffsets) and use the Int32 offsets due to JuliaArrays/ArraysOfArrays.jl@eca846a, but I still get complaints, even though my ArraysOfArrays is v"0.5.4"...

@JuliaHEP JuliaHEP deleted a comment from codecov bot Sep 10, 2021
@codecov

This comment has been minimized.

@aminnj
Copy link
Member Author

aminnj commented Sep 10, 2021

Thanks to @Moelf for the magic lines to get Int32 offsets everywhere. If there is any speedup, it's slight, but not negative at least.

@Moelf
Copy link
Member

Moelf commented Sep 10, 2021

https://github.com/tamasgal/UnROOT.jl/blob/65bc44d4af4cdc80a94fe7479017d6d46fa5e80b/src/custom.jl#L56

you also need to change this, it's alarming our test doesn't detect this

@aminnj
Copy link
Member Author

aminnj commented Sep 10, 2021

Test added separately in #92

@Moelf Moelf merged commit fe11279 into JuliaHEP:master Sep 10, 2021
@aminnj
Copy link
Member Author

aminnj commented Sep 10, 2021

Since

julia> Tuple{SubArray{UInt8, 1, Vector{UInt8}, Tuple{UnitRange{Int64}}, true}, Vector{Int32}} <: Tuple{Vector{UInt8},Vector{Int32}}
false

that means we were incurring the non-trivial cost of convert() at the end of readbasketseek(). However, we can't just remove the type annotation because then there are downstream issues with the @view rawdata, so the @view is removed. Total allocations slightly decreases, but performance is basically the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants