Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy instead of reintertrept() for leaf columns #279

Merged
merged 3 commits into from
Oct 10, 2023
Merged

Conversation

Moelf
Copy link
Member

@Moelf Moelf commented Oct 10, 2023

fix #278

From 143s down to 6.4s :

Test Summary:        | Pass  Total  Time
RNTuple multicluster |    5      5  6.4s

@codecov
Copy link

codecov bot commented Oct 10, 2023

Codecov Report

All modified lines are covered by tests ✅

Files Coverage Δ
src/RNTuple/fieldcolumn_reading.jl 93.51% <100.00%> (-0.24%) ⬇️
src/RNTuple/footer.jl 96.96% <100.00%> (ø)

📢 Thoughts on this report? Let us know!.

@Moelf
Copy link
Member Author

Moelf commented Oct 10, 2023

this is weird:

Setup

using UnROOT
const f1 = UnROOT.samplefile("RNTuple/test_ntuple_int_multicluster.root")

function main()
    df = LazyTree(f1, "ntuple")
    unique(df.one_integers[1:end÷2]) == [2]
    unique(df.one_integers[end÷2+1:end]) == [1]
    df.one_integers[1] == 2
    df.one_integers[end] == 1
    length(df.one_integers) == 50000000 * 2
end

1.10

julia> @time df.one_integers[1];
  0.841748 seconds (582.83 k allocations: 419.404 MiB, 2.23% gc time, 84.14% compilation time)

julia> @time df.one_integers[end];
  0.146485 seconds (18.34 k allocations: 383.358 MiB, 15.27% gc time)

julia> @time main();
  3.735243 seconds (214.63 k allocations: 1.693 GiB, 6.12% gc time, 2.69% compilation time)

1.9

julia> @time df.one_integers[1];
  4.564458 seconds (100.66 M allocations: 1.906 GiB, 2.01% gc time, 16.69% compilation time)

julia> @time df.one_integers[end];
  3.774830 seconds (100.02 M allocations: 1.864 GiB, 2.86% gc time)

julia> @time main();
 18.655775 seconds (400.26 M allocations: 7.656 GiB, 2.59% gc time, 0.62% compilation time)

@Moelf
Copy link
Member Author

Moelf commented Oct 10, 2023

# before
julia> @time main(df)
 19.891766 seconds (600.07 M allocations: 13.605 GiB, 5.07% gc time)
true

# after
julia> @time main(df)
  3.216152 seconds (73.35 k allocations: 1.684 GiB)
true

ugh, we really need to integrate some kind of Cthulu testing

@Moelf Moelf merged commit dea08ef into main Oct 10, 2023
6 of 7 checks passed
@Moelf Moelf mentioned this pull request Oct 10, 2023
@Moelf Moelf deleted the rntupe_read_field_slow branch May 19, 2024 23:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RNTuple reading extremely slow
1 participant