Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add lazy Chain/Vcat for Tree #131

Merged
merged 5 commits into from
Oct 8, 2021
Merged

add lazy Chain/Vcat for Tree #131

merged 5 commits into from
Oct 8, 2021

Conversation

Moelf
Copy link
Member

@Moelf Moelf commented Oct 7, 2021

Closes #72

julia> const r = LazyTree(ROOTFile("./Run2012BC_DoubleMuParked_Muons.root"), "Events", ["nMuon", "Muon_phi"]);

julia> const two_r = vcat(r,r);

julia> @time sum(x->x.nMuon, r)
  1.130108 seconds (2.30 M allocations: 662.258 MiB, 4.51% gc time, 45.83% compilation time)
0x0000000008e67ad8

julia> @time sum(x->x.nMuon, two_r)
  1.806890 seconds (416.44 k allocations: 1.083 GiB, 2.06% gc time, 4.37% compilation time)
0x0000000011ccf5b0

julia> length(two_r) == 2*length(r)
true

julia> 0x0000000008e67ad8 *2 == 0x0000000011ccf5b0

julia> @time reduce(vcat, [r,r]);
  0.000570 seconds (79 allocations: 5.750 KiB)

julia> @time mapreduce(x->view(x, 1:10^6), vcat, [r,r]);
  0.083790 seconds (187.75 k allocations: 10.239 MiB, 14.86% gc time, 99.33% compilation time)

julia> @time mapreduce(x->view(x, 1:10^7), vcat, [r,r]);
  0.045763 seconds (173.21 k allocations: 9.556 MiB, 98.92% compilation time)

In no hurry to merge, please give it a try ;)

TODO:

  • : test

@codecov
Copy link

codecov bot commented Oct 7, 2021

Codecov Report

Merging #131 (90141c4) into master (cfacd31) will decrease coverage by 0.24%.
The diff coverage is 75.00%.

❗ Current head 90141c4 differs from pull request most recent head 0840c02. Consider uploading reports for the commit 0840c02 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #131      +/-   ##
==========================================
- Coverage   92.38%   92.13%   -0.25%     
==========================================
  Files          11       11              
  Lines        1418     1450      +32     
==========================================
+ Hits         1310     1336      +26     
- Misses        108      114       +6     
Impacted Files Coverage Δ
src/UnROOT.jl 100.00% <ø> (ø)
src/iteration.jl 89.26% <75.00%> (-0.56%) ⬇️
src/displays.jl 91.13% <0.00%> (-3.60%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cfacd31...0840c02. Read the comment docs.

@Moelf
Copy link
Member Author

Moelf commented Oct 7, 2021

julia> @time let nmu=0
           for evt in r
               nmu += evt.nMuon
           end
           nmu
       end
  0.705541 seconds (11.37 k allocations: 544.327 MiB, 1.61% gc time)
149322456

julia> @time let nmu=0
           for evt in two_r
               nmu += evt.nMuon
           end
           nmu
       end
  1.532861 seconds (22.73 k allocations: 1.063 GiB, 1.81% gc time)
298644912

and jagged

julia> @time let nmu=0
           for evt in r
               nmu += length(evt.Muon_phi)
           end
           nmu
       end
  4.039776 seconds (18.39 k allocations: 2.967 GiB, 1.76% gc time)
149322456

julia> @time let nmu=0
           for evt in two_r
               nmu += length(evt.Muon_phi)
           end
           nmu
       end
  7.949935 seconds (36.78 k allocations: 5.934 GiB, 1.21% gc time)
298644912

@Moelf
Copy link
Member Author

Moelf commented Oct 7, 2021

@batch multi-threading works as expected (speed && correctness)

#avoid false sharing
julia> @time let nmu=zeros(Int, Threads.nthreads()*10)
           @batch for evt in two_r
               nmu[Threads.threadid()*10] += length(evt.Muon_phi)
           end
           sum(nmu)
       end
  2.356330 seconds (229.16 k allocations: 6.024 GiB, 1.22% gc time, 4.17% compilation time)
298644912

@aminnj
Copy link
Member

aminnj commented Oct 7, 2021

looks good to me in my limited testing. It's interesting to see how much functionality you can add with only 10-15 lines :)

@Moelf
Copy link
Member Author

Moelf commented Oct 7, 2021

for posterity, here's an attempt using ChainedVector from https://github.com/JuliaData/SentinelArrays.jl :

julia> function myvcat(t1::LazyTree, t2::LazyTree)
           c1 = columns(UnROOT.innertable(t1))
           c2 = columns(UnROOT.innertable(t2))
           LazyTree(
           Table(NamedTuple(k=>ChainedVector([c1[k], c2[k]]) for k in keys(c1)))
           )
       end

julia> const _blah = myvcat(r,r);

julia> const two_r = vcat(r,r); # this PR

julia> @time sum(x->length(x.Muon_phi), _blah)
  8.517592 seconds (96.67 k allocations: 5.937 GiB, 1.92% gc time, 0.77% compilation time)
298644912

julia> @time sum(x->length(x.Muon_phi), two_r) # this PR
  7.866316 seconds (110.78 k allocations: 5.938 GiB, 1.98% gc time, 1.07% compilation time)
298644912

I also tried:

LazyTree(
           Table(TableOperations.joinpartitions(Tables.partitioner([c1, c2])))
           )

which is basically automatic ChainedVector and is similarly slower than the current PR

@Moelf Moelf merged commit 12e01ed into master Oct 8, 2021
@Moelf Moelf deleted the vcat_chain_tree branch October 8, 2021 01:02
Moelf added a commit to aminnj/UnROOT.jl that referenced this pull request Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TChain?
2 participants