simple cache for ROOTFile and remove cache for TDirectory #273

Merged Oct 10, 2023 (1 commit)

@Moelf Moelf commented Oct 10, 2023

fix #272
fix #269

I looked at what we're actually indexing on a ROOTFile and found that, when reading NanoAOD,
it indexes "Events" about 1400 times and everything else (e.g. Events/blah) only once.
So I decided to just use a naive dict as the cache.
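
For readers who want the idea in isolation, here is a minimal sketch of a naive dict cache. It is written in Python purely for illustration and is not the code in this PR; `CachedFile`, `slow_loader`, and `load_count` are hypothetical names, and the loader is a placeholder for whatever expensive lookup the real file object performs.

```python
class CachedFile:
    """Hypothetical stand-in for a file object whose key lookups are expensive."""

    def __init__(self, loader):
        self._loader = loader   # function that performs the slow lookup (assumed)
        self._cache = {}        # naive cache: key -> loaded object, never evicted
        self.load_count = 0     # counts slow loads, for demonstration only

    def __getitem__(self, key):
        # Serve repeated lookups of the same key straight from the dict.
        if key in self._cache:
            return self._cache[key]
        # First time this key is seen: do the slow load once and remember it.
        self.load_count += 1
        value = self._loader(key)
        self._cache[key] = value
        return value


def slow_loader(key):
    # Placeholder for parsing a tree or directory out of the file.
    return {"name": key}


f = CachedFile(slow_loader)
for _ in range(1400):
    f["Events"]           # loaded once, then served from the dict
f["Events/blah"]          # a different key, loaded once
print(f.load_count)       # 2
```

Unlike an LRU cache, a plain dict never evicts entries, which is acceptable when the number of distinct keys is small, as in the access pattern described above.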

In follow-up PRs I will add more robust tests so we can figure out why we had an LRU cache
for TDirectory and whether we need one in the future, and also to prevent future regressions.


The code logic is basically a rollback to v0.10.5, plus a naive cache dict maintained manually.

@Moelf Moelf force-pushed the ROOTFile_cache_rework branch from fac6198 to 751cf6d Compare October 10, 2023 11:58

Moelf commented Oct 10, 2023

@tamasgal you might want to fix the MD5 thing soon; apparently the internal API changed.

codecov bot commented Oct 10, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Files Coverage Δ
src/bootstrap.jl 86.97% <100.00%> (-0.03%) ⬇️
src/root.jl 94.11% <93.33%> (-0.66%) ⬇️

... and 2 files with indirect coverage changes


@Moelf Moelf merged commit 7f9eb14 into master Oct 10, 2023
6 of 9 checks passed
@Moelf Moelf deleted the ROOTFile_cache_rework branch October 10, 2023 12:07
@tamasgal

Oh, that MD5 comes out of nowhere. Thanks.


grasph commented Oct 10, 2023

Hello Jerry

On my side, it still takes >= 20 s (22 s) for the setup time with master (3611ee).

LazyTree display is taking a very long time (I interrupted it before it finished).


Moelf commented Oct 10, 2023

> On my side, it still takes >= 20 s (22 s) for the setup time with master (3611ee).

Do you have a sample file with not too many events? Or maybe just point me to an open data file.

> LazyTree display is taking a very long time (I interrupted it before it finished).

This is expected: reading those branches for the first time is slow, and the display needs to read the first and the last basket of each displayed branch.


grasph commented Oct 10, 2023

Thanks for looking at this. You can find a file with 10 events at https://cernbox.cern.ch/s/Vy6Yhz47vDaYiWa


grasph commented Oct 10, 2023

Hello Jerry

> On my side, it still takes >= 20 s (22 s) for the setup time with master (3611ee).
>
> LazyTree display is taking a very long time (I interrupted it before it finished).

Clarification: with v0.10.16 the setup time was 120 s, and with v0.10.15 it was 21 s, so the setup-time regression is fixed. v0.10.15 and v0.10.16 didn't have the issue with the display.

Philippe.

@Moelf Moelf mentioned this pull request Oct 10, 2023

Successfully merging this pull request may close these issues.

Performance for trees with a large number of branches
LazyTree() hang regression in 0.10.16