Reduce startup time in loop-through mode by 80%-90% #1747
Also starting to lazy-load I plan on looking into that once this PR has landed. Edit: I couldn't resist, so I did a quick copy-paste job to see what difference it would make, and it reduces startup time to 5.8 ms on my system. Now the bottleneck is constructing
Nice achievement, great job!
This looks great - thank you very much! Just for my understanding: The use of I can confirm the benchmark results.
I thought it would also be good to make sure that normal syntax set loading is not (significantly) slower, even if I wouldn't expect it to be. This can also be confirmed:
Yes, the use of When it is time to look over the public API and make breaking API changes, we could perhaps make it necessary for clients to handle failure to load the cache. Not sure it's worth the complexity though... I suggest we wait and see how much a problem my Thanks for the code reviews. I will try to merge this shortly. I just need some more time to explore how the code behaves when the cache is corrupt. |
So, the PR in its current form introduces a behavioral change when Step-by-step
git master behavior: Bat silently falls back to using assets (both themes and syntaxes) from the binary.
PR behavior: Bat panics:
Alternative approach
It is straightforward to also make In terms of performance with a fully valid cache and with no cache, it doesn't matter much what route we take. Here are the numbers, where I compare "this PR" with "this PR + above commit", with and without a cache, and with and without loop-through mode:
With cache:
Without cache:
I'm starting to love hyperfine more and more btw, such a great tool! :D You really have a talent for coming up with useful tools, David :)
Anyway, the question is basically: Do you think we need to worry about people poking around in the cache like this? (I currently lean more towards “no, we don’t”.)
Hm. I usually try to invest a lot of time to get rid of possible
Thank you for the feedback! If you are interested, I encourage you to learn about Warmup runs and prepare commands which can be really useful to compare warm disk cache with cold disk cache. In this scenario here, it could actually be really useful to compare the two, as we are actually loading additional files from disk. And they might not be in the (disk) cache, if a user runs
As far as I know, the panic will only happen in the scenario I described. But I do understand your concerns, so I have decided to go the extra mile and incorporate the extra change into this PR. Thanks for pushing me in the right direction 👍 (That will also be a good opportunity for me to play around with Btw, I have noticed you seem to prefer
I have used this in the past as well. I didn't know
For bonus points: hyperfine 'sleep 0.1' 'sleep 0.2' --export-markdown >(xclip)
To enable robust and user-friendly support for lazy-loading, we need variants of get_syntax_set() and syntaxes() that can fail (see discussion about panics in sharkdp#1747). This commit deprecates old public syntaxes() and introduces a failable version called get_syntaxes().
Or rather, introduce new versions of these methods and deprecate the old ones. This is preparation to enable robust and user-friendly support for lazy-loading. With lazy-loading, we don't know if the SyntaxSet is valid until after we try to use it, so wherever we try to use it, we need to return a Result. See discussion about panics in sharkdp#1747.
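To illustrate the deprecate-and-replace pattern these commit messages describe, here is a minimal sketch. It is not the actual bat code: the SyntaxSet stand-in, the AssetError type, the constructor, and the simplified return types are all assumptions made for the example. Only the method names syntaxes(), get_syntaxes() and get_syntax_set() come from the discussion above.

```rust
use std::fmt;

/// Hypothetical stand-in for the real syntax set type.
#[derive(Debug, PartialEq)]
pub struct SyntaxSet {
    pub syntaxes: Vec<String>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub struct AssetError(pub String);

impl fmt::Display for AssetError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for AssetError {}

pub struct HighlightingAssets {
    /// `None` models serialized data that cannot be turned into a syntax set.
    serialized: Option<Vec<String>>,
}

impl HighlightingAssets {
    /// Hypothetical constructor, only here to make the sketch testable.
    pub fn new(serialized: Option<Vec<String>>) -> Self {
        HighlightingAssets { serialized }
    }

    /// Failable accessor: the caller decides how to report a broken set.
    pub fn get_syntax_set(&self) -> Result<SyntaxSet, AssetError> {
        match &self.serialized {
            Some(names) => Ok(SyntaxSet { syntaxes: names.clone() }),
            None => Err(AssetError("could not load syntax set".to_string())),
        }
    }

    /// New failable version that propagates the error with `?`.
    pub fn get_syntaxes(&self) -> Result<Vec<String>, AssetError> {
        Ok(self.get_syntax_set()?.syntaxes)
    }

    /// Old infallible version, kept for compatibility but deprecated,
    /// because it can only panic when loading fails.
    #[deprecated(note = "use get_syntaxes() instead, which can report failure")]
    pub fn syntaxes(&self) -> Vec<String> {
        self.get_syntaxes().expect("syntax set could not be loaded")
    }
}
```

Existing callers of syntaxes() keep compiling (with a deprecation warning), while new callers can handle the error returned by get_syntaxes() instead of hitting a panic.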
They are just a way to get access to data embedded in the binary, so they don't conceptually belong inside HighlightingAssets. This has the nice side effect of getting HighlightingAssets::from_cache() and ::from_binary(), that are highly related, next to each other.
6797854
to
dabff4e
Compare
I have now rebased this PR on top of the merged PRs #1758, #1755 and #1755, and here are updated performance numbers and behavior. First, let's compare startup time between current git master bat and bat from this PR:
And we can confirm that the PR bat is still significantly faster in loop-through mode, and same as before in normal mode. Now let's use the new bat in this PR to compare performance with and without custom assets (that are otherwise byte-by-byte identical):
And as expected, it practically does not matter if custom or integrated assets are used. Finally, let's now see what happens if the cache is corrupt. First, let's keep
Instead of panicking, bat now explains that the cached syntax set could not be parsed. And if we remove the file completely:
bat now says that the file cannot be found. Note that this is changed behavior compared to before, where we silently used integrated assets if the cached assets were corrupt. But I think the new behavior is better. If the cache is corrupt, the user probably wants to know. If we remove the cache completely, then of course integrated assets are used:
It would be great if you could take another look at this PR before I merge.
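The three cache cases described above (corrupt file, missing file, no cache at all) can be sketched roughly as follows. This is a simplified illustration, not bat's implementation: the function name load_syntaxes, the AssetSource enum, the file name used inside the cache directory, the error wording, and the toy "magic header" format that stands in for real deserialization are all assumptions.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Where the syntaxes ended up coming from (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum AssetSource {
    /// No cache directory: use the assets embedded in the binary.
    Integrated,
    /// Cache present and valid: the syntax names read from it.
    Cache(Vec<String>),
}

/// Toy format marker standing in for real deserialization (assumption).
const MAGIC: &str = "SYNTAXES";

pub fn load_syntaxes(cache_dir: &Path) -> Result<AssetSource, String> {
    // Case 3: the cache is removed completely, so integrated assets are used.
    if !cache_dir.exists() {
        return Ok(AssetSource::Integrated);
    }

    let path = cache_dir.join("syntaxes.bin");

    // Case 2: the cache exists but the file is gone. Report it.
    let bytes = fs::read(&path).map_err(|e| match e.kind() {
        io::ErrorKind::NotFound => format!("Could not find '{}'", path.display()),
        _ => format!("Could not read '{}': {}", path.display(), e),
    })?;

    // Case 1: the file is there but corrupt. Report it instead of panicking
    // or silently falling back.
    let parse_error = || format!("Could not parse cached syntax set '{}'", path.display());
    let text = String::from_utf8(bytes).map_err(|_| parse_error())?;
    let mut lines = text.lines();
    if lines.next() != Some(MAGIC) {
        return Err(parse_error());
    }

    Ok(AssetSource::Cache(lines.map(str::to_owned).collect()))
}
```

The design point is that only a completely absent cache leads to a silent fallback; a cache that is present but broken produces an error the user can act on.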
dabff4e
to
2582b99
Compare
This looks great - thank you!
Note that this is changed behavior compared to before, where we silently used integrated assets if the cached assets were corrupt. But I think the new behavior is better. If the cache is corrupt, the user probably wants to know.
I completely agree 👍
Instead of 100 ms - 50 ms, startup takes 10 ms - 5 ms. HighlightingAssets::get_syntax_set() is never called when e.g. piping the bat output to a file (see Config::loop_through), so by loading the SyntaxSet only when needed, we radically improve startup time when it is not needed.
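A minimal sketch of the lazy-loading idea, assuming a much simplified asset type: the LazyAssets struct, its fields, the load counter, and the newline-separated "serialized" format are hypothetical and only serve to show that the expensive step runs at most once, and only when get_syntax_set() is actually called. It uses std::cell::OnceCell from the standard library, which may differ from what the PR itself uses.

```rust
use std::cell::{Cell, OnceCell};

pub struct LazyAssets {
    /// Raw bytes, cheap to hold on to (hypothetical toy format).
    serialized: Vec<u8>,
    /// Filled in on first use; stays empty in loop-through mode.
    syntax_set: OnceCell<Vec<String>>,
    /// Counts deserializations, only to make the laziness observable.
    loads: Cell<u32>,
}

impl LazyAssets {
    /// Construction does no parsing at all.
    pub fn new(serialized: Vec<u8>) -> Self {
        LazyAssets {
            serialized,
            syntax_set: OnceCell::new(),
            loads: Cell::new(0),
        }
    }

    /// Deserializes on the first call and returns the cached value afterwards.
    /// Validity is only known once we try, hence the Result.
    pub fn get_syntax_set(&self) -> Result<&[String], String> {
        if let Some(set) = self.syntax_set.get() {
            return Ok(set.as_slice());
        }
        let set = self.deserialize()?;
        Ok(self.syntax_set.get_or_init(|| set).as_slice())
    }

    /// Stand-in for the expensive deserialization step.
    fn deserialize(&self) -> Result<Vec<String>, String> {
        self.loads.set(self.loads.get() + 1);
        let text = std::str::from_utf8(&self.serialized)
            .map_err(|e| format!("could not parse syntax set: {}", e))?;
        Ok(text.lines().map(str::to_owned).collect())
    }

    pub fn load_count(&self) -> u32 {
        self.loads.get()
    }
}
```

A code path that never calls get_syntax_set(), such as loop-through mode, pays nothing for the syntax set. A failed load is not cached in this sketch, so every later call retries and reports the error again.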
2582b99
to
0386c01
Compare
CI problems are unrelated. Merged. (:tada:)
HighlightingAssets::get_syntax_set() is never called when e.g. piping the bat output to a file (see Config::loop_through), so by loading the SyntaxSet only when needed, we radically improve startup time when it is not needed.
On my low-end machine, I get the following numbers when doing time bat tests/examples/multiline.txt > /tmp/output.txt:
git master
git master + this PR
We can also benchmark with hyperfine, in which case we don't need to redirect the output manually, because the internal redirection of hyperfine is enough to trigger loop-through mode: hyperfine --export-markdown /dev/tty 'bat tests/examples/multiline.txt'
git master
bat tests/examples/multiline.txt
git master + this PR
bat tests/examples/multiline.txt
So on my system, the speedup amounts to (1-11.2/96.1)*100 ~ 88%.
Lazy-loading was one of the key aspects of my prototype on how to improve the startup speed of bat, and this PR is one step on the journey to improve startup speed in general.
I have done sanity checking of this change, and all regression tests pass, so this PR should not be far off production-quality code. But don't hesitate to give criticism when doing code review :) And if you think my approach is completely off track, I would gladly like to know!
I also want to clarify that I think this is a good move long term, even if we start to load partial syntax sets for improved performance. The ability to load the full syntax set is still useful, both for fallback purposes, and for implementing things like --list-languages.