Faster startup #163
base: master
Conversation
A bit faster, but much faster is possible:
Codecov Report
```diff
@@            Coverage Diff            @@
##           master     #163       +/-   ##
===========================================
- Coverage   96.99%   82.74%   -14.26%
===========================================
  Files          12       13        +1
  Lines         333      394       +61
===========================================
+ Hits          323      326        +3
- Misses         10       68       +58
```
Continue to review full report at Codecov.
I was looking into the slowness of loading LibPQ.jl, and this package was a big offender. Compared to (as-is on Julia 1.6 with default settings, without changing these packages):
It seems like in your port, the "flexibility" (I assume new features) is what's costly. I didn't look into whether there's an AbstractLogging.jl, but maybe it should exist, or how compatible the APIs are. If startup can't be improved (more), could choosable logging be done (e.g. in LibPQ)?
Yes, Memento provides more flexibility for fine-tuning your logging across larger applications that require hundreds of packages/modules. In our production systems, we often need to turn on debug logging for very specific packages/modules (or chains of them). AFAIK, none of the current solutions provide that kind of flexibility, and because our applications run for long periods of time, the startup cost is a reasonable compromise for us. I don't think disabling optimization is the right approach here, because then you're just pushing that startup cost to the runtime. I think a better approach might be to look at what's changed in Julia recently that might allow us to improve type stability, compilation time and performance overall, while maintaining the features that distinguish this package from some other alternatives. Alternatively, we could put some work into gradually adding the desired features from this package into Base in some way... but that'll likely require more work, negotiations and compromises.
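As a concrete illustration of that per-package fine-tuning, here is a minimal sketch using Memento's `getlogger`/`setlevel!` API (the package names are just examples, not taken from this thread):

```julia
using Memento

Memento.config!("info")   # root logger and handler at "info"

# Turn on debug logging for one specific package (and, via the dotted-name
# logger hierarchy, its sub-loggers) without touching anything else.
setlevel!(getlogger("LibPQ"), "debug")

debug(getlogger("LibPQ"), "emitted: this logger is set to debug")
debug(getlogger("SomeOtherPkg"), "filtered out by the root's info level")
```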
I just assumed this code isn't speed-critical, since it's "just" logging, right? I/O bound? Maybe I overlooked something; at least -O0 isn't too slow, and -O1 is also a middle ground for you to consider. I realize you're a company (and maybe not too affected by the startup); I'm more concerned for your LibPQ.jl, where it (mostly, but not only, because of Memento) has slow startup (compared to wrapper drivers for other databases), which may be a bit annoying for many users. Any idea how much work it would be to migrate it away from Memento? For my use case, I only need trivial logging, if any.
Yes, for logs that are actually emitted the code would be I/O bound, but what about cases where the logs shouldn't be emitted based on log level or handler? Base logging avoids this problem with macros that turn your code into no-ops, while Memento uses lazy evaluation with more flexible filters that may not be known at compile time.
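To make the difference concrete, a rough sketch (the do-block lazy form is how I understand Memento's function-message methods; treat the exact signature as an assumption and check your version):

```julia
using Logging, Memento

expensive() = join(rand(10_000), ",")   # stand-in for a costly message

# Base logging: `@debug` checks whether debug records are enabled *before*
# evaluating the message expression, so a disabled call is close to a no-op.
@debug "payload: $(expensive())"

# Memento: a plain function call evaluates its arguments eagerly...
logger = getlogger("example")
debug(logger, "payload: $(expensive())")   # string is built even if filtered out

# ...unless the message is passed lazily, so it is only constructed when a
# handler actually emits the record.
debug(logger) do
    "payload: $(expensive())"
end
```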
Memento can hook into the Base logging system with a few caveats.
If you'd still like to move forward with trying to make that change to LibPQ.jl then I'd recommend making an issue there and one of our maintainers for that package can decide if it's worth it. Moving forward here, I'd recommend that we open an issue to look into what's slowing down the load time and see if there's anything we can do about it more directly. I suspect that it might be tied to the
FWIW, the reason
Yep, looks like it's because we're actually running some code during loading. Without the
and with it commented out I get:
NOTE: These are first load times, but after precompilation.
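For reference, one way to reproduce this kind of measurement (a sketch): start a fresh session at the optimization level under test, e.g. `julia -O2`, `julia -O1`, or `julia -O0`, and run:

```julia
# First load after precompilation; reports wall time plus allocations, which is
# useful given the note below about looking at allocations rather than exact times.
@time using Memento
```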
I would still time with a lower optimization level (at least
Okay, I think the better approach might be to use SnoopCompile.jl to generate a set of precompile statements that'll improve load times. Ideally, this'll push the issue to a package build/precompile time problem and we don't have to worry about losing runtime performance. I should have a PR shortly.
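For anyone following along, a sketch of that SnoopCompile workflow (using the `@snoopi`/`parcel`/`write` API current around the time of this thread; newer SnoopCompile releases replace `@snoopi` with `@snoopi_deep`, so adjust accordingly):

```julia
using SnoopCompile

# Record method instances whose type inference took at least ~10 ms while the
# package loads. (A `begin ... end` block exercising typical calls can be
# snooped instead, to catch more than just load-time inference.)
inf_timing = @snoopi tmin=0.01 using Memento

# Group the resulting precompile statements by the module that owns them and
# write them out as files that can be `include`d from the package source.
pc = SnoopCompile.parcel(inf_timing)
SnoopCompile.write("/tmp/precompile", pc)
```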
Okay, looks like I may need to add some similar
Alright, I wasn't sure how to fix TimeZones.jl directly, so I've just opted to cache the output of
vs
NOTE: It's probably better to look at the allocations rather than the exact times here. Apart from redesigning Memento's architecture, I'm not sure how much better we can do, because there's a certain amount that Memento needs to do at load time in terms of setting up the logger hierarchy.
Alright, I've merged and tagged that PR if you want to give it a spin.
The timing on my machine is a little noisy, but it seems I get just as fast a (min) loading speed with -O1 as with -O0, under half a second better than on v1.1.1. Feel free to test the speed with my PR, or rather with it modified to -O1, or just close this PR, as you've done a good job already. I could finally install, though not without some problems (in Genie or another package, still unresolved): GenieFramework/Genie.jl#269 There's some new stuff on Julia master to help with package loading, and my PR here could use it, but I'm not sure it has landed; it will be in 1.6 at the earliest (or some 1.6-DEV?).
Feel free to just close this issue, since on Julia 1.6 master (without the PR) it's not too bad (or merge with an updated PR for the less conservative -O1, which is probably fast at runtime):
FYI: This is on the latest Memento@v1.1.1, but its master seems a bit slower...
I've asked around for someone to clarify what the optimization levels actually do. Once I have an idea of what this is likely to change during runtime then I'll write some benchmarks and compare. It could be that setting it to 0 doesn't make a difference 99% of the time.
Jeff wrote that it only changes the LLVM optimization passes. Julia, however, does some optimization before it gets to that. Julia also has --inline={yes|no}, and even -O0 doesn't imply no inlining.
There are more per-module options coming in 1.6: JuliaLang/julia#37041 |
Yes, thanks, I saw that, but I lost the issue number and was actually going to look it up again. FYI: the options already in this PR (and, I assume, also those new options) are not as effective as I would like. If you put them in your module, one or more dependencies may be the real reason for the slow loading (I think that's the case here), and adding this stuff to your module has, I think, no effect on sub-modules (it's also unclear whether you would want that). That's why I recently added this option to JLLs (they can have huge trees of JLL dependencies, and I added it to those too, 70 PRs), and that still seemed like a failed experiment (vs. starting with the global option). I didn't, however, change all the modules that have the JLLs as dependencies.
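For reference, this is roughly what those per-module switches look like (a sketch; `Base.Experimental.@optlevel` exists since Julia 1.5, the extended `@compiler_options` form comes from JuliaLang/julia#37041 for 1.6, and the module name and exact option combination below are assumptions, not taken from this PR):

```julia
module MyLoggingHeavyPkg  # hypothetical module

# Julia 1.5+: lower only the LLVM optimization level for code in this module.
Base.Experimental.@optlevel 1

# Julia 1.6+ (per #37041): also control compilation and inference, e.g. the
# kind of settings added to JLL wrappers to cut load time:
# Base.Experimental.@compiler_options compile=min optimize=0 infer=false

# ... rest of the package ...

end
```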
Hmmm, I don't think
That seems like a pretty minimal difference. I think a better approach going forward might be to re-architect Memento to focus more on zero-cost abstractions. For example:
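A sketch of one zero-cost-style approach, assuming a macro-based enabled-check (illustrative only; this is not Memento's actual design, and it may not be what was intended above):

```julia
# Purely illustrative: a module-level flag plus a macro that skips message
# construction entirely when debug output is off.
const DEBUG_ENABLED = Ref(false)

macro cheap_debug(msg)
    quote
        if DEBUG_ENABLED[]
            println(stderr, "[debug] ", $(esc(msg)))
        end
        nothing
    end
end

@cheap_debug "value = $(sum(rand(1000)))"   # message string never built while disabled
DEBUG_ENABLED[] = true
@cheap_debug "value = $(sum(rand(1000)))"   # now evaluated and printed
```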
I agree. However, I noticed that on brand-new master it's down to:
I tried to get the former time with the new capabilities, and then I got lower (because of the additional new "infer" option):
I'm not saying we should use these new options; even if they would work better, I think we could get closer to those minimal times with rearchitecting.
JSON.jl's startup delay isn't too bad (nor is JSON2's or JSON3's, which load almost as fast), but I believe all the complexity of those packages relates to JSON parsing, while writing JSON is rather trivial (so we could do without a package and still write JSON?). Do we even need JSON logging? I see https://github.com/JuliaLang/TOML.jl (now a standard library) loads faster. I prefer to read TOML, and there are converters, including an online one: https://pseitz.github.io/toml-to-json-online-converter/
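To back up the "writing JSON is rather trivial" point, here's a sketch of serializing a flat, string-valued log record by hand, with no JSON.jl dependency (illustrative only; Memento's existing JSON output goes through JSON.jl as discussed in this thread):

```julia
# Minimal hand-rolled JSON for flat log records (no package needed).
function escape_json(s::AbstractString)
    io = IOBuffer()
    for c in s
        if c == '"'
            print(io, "\\\"")
        elseif c == '\\'
            print(io, "\\\\")
        elseif c == '\n'
            print(io, "\\n")
        elseif c < ' '
            print(io, "\\u", string(UInt16(c), base=16, pad=4))
        else
            print(io, c)
        end
    end
    return String(take!(io))
end

record_to_json(rec::AbstractDict) =
    "{" * join(("\"$(escape_json(string(k)))\":\"$(escape_json(string(v)))\""
                for (k, v) in rec), ",") * "}"

record_to_json(Dict("level" => "info", "name" => "LibPQ", "msg" => "connected"))
# e.g. {"msg":"connected","level":"info","name":"LibPQ"}  (Dict order may vary)
```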
I'm having difficulty parsing your first sentence, but yes, JSON logging is pretty basic and is a consistent option for various logging services (e.g., Graylog, cloudwatchlogs, sumologic)... which is why it is included. We could choose to have a separate JSONMementoFormatter.jl package, but that seemed a little verbose given that JSON is a minimal dependency and it is such a ubiquitous log format (vs the actual logging services). If you'd like to add a TOML formatter that's fine, but it is currently a less common format for various logging services. NOTE: The use case here is for logging services that can benefit from a structured log record format for indexing and quick searches, not for human readability in a local file.
I looked, and other packages you're likely to use with this, e.g. Genie and LibPQ (where it's an indirect dependency), use JSON.jl anyway, so it's not important to drop. [Assuming, as I believe, you pay the cost of its startup only once.] I wouldn't add TOML unless you can get rid of JSON, which doesn't seem high priority. [Many PostgreSQL users will not use JSON, so some of the functionality, e.g. the JSON part, could in effect be lazily loaded when not used, weakening my above argument.] [I rewrote my other comment for clarity.]