perf: ideas to improve startup time #15945
Comments
Maybe it's possible to get rid of CoreFoundation and CoreServices as well? They're pretty big frameworks that in turn pull in a bunch of additional frameworks like libobjc, ICU, etc. In case anyone's wondering, libiconv is pulled in by the libc crate, and only on Apple platforms.
This is the I had previously tried storing the cache information in a single file, but found that slow. That said, I was reusing some of our existing file system caching code to do this, and that code creates directory structures (so it was create_dir_all-ing all the time). Maybe hashing the specifier, using the hash as the file name, and storing everything in a single flat directory would be faster? Or perhaps something could be done to make the SQLite initialization faster, maybe with a SQLite pragma. I'm not sure.
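To make the flat-directory idea concrete, here is a minimal sketch. It is not Deno's cache code: `cache_path_for` and the `.cache` extension are made-up names, and `DefaultHasher` is only a stand-in, since its output is not guaranteed to stay the same across Rust releases. A real on-disk cache would want a stable hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Maps a module specifier to a file directly inside `cache_dir`.
/// Hypothetical helper: no nested directories are ever created, so a
/// single create_dir_all for `cache_dir` at startup would be enough.
fn cache_path_for(cache_dir: &Path, specifier: &str) -> PathBuf {
    // Assumption: DefaultHasher stands in for a stable hash function.
    let mut hasher = DefaultHasher::new();
    specifier.hash(&mut hasher);
    // 16 zero-padded hex digits, so every file name has the same length.
    let file_name = format!("{:016x}.cache", hasher.finish());
    cache_dir.join(file_name)
}

fn main() {
    let dir = Path::new("deno_cache_sketch");
    let path = cache_path_for(dir, "https://deno.land/std/http/server.ts");
    println!("{}", path.display());
}
```

Whether this beats SQLite would still need measuring; the sketch only removes the per-entry directory creation mentioned above.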
@Divy. I see ~23ms startup time on Linux. On Gitpod, which runs a more modern Linux in some kind of VM, it is around 30ms. I am writing up benchmark stuff at the moment, so I will see if I can find any other improvements we could make on the Linux side.
Re. "Move internal JS off of V8 heap. Replace v8::String::new with v8::String::new_external_onebyte_static": just flagging that this may cause issues if we have any Unicode characters in our internal modules. We should write a test for it if/when we implement this, to be sure.
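A rough idea of what such a test could check, assuming the concern is that one-byte strings hold one character per byte, so multi-byte UTF-8 sequences in a source file would come out mangled. `first_non_ascii` and the module list are hypothetical; a real test would iterate over the actual internal JS sources.

```rust
/// Returns the byte offset and value of the first non-ASCII character,
/// or None if the source is safe to treat as one byte per character.
fn first_non_ascii(source: &str) -> Option<(usize, char)> {
    source.char_indices().find(|(_, c)| !c.is_ascii())
}

fn main() {
    // Stand-ins for internal module sources (assumption for the sketch).
    let modules = [
        ("ok.js", "const a = 1;"),
        ("bad.js", "const s = \"héllo\";"),
    ];
    for (name, source) in modules.iter() {
        match first_non_ascii(source) {
            None => println!("{}: ASCII only", name),
            Some((offset, ch)) => {
                println!("{}: non-ASCII {:?} at byte {}", name, ch, offset)
            }
        }
    }
}
```

Checking for plain ASCII is stricter than strictly necessary, but it is the simplest condition under which UTF-8 bytes and one-byte characters agree.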
On my system (a consumer laptop, Core i5), the minimal JS runtime based on deno-core bootstraps in ~8ms and the latest Deno release in ~24ms, both running in a privileged container as root to reduce system overhead.
The vast majority of system time seems to be GC/heap based, so I would think any reduction in the amount of code loaded or memory consumed at startup would reduce that a little, but not significantly. Are there modules we could lazy load on demand, or does snapshotting preclude that?
I've also attached two flamegraphs, one for Deno and one for runjs, captured at the highest resolution I could (41 kHz). I will take a look in more detail at this and the SQLite stuff over the weekend.
Looks like something slowed down startup time in
Yup. Getting an accurate startup time is not an easy thing to do! I think if we are in the ballpark of 10-15ms for a runtime with as much in the box as Deno, that's really good. In the real world it will be very rare to run a single script with no imports, and in a VM/container scenario there are other tricks you can do to reduce this overhead.
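For anyone who wants to reproduce numbers like these, a small harness along the following lines may help. It is only a sketch: `time_runs` and `median` are made-up helpers, the `deno run hello.js` command is an assumption, and wall-clock timing of a spawned process includes process creation overhead, so absolute values will differ from in-process measurements.

```rust
use std::process::Command;
use std::time::{Duration, Instant};

/// Runs `f` `runs` times and returns the wall-clock samples, sorted.
fn time_runs<F: FnMut()>(runs: usize, mut f: F) -> Vec<Duration> {
    let mut samples = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        f();
        samples.push(start.elapsed());
    }
    samples.sort();
    samples
}

/// Median of a sorted, non-empty slice (upper median for even lengths).
fn median(sorted: &[Duration]) -> Duration {
    sorted[sorted.len() / 2]
}

fn main() {
    // Assumption: `deno` is on PATH and hello.js exists in the cwd.
    if Command::new("deno").args(["run", "hello.js"]).output().is_err() {
        eprintln!("could not spawn deno; nothing measured");
        return;
    }
    let samples = time_runs(20, || {
        let _ = Command::new("deno").args(["run", "hello.js"]).output();
    });
    println!("min: {:?}, median: {:?}", samples[0], median(&samples));
}
```

Reporting the minimum and median rather than the mean keeps a single slow run (cold file cache, scheduler noise) from dominating the result.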
There's probably a bigger win in looking at optimizations around SQLite and any edges we can shave off in the JS module caching. We should do a benchmark with a reasonably large, complex codebase with lots of imports. I'll see if I can set something up. If anyone has suggestions for a good project to use, let me know.
Closing since most of these optimizations were applied and we are now bottlenecked by snapshot deserialization and bootstrap JS. More work to do on that front before startup time can be optimal.
Results (after applying the optimizations below)
1.5x improvement
- runjs is a barebones rusty_v8 CLI for baseline.
- target/release/deno is Deno from https://github.com/littledivy/deno/tree/startup_time
- deno is Deno 1.25.3
- node is Node.js 18.8.0
Potential optimizations
Deno main:
- clap::App::clone. (minor) perf(cli): avoid clap::App::clone #15951
- opt-level = 3. (major) perf(cli): use -O3 instead of -Oz #15952
- Drop handler for CliMainWorker. drop_in_place::<CliMainWorker>. (major) perf(cli): early exit in run_command #15953
- Replace v8::String::new with v8::String::new_external_onebyte_static. (minor)
- deno_fetch::create_http_client. (major)
- LZ4_decompress_safe (major)
- canonicalize_path if config file does not exist #15957
- initialize_ops (minor) perf(core): use single ObjectTemplate for ops in initialize_ops #15959
macOS:
- dyld (major)
Dependency on CoreGraphics & Metal slows down dyld. These come from webgpu; ideally we should lazy load them.
Without webgpu:
With webgpu:
Additionally we depend on Security.framework for hyper-tls integration. This could also be lazy loaded in some way.
After applying all of that:
There may be some optimizations that can be made in deno_graph::create_graph for the main module.
Rest of the time is spent in:
- MainWorker::execute_script
- v8::Context::FromSnapshot
- v8::Isolate::InitWithSnapshot
See https://github.com/littledivy/deno/tree/startup_time
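To illustrate the Drop handler item from the list above: when the process is about to exit anyway, running a large destructor is wasted work. The sketch below uses a stand-in `Worker` type (not CliMainWorker) and `std::mem::forget` so the effect can be observed in-process. `std::process::exit` likewise runs no destructors, which is presumably what an early exit in run_command relies on, though that is an assumption about the linked PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times the destructor actually ran.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for a worker that owns a lot of state (hypothetical type).
struct Worker {
    _heap: Vec<u8>,
}

impl Drop for Worker {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Normal return: the worker is dropped when it goes out of scope.
fn run_with_teardown() -> i32 {
    let _worker = Worker { _heap: vec![0; 1024] };
    0
}

/// Skips the destructor; the OS reclaims the memory at process exit.
fn run_without_teardown() -> i32 {
    let worker = Worker { _heap: vec![0; 1024] };
    std::mem::forget(worker);
    0
}

fn main() {
    run_with_teardown();
    println!("drops after normal return: {}", DROPS.load(Ordering::SeqCst));
    let code = run_without_teardown();
    println!("drops after skipped teardown: {}", DROPS.load(Ordering::SeqCst));
    // Exiting here also skips destructors of anything still alive.
    std::process::exit(code);
}
```

The trade-off is that any cleanup with side effects (flushing buffers, removing temp files) has to happen explicitly before the exit.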