-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bootstrap: include bootstrapped Environment in builtin snapshot #32984
Conversation
This comment has been minimized.
This comment has been minimized.
I removed https://chromium-review.googlesource.com/c/v8/v8/+/2143983 in #32761 and this time the performance difference between the two prototype disappeared - I guess the performance difference should come with rehashing, then (which is the price we will have to pay either way, for security reasons) - because once it's rehashable, everything (not just the additional maps/sets) gets rehashed.. So the performance of the two approaches should be comparable.
|
c620a7c
to
1baff44
Compare
1baff44
to
0889ce8
Compare
0889ce8
to
872b502
Compare
872b502
to
dd350b5
Compare
https://chromium-review.googlesource.com/c/v8/v8/+/2143983 landed in the upstream, so this should work now. I'll open a separate PR for the V8 cherry picks. @addaleax I removed a bunch of stuff not essential for the builtin snapshot (in particular, there's no need to take care of the BaseObject or any other embedder fields other than the Env pointer in Context for now), and split the snapshotting to smaller steps so that they are easy to revert / change. I think this is the minimal code we need to get the builtin snapshot shipped. With {
// -- native_modules begins --
{
"buffer",
"events",
"internal/assert",
"internal/async_hooks",
"internal/bootstrap/loaders",
"internal/bootstrap/node",
"internal/bootstrap/switches/does_own_process_state",
"internal/bootstrap/switches/is_main_thread",
"internal/buffer",
"internal/console/constructor",
"internal/console/global",
"internal/constants",
"internal/encoding",
"internal/errors",
"internal/fixed_queue",
"internal/linkedlist",
"internal/priority_queue",
"internal/process/execution",
"internal/process/per_thread",
"internal/process/promises",
"internal/process/signal",
"internal/process/task_queues",
"internal/process/warning",
"internal/querystring",
"internal/timers",
"internal/url",
"internal/util",
"internal/util/debuglog",
"internal/util/inspect",
"internal/util/inspector",
"internal/util/types",
"internal/validators",
"path",
"timers",
},
// -- native_modules ends --
// -- async_hooks begins --
{
0, // async_ids_stack
1, // fields
2, // async_id_fields
3, // execution_async_resources
},
// -- async_hooks ends --
{ 5 }, // tick_info
{ 4 }, // immediate_info
// -- performance_state begins --
{
6, // root
7, // milestones
8, // observers
},
// -- performance_state ends --
9, // stream_base_state
10, // should_abort_on_uncaught_toggle
// -- persistent_templates begins --
{
{ "async_wrap_ctor_template", 0, 283 },
{ "async_wrap_object_ctor_template", 1, 284 },
{ "binding_data_ctor_template", 2, 285 },
{ "i18n_converter_template", 14, 286 },
{ "promise_wrap_template", 18, 287 },
},
// persistent_templates ends --
// -- persistent_values begins --
{
{ "async_hooks_after_function", 0, 11 },
{ "async_hooks_before_function", 1, 12 },
{ "async_hooks_binding", 2, 13 },
{ "async_hooks_destroy_function", 3, 14 },
{ "async_hooks_init_function", 4, 15 },
{ "async_hooks_promise_resolve_function", 5, 16 },
{ "buffer_prototype_object", 6, 17 },
{ "enhance_fatal_stack_after_inspector", 10, 18 },
{ "enhance_fatal_stack_before_inspector", 11, 19 },
{ "internal_binding_loader", 26, 20 },
{ "immediate_callback_function", 27, 21 },
{ "inspector_console_extension_installer", 28, 22 },
{ "native_module_require", 30, 23 },
{ "prepare_stack_trace_callback", 33, 24 },
{ "process_object", 34, 25 },
{ "primordials", 35, 26 },
{ "promise_reject_callback", 36, 27 },
{ "tick_callback_function", 39, 28 },
{ "timers_callback_function", 40, 29 },
{ "trace_category_state_function", 42, 30 },
{ "url_constructor_function", 44, 31 },
},
// -- persistent_values ends --
32, // context
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me make my objections here explicit:
- This PR does not extend well to userland snapshots. I think I’ve made it clear that that’s the big issue here for me, and I don’t see how this PR has changed in any way to address that.
- The added value of the intermediate representation objects is unclear, and they seem like they add a bunch of unnecessary complexity.
- I think the need to explicitly keep a list of all external reference entries is a code smell that we should get rid off.
Can you elaborate on specifically where this makes user land snapshot unfeasible? This does what's necessary for builtin snapshots, and leave the user land snapshot as a TODO - the embdder field serialization can be added later when we do have embdder fields in the snapshot, and code can be added to convert the immediate representation between binary blobs instead of code snippets, I don't think any choices done here would make the user snapshot unfeasible. I can start a separate branch for a proof-of-concept user land snapshot based on this, but at the mean time, I would like to see things land one step at at time - the snapshot effort had been very gradual, and I had been trying to focus on what's essential for the time being in my commits without adding too much machinery that may or may not get in the way instead of helping for the next step.
For me they help clarifying "the schema in which these objects are serialized" and additionally, you can compare what's in the intermediate representation and what's in the actual object to find out what's not going to be serialized. They also provide a side-effect-free deserialization phase before we could unfold them into live objects. I see the snapshot as part of the program configuration, and IMO the deserialization of the configuration into meaningful constructs should be better kept side-effect-free and state-free, and separate from the steps that you actually use them to configure the system. I think the complexities worth it in this case, and they are not too much.
Are you talking about the list of bindings, or the list of pointer registrations in each binding? |
I don’t think it makes userland snapshotting unfeasible, and I didn’t claim that that is the case. My point is that the static serialization and deserialization here will require a non-trivial amount of extra work for implementing userland snapshotting, which is a drawback that #32761 does not have.
Okay, I guess I’m okay with simply disagreeing here. This isn’t the main objection that I’m having anyway.
The former of the two. I don’t see any issues with the way the individual pointers are registered. |
Ah, sorry for the misunderstanding
What do you mean by "static serialization and deserialization"? Emitting C++ snippets? That's not intended to extend to user land snapshotting, it's just something that user land snapshotting can build upon (while loosing "what you see is what you get" of the snapshot data since we wouldn't know what we'd see in user snapshots), but even it does (just to provide another option of bundling snapshots inside an executable) can you elaborate on what makes you think this would make the extra work non-trivial? (e.g. is there a specific use case?) I have given some thought into what a user land snapshot prototype would look like and so far I don't see where this would be significant, as far as I can tell the possible schema of the snapshot should be pretty deterministic and limited.
I guess I will just have to disagree here, I find it less easy to follow if the external reference registration works significantly different from how |
BTW this will still be blocked while I working on a reland of https://chromium-review.googlesource.com/c/v8/v8/+/2192657 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am disheartened by how everything went here, including the review process on #32761, the way in which this resulted in competition and duplicate effort rather than collaboration, and, to be honest, the fact that the TSC as a whole doesn’t even care enough to weigh in more than an absolutely minimal way.
@joyeecheung, I’m not sure if you feel the same way, but I think this has been frustrating enough for both of us, and I don’t really want to add any more frustration to that, not for you and certainly not for myself. I’ll close #32761 and clear the way for this PR so that we can hopefully move on in some way, even if this PR doesn’t make the tradeoffs that make more sense to me.
Before this moves forward I'd like to have a tsc discussion about the two prs. I've put it on the next @nodejs/tsc agenda |
This don't land cleanly on |
So that it's possible to create an Environment not yet attached to any V8 context. We'll use this to deserialize the V8 context before attaching an Environment to it. PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
Add an ExternalReferenceRegistry class for registering static external references. To register the external JS to C++ references created in a binding (e.g. when a FunctionTemplate is created): - Add the binding name (same as the id used for `internalBinding()` and `NODE_MODULE_CONTEXT_AWARE_INTERNAL`) to `EXTERNAL_REFERENCE_BINDING_LIST` in `src/node_external_reference.h`. - In the file where the binding is implemented, create a registration function to register the static C++ references (e.g. the C++ functions in `v8::FunctionCallback` associated with the function templates), like this: ```c++ void RegisterExternalReferences( ExternalReferenceRegistry* registry) { registry->Register(cpp_func_1); } ``` - At the end of the file where `NODE_MODULE_CONTEXT_AWARE_INTERNAL` is also usually called, register the registration function with ``` NODE_MODULE_EXTERNAL_REFERENCE(binding_name, RegisterExternalReferences); ``` PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
Pass the flags down to node_mksnapshot so that we can use them when generating the snapshot (e.g. to debug or enable V8 flags) PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
This includes the initial Environment (without running bootstrap scripts) into the builtin snapshot PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
Keep track of snapshotted modules in JS land, and move bootstrap switches into StartExecution() so that they are not included into part of the environment-independent bootstrap process. PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
This runs `lib/internal/bootstrap/loaders.js` before creating the builtin snapshot and deserialize the loaders from the snapshot in deserialization mode. PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
The connection between the JS land zero fill toggle and the C++ one in the NodeArrayBufferAllocator gets lost if the toggle is deserialized from the snapshot, because V8 owns the underlying memory of this toggle. This resets the connection at pre-execution. PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
Fast APIs need to work with ArrayBuffers which we need to rebuild connections to after deserializing them from the snapshot. For now, postpone their creation until pre-execution to simplify the process. PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
Since V8 snapshot does not currently support instance member initialization, initialize them in ordianry class constructors for now so that these classes can be included in the snapshot. This may be reverted once https://bugs.chromium.org/p/v8/issues/detail?id=10704 is fixed and backported. PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
This runs `lib/internal/bootstrap/node.js` before creating the builtin snapshot and deserialize the loaders from the snapshot in deserialization mode. PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
Since V8 snapshot does not currently support instance member initialization, initialize them in ordianry class constructors for now so that these classes can be included in the snapshot. This may be reverted once https://bugs.chromium.org/p/v8/issues/detail?id=10704 is fixed and backported. PR-URL: #32984 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
Quick ping about backporting this. It looks like one of the commits made it onto 14.x, 3024927, which we may want to revert if we don't backport the entier thing |
Shouldn’t be necessary as there is no conflict there, only missing files |
@joyeecheung should this go to v |
The node/lib/internal/bootstrap/node.js Line 335 in d548f4d
|
This was the WIP I had been working on to give my updates in #17058 and promised to post in #26382 (comment) back in last December - I was not planning to open this until I get the issues mentioned in #17058 (comment) fixed in V8, but then #32761 came along with a significantly different design. So I rebased this against master and expanded the snapshot to include
bootstrap/node.js
for comparison, and post it here to demonstrate the alternative design decisions I suggested in #32761.Similar to the MVP approach taken by #27321, this now contains only the essential code for getting builtin snapshot shipped in Node.js core - I tried my best to split the snapshot integration into smaller commits that do this step by step. User land snapshot integration would need more yaks shaved to move forward due to the natural constraints of the snapshots, so I left a few comments in the places where code need to be expanded when we do implement user land snapshot integration. We'll need to give more thoughts take care of the uncertainty of the object graph brought by user land snapshotting - in particular, how to properly reset the connection between JS and C++ states, as well as the C++ states and the external system states.
This also includes some new ideas I got after the discussions in #32761.
I've recorded the background and analysis of different options for these design decisions in https://docs.google.com/document/d/15bu038I36oILq5t4Qju1sS2nKudVB6NSGWz00oD48Q8/edit?usp=sharing some of them are implemented here, some in #32761. I think it might be better to keep design discussions in the doc instead of in PRs - so that we focus more on the pros and cons of these options from a higher level. A lot of the decisions are influenced by the previous discussions and prototypes in #17058
cc @addaleax (thanks for the patience for going over the design options with me and the inspiration for some of the ideas here!)
src: split the main context initialization from Environemnt ctor
So that it's possible to create an Environment not yet attached
to any V8 context. We'll use this to deserialize the V8 context
before attaching an Environment to it.
src: add an ExternalReferenceRegistry class
Add an ExternalReferenceRegistry class for registering static
external references.
To register the external JS to C++ references created in a binding
(e.g. when a FunctionTemplate is created):
Add the binding name (same as the id used for
internalBinding()
and
NODE_MODULE_CONTEXT_AWARE_INTERNAL
) toEXTERNAL_REFERENCE_BINDING_LIST
insrc/node_external_reference.h
.In the file where the binding is implemented, create a registration
function to register the static C++ references (e.g. the C++
functions in
v8::FunctionCallback
associated with the functiontemplates), like this:
At the end of the file where
NODE_MODULE_CONTEXT_AWARE_INTERNAL
isalso usually called, register the registration function with
tools: enable Node.js command line flags in node_mksnapshot
Pass the flags down to node_mksnapshot so that we can use them
when generating the snapshot (e.g. to debug or enable V8 flags)
src: snapshot Environment upon instantiation
This includes the initial Environment (without running bootstrap
scripts) into the builtin snapshot
src: make code cache test work with snapshots
Keep track of snapshotted modules in JS land, and move
bootstrap switches into StartExecution() so that
they are not included into part of the environment-independent
bootstrap process.
src: snapshot loaders
This runs
lib/internal/bootstrap/loaders.js
before creatingthe builtin snapshot and deserialize the loaders from the
snapshot in deserialization mode.
src: reset zero fill toggle at pre-execution
The connection between the JS land zero fill toggle and the
C++ one in the NodeArrayBufferAllocator gets lost if the toggle
is deserialized from the snapshot, because V8 owns the underlying
memory of this toggle. This resets the connection at pre-execution.
bootstrap: build fast APIs in pre-execution
Fast APIs need to work with ArrayBuffers which we need
to rebuild connections to after deserializing them
from the snapshot. For now, postpone their creation
until pre-execution to simplify the process.
lib: initialize instance members in class constructors
Since V8 snapshot does not currently support instance member
initialization, initialize them in ordianry class constructors
for now so that these classes can be included in the snapshot.
This may be reverted once
https://bugs.chromium.org/p/v8/issues/detail?id=10704
is fixed and backported.
src: snapshot node
This runs
lib/internal/bootstrap/node.js
before creatingthe builtin snapshot and deserialize the loaders from the
snapshot in deserialization mode.
make -j4 test
(UNIX), orvcbuild test
(Windows) passes