-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bootstrap: make snapshot reproducible #50983
Conversation
Review requested:
|
great work! keep it up! |
6df2ede
to
8dc86b6
Compare
c92d201
to
2d911be
Compare
2d911be
to
e0b6197
Compare
Somehow this is reproducible locally (macOS + x64 & arm64), and in the CI, on FreeBSD and SmartOS, but not on others. There are still 3-4 words of differences in the snapshot on those platforms. I'll investigate. |
e0b6197
to
9c8ab5e
Compare
57b8d25
to
f03a7ba
Compare
Split the context data callback part to #53217 - at least that makes the context data slots serialized with some non-gibberish (empty values), otherwise V8 might run into issues serializing the gibberish pointer values. |
Is this PR still useful after #53217 being splitted? |
Yes, in this PR are still some utilities for checking reproducibility and a test for that (but it's not currently passing because a few other bytes are still not reproducible, I will work on those bytes here). |
Some discoveries:
After fixing the two on Linux I am getting reproducible snapshots. Will update the branch after I clean it up.. |
For pointer values in the context data, we need to return non-empty data in the serializer so that V8 does not serialize them verbatim, making the snapshot unreproducible.
To make the snapshots reproducible, this patch updates the size of a few types and adds some static assertions to ensure that there are no padding in the memcpy-ed structs.
- Print offsets in blob serializer - Add a special node:generate_default_snapshot ID to generate the built-in snapshot. - Improve logging - Add a test to check the reproducibilty of the snapshot
For pointer values in the context data, we need to return non-empty data in the serializer so that V8 does not serialize them verbatim, making the snapshot unreproducible. PR-URL: #50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
Landed in 922feb1...1872167 |
To make the snapshots reproducible, this patch updates the size of a few types and adds some static assertions to ensure that there are no padding in the memcpy-ed structs. PR-URL: #50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
- Print offsets in blob serializer - Add a special node:generate_default_snapshot ID to generate the built-in snapshot. - Improve logging - Add a test to check the reproducibilty of the snapshot PR-URL: #50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
written_total += w.Write<EnvSerializeInfo>(env_info); | ||
w.Debug("Write code_cache\n"); | ||
w.Debug("0x%x: Write CodeCacheInfo\n", w.sink.size()); | ||
written_total += w.WriteVector<builtins::CodeCacheInfo>(code_cache); | ||
w.Debug("SnapshotData::ToBlob() Wrote %d bytes\n", written_total); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth check that DCHECK_EQ(written_total, w.sink.size())
.
For pointer values in the context data, we need to return non-empty data in the serializer so that V8 does not serialize them verbatim, making the snapshot unreproducible. PR-URL: #50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
To make the snapshots reproducible, this patch updates the size of a few types and adds some static assertions to ensure that there are no padding in the memcpy-ed structs. PR-URL: #50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
- Print offsets in blob serializer - Add a special node:generate_default_snapshot ID to generate the built-in snapshot. - Improve logging - Add a test to check the reproducibilty of the snapshot PR-URL: #50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
For pointer values in the context data, we need to return non-empty data in the serializer so that V8 does not serialize them verbatim, making the snapshot unreproducible. PR-URL: nodejs#50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
To make the snapshots reproducible, this patch updates the size of a few types and adds some static assertions to ensure that there are no padding in the memcpy-ed structs. PR-URL: nodejs#50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
- Print offsets in blob serializer - Add a special node:generate_default_snapshot ID to generate the built-in snapshot. - Improve logging - Add a test to check the reproducibilty of the snapshot PR-URL: nodejs#50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
For pointer values in the context data, we need to return non-empty data in the serializer so that V8 does not serialize them verbatim, making the snapshot unreproducible. PR-URL: nodejs#50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
To make the snapshots reproducible, this patch updates the size of a few types and adds some static assertions to ensure that there are no padding in the memcpy-ed structs. PR-URL: nodejs#50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
- Print offsets in blob serializer - Add a special node:generate_default_snapshot ID to generate the built-in snapshot. - Improve logging - Add a test to check the reproducibilty of the snapshot PR-URL: nodejs#50983 Refs: nodejs/build#3043 Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
To prevent padding from making the snapshot unreproducible, zero-initialize the data that are copied into the snapshot so that the padding copied are all zeros. This is better than enlarging the enums to align the fields since it doesn't make the snapshot bigger than necessary, and it removes the need of using static assertions to ensure alignment. PR-URL: #53563 Refs: #50983 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Tobias Nießen <tniessen@tnie.de> Reviewed-By: Richard Lau <rlau@redhat.com> Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
To prevent padding from making the snapshot unreproducible, zero-initialize the data that are copied into the snapshot so that the padding copied are all zeros. This is better than enlarging the enums to align the fields since it doesn't make the snapshot bigger than necessary, and it removes the need of using static assertions to ensure alignment. PR-URL: #53563 Refs: #50983 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Tobias Nießen <tniessen@tnie.de> Reviewed-By: Richard Lau <rlau@redhat.com> Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
To prevent padding from making the snapshot unreproducible, zero-initialize the data that are copied into the snapshot so that the padding copied are all zeros. This is better than enlarging the enums to align the fields since it doesn't make the snapshot bigger than necessary, and it removes the need of using static assertions to ensure alignment. PR-URL: #53563 Refs: #50983 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Tobias Nießen <tniessen@tnie.de> Reviewed-By: Richard Lau <rlau@redhat.com> Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
To prevent padding from making the snapshot unreproducible, zero-initialize the data that are copied into the snapshot so that the padding copied are all zeros. This is better than enlarging the enums to align the fields since it doesn't make the snapshot bigger than necessary, and it removes the need of using static assertions to ensure alignment. PR-URL: #53563 Refs: #50983 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Tobias Nießen <tniessen@tnie.de> Reviewed-By: Richard Lau <rlau@redhat.com> Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
To prevent padding from making the snapshot unreproducible, zero-initialize the data that are copied into the snapshot so that the padding copied are all zeros. This is better than enlarging the enums to align the fields since it doesn't make the snapshot bigger than necessary, and it removes the need of using static assertions to ensure alignment. PR-URL: #53563 Refs: #50983 Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Tobias Nießen <tniessen@tnie.de> Reviewed-By: Richard Lau <rlau@redhat.com> Reviewed-By: Daniel Lemire <daniel@lemire.me> Reviewed-By: James M Snell <jasnell@gmail.com>
src: return non-empty data in context data serializer
For pointer values in the context data, we need to return
non-empty data in the serializer so that V8 does not
serialize them verbatim, making the snapshot unreproducible.
src: make sure that memcpy-ed structs in snapshot have no padding
To make the snapshots reproducible, this patch updates the size
of a few types and adds some static assertions to ensure that
there are no padding in the memcpy-ed structs.
src: add utilities to help debugging reproducibility of snapshots
the built-in snapshot.
Refs: nodejs/build#3043