Protect crate metadata from corruption via SHA-256 hash #87896

Closed
wants to merge 2 commits into from


Aaron1011
Member

We now compute a SHA-256 of the raw (encoded) crate metadata,
and append it to the final crate metadata that we store on disk.
When we load the metadata, we recompute the hash from the metadata
blob and verify that it matches the hash stored at the end of the
blob.

This allows us to detect on-disk corruption of the metadata file,
which might otherwise cause a build failure much later in compilation.

If anyone is manually editing crate metadata on-disk,
they will need to re-compute and modify the hash at
the end of the blob.

This will allow us to determine whether or not crate metadata
corruption is causing some of the unusual incr-comp failures
we've been seeing. The incremental compilation data itself
will be hashed in a follow-up PR.
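
The append-and-verify scheme described in this comment could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names `seal` and `verify`, the constant `DIGEST_LEN`, and the exact trailer layout are all hypothetical, and a 64-bit FNV-1a hash stands in for SHA-256 so that the sketch needs no external crates.

```rust
// Hypothetical sketch: names and layout are illustrative, not rustc's actual code.

/// Width of the trailing digest in bytes.
/// SHA-256 would need 32; the FNV-1a stand-in below produces 8.
const DIGEST_LEN: usize = 8;

/// Stand-in digest (64-bit FNV-1a). The PR uses SHA-256 at this point instead.
fn digest(bytes: &[u8]) -> [u8; DIGEST_LEN] {
    let mut h: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3); // FNV prime
    }
    h.to_le_bytes()
}

/// Append the digest of the encoded metadata to the blob that goes to disk.
fn seal(mut encoded: Vec<u8>) -> Vec<u8> {
    let d = digest(&encoded);
    encoded.extend_from_slice(&d);
    encoded
}

/// Split off the trailing digest, recompute it over the payload, and compare.
/// Returns the payload on success, or `None` if the blob is truncated or corrupt.
fn verify(blob: &[u8]) -> Option<&[u8]> {
    if blob.len() < DIGEST_LEN {
        return None;
    }
    let (payload, stored) = blob.split_at(blob.len() - DIGEST_LEN);
    if &digest(payload)[..] == stored {
        Some(payload)
    } else {
        None
    }
}
```

Under these assumptions, `verify(&seal(bytes))` hands back the original bytes, while a blob with any flipped payload bit is rejected at load time rather than surfacing as a failure later in compilation.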

@rust-highfive
Collaborator

r? @cjgillot

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 9, 2021
@Aaron1011
Member Author

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Aug 9, 2021
@bors
Contributor

bors commented Aug 9, 2021

⌛ Trying commit 297e23fc026279541fea4940689d2d4c041b215a with merge af0cbac7c33f68cd7cc4b0c0d627193cdb60b5f8...

@bors
Contributor

bors commented Aug 9, 2021

☀️ Try build successful - checks-actions
Build commit: af0cbac7c33f68cd7cc4b0c0d627193cdb60b5f8

@rust-timer
Collaborator

Queued af0cbac7c33f68cd7cc4b0c0d627193cdb60b5f8 with parent ae90dcf, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking try commit (af0cbac7c33f68cd7cc4b0c0d627193cdb60b5f8): comparison url.

Summary: This change led to significant regressions 😿 in compiler performance.

  • Very large regression in instruction counts (up to 264.2% on incr-unchanged builds of helloworld-check)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Aug 10, 2021
@cjgillot
Contributor

I am not really convinced that we need to do this, or in which cases this feature would be useful.
Should there be a way to activate this check, with a -Z option for instance?
Is the perf regression due to the initial hash computation or to the hash checking?
Can it be mitigated somehow?

@inquisitivecrystal inquisitivecrystal added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Aug 24, 2021
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 13, 2021
@bors
Contributor

bors commented Sep 18, 2021

☔ The latest upstream changes (presumably #82183) made this pull request unmergeable. Please resolve the merge conflicts.

@cjgillot cjgillot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 19, 2021
@bjorn3
Member

bjorn3 commented Sep 19, 2021

Maybe use a faster, non-cryptographic hash, or maybe even just a CRC or another kind of checksum?

> Is the perf regression due to the initial hash computation or to the hash checking?

Hash checking mostly. Probably because previously the crate metadata was lazily loaded. Now it has to be eagerly loaded for the hash check.
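
As an illustration of the cheaper-checksum suggestion, a CRC-32 can be computed in a few lines. The thread does not say which checksum would be chosen, so this is only a sketch under that assumption: the function name `crc32` is hypothetical, and the bitwise loop is the simplest form of the standard IEEE 802.3 CRC-32 rather than the table-driven or hardware-accelerated variant a compiler would more likely use.

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
/// Hypothetical helper for illustration; not taken from the PR.
fn crc32(bytes: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF; // standard initial value
    for &b in bytes {
        crc ^= b as u32;
        for _ in 0..8 {
            // `mask` is all ones when the low bit is set, all zeros otherwise,
            // so the polynomial is folded in only for set bits.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc // standard final inversion
}
```

For the conventional check input `b"123456789"` this returns `0xCBF43926`. A cheaper checksum lowers the per-byte cost, but verifying it still means reading the whole blob, so on its own it would probably not remove the eager-loading cost described in this comment.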

@michaelwoerister
Member

https://github.com/BLAKE3-team/BLAKE3 might be interesting for something like this.
Also sccache has some interesting notes on performance: https://github.com/mozilla/sccache/blob/3f318a8675e4c3de4f5e8ab2d086189f2ae5f5cf/src/util.rs#L70-L83

@camelid camelid added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 8, 2021
@JohnCSimon JohnCSimon removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Oct 24, 2021
@JohnCSimon JohnCSimon added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Oct 24, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 13, 2021
@camelid camelid added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 10, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 30, 2022
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 27, 2022
@JohnCSimon
Member

JohnCSimon commented Apr 11, 2022

Triage:
@Aaron1011 what's the status of this PR? It's sat idle for months with merge conflicts.

@JohnCSimon
Member

@Aaron1011
Ping from triage: I'm closing this due to inactivity. Please reopen when you are ready to continue with this.
Note: if you do, please reopen the PR BEFORE you push to it, or else you won't be able to reopen it. This is a quirk of GitHub.
Thanks for your contribution.

@rustbot label: +S-inactive

@JohnCSimon JohnCSimon closed this Nov 27, 2022
@rustbot rustbot added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Nov 27, 2022
Labels
perf-regression Performance regression. S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.