rustc: Introduce Strict Version Hashes #12533

Merged 3 commits on Feb 28, 2014


alexcrichton
Member

These hashes are used to detect changes to upstream crates and generate errors which mention that crates possibly need recompilation.

More details can be found in the respective commit messages below. This change is also accompanied with a much needed refactoring of some of the crate loading code to focus more on crate ids instead of name/version pairs.

Closes #12601


let hash = hash::sip::hash(krate);
Svh {
hash: str::from_utf8_owned(~[
Member

std::iter::range_step(0, 64, 4).map(|i| hex(hash >> i) as char).collect()

Member Author

oooh clever
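
For readers on a current toolchain, the suggestion above can be sketched roughly as follows. This is only an illustration, not the code that landed in the PR: `range_step` no longer exists, so the sketch uses `step_by`, and the helper names `hex_digit` and `hash_to_hex` are made up here.

```rust
/// Map the low 4 bits of `value` to a lowercase hex digit.
/// (Hypothetical helper standing in for the `hex` function in the suggestion.)
fn hex_digit(value: u64) -> char {
    // Masking keeps the value in 0..=15, so `from_digit` cannot fail.
    char::from_digit((value & 0xf) as u32, 16).expect("nibble is in 0..=15")
}

/// Encode a 64-bit hash as 16 hex characters, one per 4-bit step.
/// Stepping the shift upward from 0 emits the least significant nibble first.
fn hash_to_hex(hash: u64) -> String {
    (0u32..64)
        .step_by(4)
        .map(|shift| hex_digit(hash >> shift))
        .collect()
}
```

Note that the digits come out in least-significant-first order, which is fine for an opaque identifier but differs from what `format!("{:016x}", hash)` would print.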

@metajack
Contributor

Related #10207

@alexcrichton
Member Author

I found two sources of non-determinism in our AST which made the stage1-generated and stage2-generated libsyntax libraries have different hashes. If these two libraries are different, all of the syntax extension tests start failing. This is obviously a hazard, but it wasn't too bad to fix.

  1. Command line --cfg flags are located in the CrateConfig inside of a Crate. This meant that different --cfg flags in different orderings or amounts would generate different hashes. I chose to manually alter how hashing the crate works inside of Svh::calculate in order to fix this problem. Sadly, this still has the side effect that --cfg foo does not hash the same as --cfg bar --cfg foo. The latter case creates one more identifier, meaning that all ident symbols are off by one. Thankfully our build process always passes the same number of --cfg flags, so we should be good for now...
  2. format!("{a}{b}", a=foo(), b=bar()) was expanded in a fashion where the order of execution of a and b was nondeterministic. This was simply because it was generated by iterating a hashmap, so the last commit now moves to using a vector directly to get a deterministic ordering.
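
The off-by-one effect described in item 1 can be illustrated with a toy interner. This is a sketch under assumptions, not the compiler's interner: `Interner` and `symbols_for` are hypothetical names, and the model only assumes that each new string receives the next sequential index and that --cfg flags are interned before the crate's own identifiers.

```rust
use std::collections::HashMap;

/// Toy string interner: each new string gets the next sequential index.
/// (Hypothetical stand-in for the compiler's ident interner.)
struct Interner {
    ids: HashMap<String, u32>,
}

impl Interner {
    fn new() -> Self {
        Interner { ids: HashMap::new() }
    }

    /// Return the existing index for `s`, or assign the next free one.
    fn intern(&mut self, s: &str) -> u32 {
        let next = self.ids.len() as u32;
        *self.ids.entry(s.to_string()).or_insert(next)
    }
}

/// Intern the --cfg flags first, then the crate's identifiers, and
/// return the indices that the identifiers ended up with.
fn symbols_for(cfg_flags: &[&str], idents: &[&str]) -> Vec<u32> {
    let mut interner = Interner::new();
    for flag in cfg_flags {
        interner.intern(flag);
    }
    idents.iter().map(|ident| interner.intern(ident)).collect()
}
```

With one flag, `symbols_for(&["foo"], &["main", "x"])` yields indices 1 and 2. With two flags, `symbols_for(&["bar", "foo"], &["main", "x"])` yields 2 and 3, so any hash that includes those indices changes even though the source text is identical.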

After implementing these two fixes, I'm seeing stable version hashes between stage1/stage2. I believe this is still quite brittle, because things like the env! syntax extension will affect compilation.

For now though, the tests seem to be passing. At least it helped uncover one real bug!

The previous code passed around a {name,version} pair everywhere, but this is
better expressed as a CrateId. This patch changes these paths to store and pass
around crate ids instead of these pairs of name/version. This also prepares the
code to change the type of hash that is stored in crates.

This new SVH is used to uniquely identify all crates as a snapshot in time of
their ABI/API/publicly reachable state. The current calculation is just a hash
of the entire crate's AST. This is obviously incorrect, but it is the reality
for today.

This change threads through the new Svh structure which originates from crate
dependencies. The concept of crate id hash is preserved to provide efficient
matching on filenames for crate loading. The inspected hash once crate metadata
is opened has been changed to use the new Svh.

The goal of this hash is to identify when upstream crates have changed but
downstream crates have not been recompiled. This will prevent the def-id drift
problem where upstream crates were recompiled, thereby changing their metadata,
but downstream crates were not recompiled.
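
As a rough illustration of the check described above, the following sketch records an upstream hash at compile time and compares it when the crate is loaded again. Everything here is an assumption made for illustration: the `Dependency` type, the `check_dependency` function, the error wording, and the use of the standard library's `DefaultHasher` in place of the compiler's SipHash-over-AST calculation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simplified stand-in for the strict version hash.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Svh(String);

impl Svh {
    /// Hash the whole crate representation (here: anything that is `Hash`).
    fn calculate<T: Hash>(krate: &T) -> Svh {
        let mut hasher = DefaultHasher::new();
        krate.hash(&mut hasher);
        Svh(format!("{:016x}", hasher.finish()))
    }
}

/// What a downstream crate remembers about an upstream crate (hypothetical).
struct Dependency {
    name: String,
    expected: Svh,
}

/// Compare the hash recorded at compile time with the hash found on disk.
fn check_dependency(dep: &Dependency, found: &Svh) -> Result<(), String> {
    if &dep.expected == found {
        Ok(())
    } else {
        Err(format!(
            "crate `{}` has changed since its dependents were built; \
             they possibly need recompilation",
            dep.name
        ))
    }
}
```

In this model, a downstream crate built against one version of an upstream crate gets an `Err` as soon as the upstream contents hash differently, rather than silently using stale definition ids.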

In the future this hash can be expanded to exclude contents of the AST like doc
comments, but limitations in the compiler prevent this change from being made at
this time.

Closes rust-lang#10207

Previously, format!("{a}{b}", a=foo(), b=bar()) had foo() and bar() run in a
nondeterministic order. This is clearly an undesirable property, so this commit
uses iteration over a list instead of iteration over a hash map to provide
deterministic code generation of these format arguments.
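
The idea can be sketched with a toy expansion step. This is not the actual syntax extension code: `expand_named_args` and the `__arg_` prefix are hypothetical, and the sketch only shows that iterating a slice emits bindings in declaration order, whereas a hash map gives no such guarantee.

```rust
/// Toy model of expanding named format arguments into `let` bindings.
/// The arguments are kept in a slice, so the generated bindings (and
/// therefore the evaluation order of the expressions) follow the order
/// in which the arguments were written.
fn expand_named_args(args: &[(&str, &str)]) -> Vec<String> {
    args.iter()
        .map(|(name, expr)| format!("let __arg_{} = {};", name, expr))
        .collect()
}
```

For `&[("a", "foo()"), ("b", "bar()")]` this always produces the binding for `a` before the binding for `b`, so `foo()` runs before `bar()` on every build.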
bors added a commit that referenced this pull request Feb 28, 2014
@bors bors closed this Feb 28, 2014
@bors bors merged commit 017c504 into rust-lang:master Feb 28, 2014
@alexcrichton alexcrichton deleted the svh branch March 1, 2014 08:38
@huonw huonw mentioned this pull request Aug 4, 2014
flip1995 pushed a commit to flip1995/rust that referenced this pull request Apr 4, 2024
Successfully merging this pull request may close these issues.

Better error message for out of date dependencies?