-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make crate hash stable and externally computable. #10593
Conversation
r? @brson This probably isn't quite ready to merge as a few tests are still failing. Some help figuring out why would be appreciated. |
Ok, I'll build locally and see if I can figure out what's wrong. Perhaps we should infer the pkgid from the filename if one isn't provided, maybe print a warning - otherwise this is going to break a lot of code. How does this interact with rustpkg's notion of package id's? |
It currently infers pkgid the same way it used to infer link.name. Or at least that was my intent. rustpkg currently injects link attributes in, so I suppose I'll need to modify rustpkg to inject pkgid attributes instead if they aren't provided. I also noticed that some tests use a link_name attribute (ie, they have both link.name and link_name), whose purpose I am not sure of. Perhaps that will need to be dealt with too. |
Oh, I see where it's inferred now. The |
crateresolve6 and 7 don't work because now the two aux crates end up with the same output name, so trying to differentiate their loading by their crateresolve8 is fixed by this patch:
I guess |
Why Sha1 instead of Sha2? The existing Sha1 code was being used by rustpkg as, I believe, an implementation detail that could easily be updated or modified without being visible to other applications. I would imagine that if the create hashes are using Sha1, its going to be quite a bit harder to change that to something else after Rust 1.0. I could be wrong about that, but, if I'm not, I think it would be more future proof to use something like Sha2 which is also widely available and much more secure. |
issue-6919 is fixed by changing the pkgid to Fixing up these test cases to pass we can probably merge this pretty much as-is, but I think we need to straighten out some of the semantics of crate resolution and the extern mod statement pretty soon. |
@DaGenix the hashing here isn't providing any security that I can think of, just a unique identifier. |
@DaGenix I don't really care what hash function it uses, but it should be something that ships in most default libraries. The entire point of it is to be computable easily by other build tooling not from the Rust world. We were using SipHash, but the Python standard library doesn't include this. Nor is there a shell command that will calculate the SipHash of arbitrary input. SHA1 is widely enough deployed that shasum and standard libraries generally have it. Even with this, determining the file name the rust compiler will output is nontrivial. Here's an example Makefile function that does it: |
@brson Whats the impact if the unique identifiers don't turn out to be unique? Accidental collisions are probably highly unlikely, but, intentional malicious collisions are already theoretically possible. According to Wikipedia ( http://en.wikipedia.org/wiki/SHA-1) there is a theoretical attack that could find a collision at a cost of roughly $2.7 million today. That's unrealistic today for anyone but a nation state, but, its only a matter of years before that becomes more feasible. If Rust becomes as popular as it has the potential to be, how much of a pain would it be to change the hash in 5-10 years? It could be done today with an hour or less of work. Why not use Sha2? |
There's no need for an attack on SHA1, as you can simply set your pkgid to the same as someone else's package and it will generate the same hash. This is equivalent to you naming your library "libc" or "libpng" or something and trying to get people to install it. I considered just bash64ing the pkgid but hashing seemed the best way to get the result down to a reasonable number of characters with low probability of accidental collision. The goal I am aiming for is to be better than the completely flat namespace of /usr/lib that exists now. |
re-r? @brson I addressed your comments (I think) and all tests pass now. |
Also, I'm happy to squash after you review. I thought this would make it easier for you to see the changes. |
Ok, so, I was thinking that the crate hash was actually hashing up its contents and so a collision might fool someone into using the wrong library. Anyway, as you point out, that's not the case. However, If the goal is absolute compatibility, its probably hard to do better than MD5, although MD5 is exceptionally insecure. Sha1 is almost as compatible, and almost as fast, but only somewhat more secure. I think its hard to find someone that recommends using Sha1 for a new application, regardless of how is being used. Sha256 or Sha512 is nearly as ubiquitous as MD5 or Sha1 and runs at a competitive speed (The Sha256 rust implementation that I wrote that used to be in libextra was faster than the current Sha1 implementation, IIRC) while being much more secure. You don't need to follow the sentence "I'm using Sha256" with an explanation of why its not a security problem like you do with MD5 or Sha1. So, Sha1 seems like it exists in an odd middle area of not really being better than MD5 or Sha256. I'm also a little concerned about moving Sha1 into the util module of rustc - even though this current use doesn't really have security implications, it feels like this is just encouraging a future use that might. I don't know how persuasive these argument have been. However, if you are not opposed, I'm happy to whip up a patch to use Sha256 instead of Sha1 since I don't know what the benefit of Sha1 is other than that it is currently in the Rust tree. |
I just checked and all the places I care about (CMake, Lua, Python, shell, Ruby) all have sha256. I'm fine with using that. |
Ok, cool. I'll get a patch together tomorrow for that. |
I put at patch in the sha256-pkg-id branch of https://github.com/DaGenix/rust.git. I also sent a pull request for it to your repository, since I wasn't sure what the best way to post it was. |
I am getting failures when running make check in the rustpkg tests and I have no idea why. I'm quite certain they aren't due to the changes I made since some of the tests that are failing are failing trying to find certain files that don't have any hashes appended to them. I think its some issue with how those tests are trying to find files that isn't playing well with my system. |
I'll investigate and see if I can track down the failures. |
I was doing a build in a subdirectory of the main source folder. I dunno if that was causing the problems, but, I'm trying a full rebuild from the root of the source directory to see if that works better. |
What system are you on and which tests failed? I'm mostly through check-stage1 with your patch applied and not seeing any failures yet. |
I'm using 64-bit Ubuntu 13.10. Here are the errors i'm getting after running
|
I put the full output here: https://gist.github.com/DaGenix/7695515 |
lgtm but needs a rebase. You might consider adding a flag to generate the sha so you don't have to depend on a local sha utility. |
@brson Rebased. I fixed up more test cases as well. The final commit also fixes a makefile dependency error I encountered. |
This replaces the link meta attributes with a pkgid attribute and uses a hash of this as the crate hash. This makes the crate hash computable by things other than the Rust compiler. It also switches the hash function ot SHA1 since that is much more likely to be available in shell, Python, etc than SipHash. Fixes rust-lang#10188, rust-lang#8523.
I apparently forgot to run make tidy and had two lines that were too long. Fixed. |
This replaces the link meta attributes with a pkgid attribute and uses a hash of this as the crate hash. This makes the crate hash computable by things other than the Rust compiler. It also switches the hash function ot SHA1 since that is much more likely to be available in shell, Python, etc than SipHash. Fixes #10188, #8523.
This change is causing a serious problem. Library names are now completely different. And I don't mean the hash. I mean the actual name on-disk. The library name on disk is If I go ahead and add a |
I've filed this same commentary as issue #10922. |
…dnet fix(needless_return): do not trigger on ambiguous match arms return If we see a case where match returns something other than `()`, we just skip `needless_return` lint in that case Should fix rust-lang#10546 Relevant Zulip Discussion: https://rust-lang.zulipchat.com/#narrow/stream/257328-clippy/topic/Issue.20.2310546 --- changelog: FP: [`needless_return`]: No longer lints match statements with incompatible branches [rust-lang#10593](rust-lang/rust-clippy#10593) <!-- changelog_checked -->
This replaces the link meta attributes with a pkgid attribute and uses a hash
of this as the crate hash. This makes the crate hash computable by things
other than the Rust compiler. It also switches the hash function ot SHA1 since
that is much more likely to be available in shell, Python, etc than SipHash.
Fixes #10188, #8523.