Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster hash algorithm? #108

Closed
alexcrichton opened this issue Apr 21, 2017 · 8 comments
Closed

Faster hash algorithm? #108

alexcrichton opened this issue Apr 21, 2017 · 8 comments

Comments

@alexcrichton
Copy link
Contributor

Running sccache against a fully cached LLVM build on S3 I've found that 82% of the runtime of the server is consumed in sha1 hashing. That's quite a lot! Knowing C++ it is indeed hashing megabytes and megabytes of output...

I wonder if perhaps other hash algorithms have been considered? So far some pieces I've learned are:

  • The OpenSSL implementation of SHA1 is ~4x faster than the sha1 crate. (presumably due to fancy simd bits)
  • The ring implementation is the same speed as the sha1 crate (presumably because both are written in stable rust without simd)
  • The OpenSSL SHA256 implementation is ~2x faster than the sha1 crate
  • The ring implementation is also 2x faster than the sha1 crate
  • Various blake2b crates can get to about 3x faster than the sha1 crate
  • I'm sure md5 can be even faster still!

Are there various thoughts on the hashing algorithm here? I'd be wary of using OpenSSL due to its difficult-to-get-working Windows support. I think ring could be a good option but it's not that much faster in sha256 mode. Maybe blake2b is fast enough? (and I think still written in Rust).

@luser
Copy link
Contributor

luser commented Apr 21, 2017

I thought we had an issue open on this already, but I guess not! @glandium and I definitely discussed it in the past. Pulling in OpenSSL doesn't sound fantastic, but I'm open to any ideas here. If someone wanted to just pull the OpenSSL assembly implementations into a crate so we could use that that would be one way to go (but involve a fair bit of work). Simply switching to ring's sha1 or a blake2b implementation would be fine as well. We're not really wedded to sha1, it's just what the original Python sccache implementation used, and one of my goals with the initial port was to produce identical hashes as a sanity check. I did ask @briansmith about making rings digest implementations available standalone, but he was not interested in that due to the maintenance headaches (understandably so).

@briansmith
Copy link

briansmith commented Apr 21, 2017

SHA-512 is faster than SHA-256 on 64-bit platforms because SHA-512 uses 64-bit integers whereas SHA-1 and SHA-256 use 32-bit integers.

I assume that your files are generally 8KB or larger. Here are my numbers for ring on my laptop (Haswell):

test digest::sha1::_8192 ... bench: 35,484 ns/iter (+/- 974) = 230 MB/s
test digest::sha1::_8192 ... bench: 35,463 ns/iter (+/- 806) = 231 MB/s

test digest::sha512::_8192 ... bench: 12,111 ns/iter (+/- 323) = 676 MB/s
test digest::sha512::_8192 ... bench: 12,249 ns/iter (+/- 194) = 668 MB/s

So ring's SHA-512 is about almost exactly 3x the speed of its SHA-1. Also I think I have a prototype of ring's SHA-512 that is about 50% higher throughput (about 1000 MB/s in this benchmark) sitting around in some branch.

@briansmith
Copy link

BTW, this is exactly the kind of application which should not be using SHA-1 due to its poor collision resistance.

@alexcrichton
Copy link
Contributor Author

Aha yep! SHA-512 is 50% faster than SHA-256 (in ring locally) and is on par with blake2b locally.

@luser how'd you feel about a dependency on ring to use SHA-512?

@luser
Copy link
Contributor

luser commented Apr 21, 2017

That seems totally reasonable. Getting a big speedup plus extra collision resistance sounds like a win-win.

alexcrichton added a commit to alexcrichton/sccache that referenced this issue Apr 21, 2017
Local benchmarking showed that the implementation of SHA-512 in the *ring* crate
is 3x faster than the implementation of SHA-1 in the `sha1` crate. I've also
noticed that 80%+ of sccache's runtime on a fully cached build is spent hashing.
With this change I noticed a decrease from 108s to 92s when building a fully
cached LLVM from the network. Not a huge win but if the network were faster
could perhaps add up!

Closes mozilla#108
@briansmith
Copy link

Note: SHA-384 is exactly the same algorithm as SHA-512 (so same speed), but with a different initialization vector, and truncated to 384 bits. So if you were happy with SHA-1 you probably would be happier with SHA-384 than SHA-512.

Also there is something called SHA-512/256 which is is exactly like SHA-512 (same speed), but with a different initialization fector, and truncated to 256 bits. ring doesn't support that yet, but I wouldn't mind adding it if it would be useful.

(FYI: SHA-256 is very similar to SHA-512, but 32-bits instead of 64-bits, and with a half-sized block size, and of course with a different initialization vector. SHA-512/256 was invented to get SHA-256-sized digests at SHA-512 speed, after 64-bit computers became ubiquitous.)

@alexcrichton
Copy link
Contributor Author

Oh interesting, thanks for the info @briansmith!

If they're all the same speed I think the SHA-512 version would work fine for us as we're just storing in S3 or on the local filesystem anyway, so there aren't that many constraints on the length of the hash generated.

@est31
Copy link

est31 commented Apr 27, 2017

Related: rust-lang/rust#41215 , but the situations are a bit different as sccache apparently needs crypto strength hashes.

alexcrichton added a commit to alexcrichton/sccache that referenced this issue Apr 28, 2017
Local benchmarking showed that the implementation of SHA-512 in the *ring* crate
is 3x faster than the implementation of SHA-1 in the `sha1` crate. I've also
noticed that 80%+ of sccache's runtime on a fully cached build is spent hashing.
With this change I noticed a decrease from 108s to 92s when building a fully
cached LLVM from the network. Not a huge win but if the network were faster
could perhaps add up!

Closes mozilla#108
alexcrichton added a commit to alexcrichton/sccache that referenced this issue May 11, 2017
Local benchmarking showed that the implementation of SHA-512 in the *ring* crate
is 3x faster than the implementation of SHA-1 in the `sha1` crate. I've also
noticed that 80%+ of sccache's runtime on a fully cached build is spent hashing.
With this change I noticed a decrease from 108s to 92s when building a fully
cached LLVM from the network. Not a huge win but if the network were faster
could perhaps add up!

Closes mozilla#108
alexcrichton added a commit to alexcrichton/sccache that referenced this issue May 12, 2017
Local benchmarking showed that the implementation of SHA-512 in the *ring* crate
is 3x faster than the implementation of SHA-1 in the `sha1` crate. I've also
noticed that 80%+ of sccache's runtime on a fully cached build is spent hashing.
With this change I noticed a decrease from 108s to 92s when building a fully
cached LLVM from the network. Not a huge win but if the network were faster
could perhaps add up!

Closes mozilla#108
alexcrichton added a commit to alexcrichton/sccache that referenced this issue May 14, 2017
Local benchmarking showed that the implementation of SHA-512 in the *ring* crate
is 3x faster than the implementation of SHA-1 in the `sha1` crate. I've also
noticed that 80%+ of sccache's runtime on a fully cached build is spent hashing.
With this change I noticed a decrease from 108s to 92s when building a fully
cached LLVM from the network. Not a huge win but if the network were faster
could perhaps add up!

Closes mozilla#108
alexcrichton added a commit to alexcrichton/sccache that referenced this issue May 15, 2017
Local benchmarking showed that the implementation of SHA-512 in the *ring* crate
is 3x faster than the implementation of SHA-1 in the `sha1` crate. I've also
noticed that 80%+ of sccache's runtime on a fully cached build is spent hashing.
With this change I noticed a decrease from 108s to 92s when building a fully
cached LLVM from the network. Not a huge win but if the network were faster
could perhaps add up!

Closes mozilla#108
Xanewok pushed a commit to Xanewok/sccache that referenced this issue Dec 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants