-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Various changes to ChaChaRng
#243
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Just curious, how does ChaCha with 8 or 12 rounds compare to HC128 and ISAAC in terms of performance?
src/prng/chacha.rs
Outdated
// https://tools.ietf.org/html/draft-nir-cfrg-chacha20-poly1305-04 | ||
let seed = [0u8; 32]; | ||
let mut rng: ChaChaRng = SeedableRng::from_seed(seed); | ||
rng.set_counter(0, 2u64.to_be()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The user needs to call .to_be()
on the indexes to get portable results? That is counter-intuitive for a counter. Are you sure it's correct, given that actually set_counter
converts each u64
to two u32
s via bit-shift? I suspect it's not. (In fact, it's different from your example in the documentation.)
Think we should replace set_counter
with something easier to use correctly? Would it be better to use four u32
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comes from the test vector, which sets a 96-bit nonce to 00 00 00 00 00 00 00 00 00 00 00 02
. I am not sure I got it right with the .to_be()
call. Probably should just not add that test, because in the comment I claimed we do not support using it for a nonce.
Think we should replace set_counter with something easier to use correctly? Would it be better to use four u32s?
If we of think ChaCha in terms of an RNG and not a stream cipher, an u128
counter is best. And two u64
s an ok alternative. I don't think four u32
s are handy to set a counter.
If we really want to set a nonce it should take a byte array, to match from_seed
.
Just thought of another reason not to support setting a nonce: we still use it as part of a counter, and not as a constant value. So if you run the RNG long enough, eventually the counter will tick over and increment the nonce., producing 'incorrect' results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then probably that test should use:
let counter = 2u64;
(counter >> 32) | (counter << 32)
— but I don't understand why the test passes as-is (it sounds like state[15]
should be set to 2u32
).
To be honest, I'm not really sure where this function would be useful, if at all. Perhaps we should also add:
fn set_nonce(&mut self, counter: u64, nonce: [u8; 8])
So if you run the RNG long enough, eventually the counter will tick over and increment the nonce., producing 'incorrect' results.
The only correct behaviour would be to stop with an error, which conflicts with our primary usage (as a good RNG). If used to encrypt/decrypt a longer byte stream, I guess it has to be up to external code to handle this part correctly (e.g. split into chunks, or just continue encrypting with a note on compatibility).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened an issue: dhardy#86
For now just try shifting the nonce as I suggested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then probably that test should use:
let counter = 2u64; (counter >> 32) | (counter << 32)
— but I don't understand why the test passes as-is (it sounds like
state[15]
should be set to2u32
).
Your code misses the endian troubles... But my 2u64.to_be()
was wrong also because it does not get converted on BE. rng.set_counter(0, 2u64 << 56);
works. But I am quite happy to see the test gone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A slice [0u8, 0, 0, 2]
is the same as 0x0200_0000u32
in little-endian, the format we standardised upon inside the RNG. Together with the u32
s that are in opposite order the value is 0x0200_0000_0000_0000u64
.
src/prng/chacha.rs
Outdated
/// | ||
/// [1]: D. J. Bernstein, [*ChaCha, a variant of | ||
/// Salsa20*](https://cr.yp.to/chacha.html) | ||
/// ChaCha uses add-rotate-xor (ARX) operations as basis. These are safe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"as its basis"
src/prng/chacha.rs
Outdated
/// | ||
/// With the ChaCha algorithm it is possible to choose the number of rounds the | ||
/// core algorithm should run. By default `ChaChaRng` is created as ChaCha20, | ||
/// with means 20 rounds. The number of rounds is a tradeoff between performance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"which means"
src/prng/chacha.rs
Outdated
/// With the ChaCha algorithm it is possible to choose the number of rounds the | ||
/// core algorithm should run. By default `ChaChaRng` is created as ChaCha20, | ||
/// with means 20 rounds. The number of rounds is a tradeoff between performance | ||
/// an security, 8 rounds are considered the minimum to be secure. A different |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and security
src/prng/chacha.rs
Outdated
/// core algorithm should run. By default `ChaChaRng` is created as ChaCha20, | ||
/// with means 20 rounds. The number of rounds is a tradeoff between performance | ||
/// an security, 8 rounds are considered the minimum to be secure. A different | ||
/// number of rounds can be set with [`set_rounds`]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set using set_rounds, or via
Good comments, thank you.
Still slowest, but not bad at all (with 8 rounds):
|
benches/generators.rs
Outdated
gen_uint_new!(gen_u64_std, u64, StdRng); | ||
gen_uint_new!(gen_u64_os, u64, OsRng); | ||
|
||
// Do not test JitterRng like the others by running it RAND_BENCH_N times per, | ||
// measurement, because it is way to slow. Only run it once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's time you stopped confusing 'to' and 'too'!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I will try too not make to many mistakes with them anymore. Starting tomorrow 😄.
src/prng/chacha.rs
Outdated
} | ||
|
||
/// Refill the internal output buffer (`self.buffer`) | ||
fn update(&mut self) { | ||
core(&mut self.buffer, &self.state); | ||
// For some reason extracting this part into a seperate function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
separate
I tried using the code from @PeterReid's implementation of ChaCha, integrating it was super easy. Results on stable:
Results on nightly with
Compared with our current implementation:
The result with SIMD can be improved again with ±3% by making the state also I am not sure if this change is worth it at the moment, because it is only faster on nightly. But nice to know that our implementation is actually pretty decent already. |
Think I'll go ahead and merge manually. It's clear from the Travis failures that GitHub messed up the merge :-( |
This PR is based on top of #233.
Parts of the changes here come from my attempts with a with
BlockRngCore
trait, like #242. And better to finish this seperately, so all functional changes and changes to the documentation can be reviewed better.The
init
function contained useful documentation that was not visible becauseinit
was not a public function. So I moved that to the documentation of ChaChaRng. Also I wrote some more in a similar style as HC-128.I added some tests and better documentation for
set_counter
. Trying to use it to set a nonce is difficult, considering the trouble with endianness and different standards specifying different schemes for the nonce and counter. So I documented it to be not a supported use ofset_rounds
.It turned out to be easy to make the number of rounds the core ChaCha algorithm runs configurable, just the addition of a
set_rounds
method.And I stumbled on a way to improve the performance, by using an
usize
for the number of rounds and keeping the core algorithm a seperate function. No idea why it works though.Benchmarks before:
Benchmarks after this and #242:
Finally remembered why it is a good idea to wait with generating results until the first use: it makes things like
ChaChaRng::new().set_counter(500)
possible.cc @burdges, because I know you have an interest in ChaCha.