Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Future of SmallRng #767

Closed
vks opened this issue Apr 8, 2019 · 8 comments
Closed

Future of SmallRng #767

vks opened this issue Apr 8, 2019 · 8 comments
Labels
B-API Breakage: API

Comments

@vks
Copy link
Collaborator

vks commented Apr 8, 2019

SmallRng is an RNG unsuitable for applications where unpredictable randomness is required, making it a poor default choice. There are currently two reasons to choose it over StdRng:

  1. Better performance. I think this advantage can be rendered obsolete by using a state-of-the-art SIMD implementation of ChaCha, which has comparable performance (Would this repo be open to a SIMD implementation of ChaCha20? #667).
  2. Smaller state size. However, ChaCha is only 8.5 times larger than StdRng, making it a suitable choice even for embedded applications. Remaining applications are massively parallel calculations, where the small state size might still be preferable, but this seems like a very niche use case that could be supported by a different crate.

Once we use a good SIMD implementation of ChaCha, there are few use cases remaining where SmallRng would be preferable. I think we could remove it, which has the following advantages:

  1. Less API and code in rand.
  2. Only secure RNG options in rand, requiring a more explicit opt-in to use a predictable RNG (by using an additional crate).

Note that the current implementation of StdRng is significantly slower and larger than SmallRng, so I don't think we should deprecate SmallRng just yet.

@vks vks added B-API Breakage: API T-RNG labels Apr 8, 2019
@dhardy
Copy link
Member

dhardy commented Apr 8, 2019

Well reasoned. As I understand, micro-benchmarks of RNGs (especially of simple ones like PCG) are not all that representative of real-world usage. Additionally, my understanding is that SIMD instructions can have varying performance depending on CPU, power provision, thermal limits and use of hyperthreading, so a SIMD CSPRNG which can win micro-benchmarks on modern x86_64 CPUs does not make this the fastest option for everyone.

All the same, I agree with the aim. Many users can switch to StdRng without issue; other users can easily switch to rand_pcg or another crate, getting more control over which PRNG they use.

@vks
Copy link
Collaborator Author

vks commented Apr 8, 2019

Benchmarks can be found here: https://bench.cr.yp.to/results-stream.html

Notably, benchmarks for generating 8 bytes are included, where SmallRng likely has an edge. (Note that our current choice, HC128, is usually the slowest (behind HC256) for generating 8 bytes.)

@dhardy
Copy link
Member

dhardy commented Apr 8, 2019

Encrypting an 8-byte message is quite a different application to, say, generating a boolean with f64 probability. Part of the attraction of simple RNG algorithms is that the optimiser may be able to inline the RNG code within its application, whereas block RNGs must generate a whole block in one step. So SIMD ChaCha20 doing well in one benchmark isn't sufficient evidence to recommend it for everything (even then, it appears 20-100 times slower for 8-byte messages than for long messages).

Not really sure what your point is though, since we seem to agree on what we should do? (The first step being to optimise our ChaCha impl.)

@vks
Copy link
Collaborator Author

vks commented Apr 8, 2019

even then, it appears 20-100 times slower for 8-byte messages than for long messages

We are always generating a 64 byte block, where the ChaCha slow-down is less than 16 times, according to the benchmarks.

Not really sure what your point is though, since we seem to agree on what we should do?

I guess the point is that we have to be careful about deprecating SmallRng with regard to next_u64 performance. We should probably advertise fill more and make it work with f64 and f32.

@dhardy
Copy link
Member

dhardy commented Apr 8, 2019

IIRC our advice on performance has always been roughly as follows: StdRng and thread_rng should be fast enough for most purposes; if their performance is not satisfactory then benchmark multiple PRNGs in your application. The point being, I don't think we should worry too much about speed when deprecating SmallRng.

Is your last point about SIMD float operations? #269 is probably relevant here, but even more so the SIMD implementations for Uniform. Potentially other distributions could get SIMD implementations too (but most are currently limited to f64 output only).

However, SIMD float operations are only useful operations for applications which can consume them; e.g. agent based sims & games would need agents to be batched and avoid most branching logic. This kind of optimisation wouldn't help everyone.

@vks
Copy link
Collaborator Author

vks commented Apr 10, 2019

There is another permutation that was designed to be smaller and faster for short messages than ChaCha: https://gimli.cr.yp.to

It uses 384 bytes of state, so it is only 3 times larger than SmallRng.

@dhardy
Copy link
Member

dhardy commented Jun 4, 2019

#792 introduced a feature-gate for SmallRng. I think we can now close this issue?

@vks
Copy link
Collaborator Author

vks commented Jun 4, 2019

Yes. Based on how much the feature is used, we can decide whether to keep it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-API Breakage: API
Projects
None yet
Development

No branches or pull requests

2 participants