-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster and Better weak generators #59
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
Hi! Xorshift+ sounds like a sensible addition. As far as PCG is concerned, I’d implement it in an external crate (there’s a few on cargo already).
The “for the platform” is a pretty hard thing to do – we don’t want to start benchmarking all the xorshifts at runtime, nor do we want to make weak_rng to return XorShift on some platforms and XorShift+ on others. Therefore we have to rely on eyeballing and pick an algorithm that seems to be fast(est) on as many platforms as possible (our current pick being XorShift). Changing the rng returned by |
@nagisa by "fastest algorithm available for the platform." I meant, that 64bits platform should default to Xorshift128+ as it's better overall, both quality and speed wise. I agree that it'll break some code (not much though) but I'd argue that it's both worthy and expected as per semver. |
#[inline] | ||
fn next_u64(&mut self) -> u64 { | ||
const MULTIPLIER: w64 = w(6364136223846793005); | ||
const INCREMENT: w64 = w(1442695040888963407); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This indentation (and a good amount of indentation below) is a little off.
r? @huonw, you may have more opinions on this than I
While I agree we don't want to determine the fastest one at runtime, I do think that we should change the return type of For now I'd avoid changing |
Erm, maybe we're using different notation, but I thought the generator implemented here was xorshift128. |
@huonw yeah, I was refering to the algorithm name of the XorShiftRng. We should consider renaming both to XorShift64Rng and XorShift128PRng though. |
Hm, everything I can remember reading calls our thing xorshift128, e.g. https://en.wikipedia.org/wiki/Xorshift , with xorshift64 referring to a different variant with 64-bits of internal state. (I'm trying to clarify to make sure we're all on the same page.) |
8a8cdcc
to
42d60b9
Compare
Updated the PR code to
@huonw yaaa, somehow 64 sticked into my head. Thank you. |
@nagisa There's one by EricIO and one by codahale, which look to be minimal ports of Pcg. I just started a full port of the entire C++ reference implementation, mostly as an exercise in porting over C++ code using template mixins. I aim to also port the crazy "extended" generators that allow for generators with arbitrarily large periods... y'know, because it's there. Any input in relation to this crate is welcome. |
@alexcrichton With respect to "saving up" breaking changes, I've gotten into the habit of having a "breaks" branch that PRs with breaking changes merge into. I rebase it to keep it up to date with master, and then merge it in when I'm ready to push a new major release. It's nice since you can still accept breaking PRs instead of leaving them in limbo for an unknown period of time. |
Just to make it clearer, I removed all breaking changes on this. We can integrate the Weak abstraction (to hide the implementation) in a subsequent pr/version. |
Hello again, any updates on this? I believe this is a good addition as it is. It'd be specially useful if we hid the implementation of weak_rng() in a WeakRng struct, this way we could hide the fastest implementation behind it. But this is a breaking change and we must make a decision. Anyway I believe this should be merged to encourage further improvements down the road (like the previous mentioned WeakRng struct). |
I would like to stick my oar in: Please do NOT provide any "weak" algorithms. Once they are in, they can't be got rid of because RNGs need to be reproducible even decades later; but bad randomness is an endless well of subtle bugs, often security-critical bugs. I would hesitate even to provide ISAAC. |
@alexcrichton I think maybe you should weigh in here? |
For what it's worth, I don't care at all about whether we provide any weak RNGs, as long as they're marked as such, but I do care a lot that we don't encourage userspace CSPRNGs. |
@pcwalton Why do you want to discourage the use of userspace CSPRNGs? I ask largely because people have been making noises about adding ChaCha-based |
I think the idea with avoiding userspace CSPRNGs is that you are less likely to keep them adequately (re)seeded, their state hidden from side channels, and their implementation up to date with contemporary assumptions about what constitutes "secure" as you are if you just always call the kernel's variant. I mostly agree with this sentiment and, as the person who added ISAAC in the first place, wouldn't necessarily miss it if it were removed and the default set to It does kinda break my heart to have a weak RNG in a stdlib at any level, certainly if it winds up turning into the default simply to push back on FUD from a bunch of lousy benchmarks. It feels like a bad trade to me, better to leave in an external crate and called |
I agree with @pcwalton that I don't mind having these so long as they're documented as not secure at all, from time to time I've enjoyed having a "not fancy" RNG so I think there's real use in having them. That being said I'll still defer to @huonw as he's got a WIP design for a revamped rand crate. |
After chatting with some other RNG enthusiasts I'd advocate for removing non-crypto PRNGs from |
I'm with @bhickey here. A simple implementation of XorShift is all of three lines of code, and even if you mess up writing it it's basically irrelevant because you have no expectation of quality randomness from XorShift in the first place. (Not to cast aspersions on the implementation here, which looks much more complete and featureful--but I'd still prefer it live as a third-party crate.) I also think that having separate interfaces for simulation-worthy RNGs and crypto-worthy RNGs is worth investigating in third-party crates for now. |
I believe there's very little benefit in splitting the crate. This way we can have the same interfaces and extras like we have today. I'm all in for keeping secure PRNGs as the default though. |
Couldn't this split also be achieved by having two top-level modules within the crate: |
@arthurprs There's no reason the interface would need to change. If we forked the crate today @DanielKeep |
The last few times I've done randomness-intensive simulations I've used AES-CTR. With hardware acceleration (laptop with AES-NI) it's more than fast enough, and the peace of mind is great (and of course it's reproducible). So "simulations" versus "cryptography" may not be nuanced enough. |
@sorear I don't know if that is really a good enough reason to say that we might not need a fast simulation focused PRNG. When optimizing a game for diverse hardware I would rather have a 10 instruction decent PRNG then hope that whatever machine one of my users is on has the acceleration needed. Also can you confirm that it is reproducible across all hardware accelerated instances? and is that hardware implementation consistent with the fallback software instance? I agree that the discussion is nuanced but it seems to me that some people just need a "This is going to be stupidly fast in all situations" PRNG even if that means sacrificing crypotographic security. |
@robojeb To be clear, I'm not proposing to use cryptographic key-material RNGs for simulations. Those are not reproducible by design. However, stream ciphers do produce reproducible output, and are consistent between implementations (they have to be, because if the Intel AES-CTR implementation and the software reference implementation didn't produce exactly the same keystream bytes, your encrypted data wouldn't round-trip). |
@sorear that is a great point that totally slipped my mind. I think my main issue in this discussion is discoverability. Don't get me wrong I love the fact that cryptographically good number generators are easy to access. That is great for the rust community and they should be used when needed. I think though that it is equally usefull to the community to promote fast PRNG with adequate documentation to teach people when they are the right option. Now I don't know if that means putting them in |
Another thought I just had. Could good generators be tagged with a marker trait like |
I was going to argue that a SIMD ChaCha20 implementation would be much faster than our current implementation, and write SIMD versions of ChaCha20 and AES (AESNI) to include in your benchmarking tables above. However, it turns out that I'm way out of my depth for writing SIMD code in Rust right now, so I'll just offer http://bench.cr.yp.to/results-stream.html instead. I regard "weak reproducible RNGs" as a very niche application because the strong reproducible RNGs (stream ciphers) well implemented are so fast on recent mid-range hardware, but I don't have anything else to back that up with so I'll stop now. |
Just stopping by to remind readers of the points made above re: not having a CSPRNG in userspace. It's not terribly important -- cryptographically -- if something's fast if it's inadequately seeded or uses an algorithm that gets broken. The OS should provide the secure variant (linux even has a faster syscall for it) (And the insecure variant, if it exists at all, should be marked as such very clearly) |
@graydon It's not clear to me what exactly you are arguing for or against. It seems to me you are not arguing against userspace stream ciphers (I don't like the term CSPRNG here, since it kind of implies forward and backward security are a goal), but instead you are arguing against the overall API. That is, you are arguing against any generator being able to be seeded. Instead, that trait should be gone and essentially only But |
I'm arguing that since chacha is slower than pcg and xor, simulation-focused users won't want it. And that it's unwise to leave it lying around in case users who want cryptographic randomness pick it up in favour of osrng. I realize there are two classes of users here with different needs. I'm arguing it's best to address them separately. |
And I am arguing that simulation-focused users are best served with a generator that is guaranteed to actually be statistically indistinguishable from random. Maybe that's not ChaCha20; maybe it's AES-CTR, maybe it's ChaCha8, maybe it is something else (note that the performance of ChaCha on x86 is lacking, due to the absence of SIMD capabilities on Rust at the time of implementation). But regular generators, such as LCGs, Xorshift, and so on have repeatedly shown to be far from random, despite happening to pass some static (and rarely growing) set of standard statistical tests. I believe the distinction between use cases is best done through API, not primitive names. |
Be careful to differentiate LCGs from PCG. The latter have good statistical properties and are fast. I ... guess the topic is in "private email" now so unclear whether there's any use writing here. |
Just an update related to the PR algorithm choice. See http://v8project.blogspot.de/2015/12/theres-mathrandom-and-then-theres.html?m=1 |
Javascript engines (specifically wrt Math.rand) are in a strange position where they care neither about cryptographic security nor about reproducibility. I would expect a blessed rust-lang crate to pick one of those two concerns and do it well. Generators that provide low-quality randomness can live on crates.io (or you can copy/paste a trivial three-line xorshift implementation into your code). |
What I meant is that xorshift128+ is a great choice for the weak generator. I don't really get why so much resistance against this. 99,9% of time you just want a good distribution. And the default generator already is the secure one. |
Ah, I thought you were implying that we should so like JS engines do and make xorshift+ the default generator. In any case, I don't think that movement on behalf of JS engines addresses the prior objections to this PR wrt whether or not weak generators should have blessed implementations at all. In browser-land, a pure JS PRNG implementation is leaving a lot of performance on the table, which is why a native implementation is desirable. Rust doesn't have the same restrictions, and Cargo makes it trivial to pull in a faster implementation when necessary. I think we're all running around in circles wrt to this entire library at this point. Two people in here have already suggested that they're working on designs for total overhauls, so I'm not sure how we want to proceed (especially considering that I haven't seen public demonstrations of either of these initiatives yet). |
Sorry for the delay @arthurprs, but I wanted to say thanks again for the PR! The libs team discussed this PR during triage yesterday, and the conclusion was that for now we don't want to include any other RNG implementations in the Unfortunately this crate is lacking a vision of how to proceed forward which makes it difficult to say whether we want this sort of RNG eventually. We hope to start devoting some time to this crate soon, but help is always appreciated! In the meantime I'm gonna close this, but please feel free to publish this on crates.io! |
Thanks Alex, I'll move the code to a crate. |
I took a couple of days to research some fast pseudo-random number generators and I thought it'd be nice to contribute to the crate.
This PR contains two commits one adding the xorshift128+ and the other PCG RXS M XS (32 and 64bit variants). I don't think there's precedence for adding both, so I think we should discuss how to proceed.
x64 benchmarks (my laptop)
x86 benchmarks (DO)
PCG is neat, it's very compact and provide good quality, even in the 32bit variant!
Although my preference would be to add the xorshift128+ variant and leave the xorshift128 there (probably worth a rename though).
In addition weak_rng should return the fastest algorithm available for the architectures. As 64bit operations kills performance in 32bit architectures this can be as simple as Xorshift64 for 32bit and Xorshift128+ for 64bit.
Finally, users won't need to look elsewhere for a crazy fast random implementation.
Thoughts?