Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace convert_slice_{32,64} with read_u{32,64}_into #77

Merged
merged 2 commits into from
Dec 27, 2017

Conversation

pitdicker
Copy link

And a little cleanup around the init functions

And a little cleanup around the init functions
Copy link
Owner

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my comments are either feature creep or something we should have picked up before. I know from_rng is often used with fresh entropy, but it can be used to seed from a deterministic master PRNG, so I think it should be portable. Agree?

/// Reads unsigned 32 bit integers from `src` into `dst`.
/// Borrowed from the `byteorder` crate.
#[inline]
pub fn read_u32_into(src: &[u8], dst: &mut [u32]) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine. Oh, you copied the name from byteorder? I guess it'll do.

unsafe {
let ptr = seed.as_mut_ptr() as *mut u8;
let slice = slice::from_raw_parts_mut(ptr, SEED_WORDS * 4);
other.try_fill(slice)?;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're converting from bytes to u32 here. I think this will produce different results on BE CPUs? We went to all that trouble to make sure fill_bytes is portable, so I think this should be too. Maybe the best option is to use from_seed here :/

unsafe {
let ptr = seed.as_mut_ptr() as *mut u8;
let slice = slice::from_raw_parts_mut(ptr, 8 * 4);
let slice = slice::from_raw_parts_mut(ptr, SEED_WORDS * 4);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as ChaCha: watch Endianness

unsafe {
let ptr = key.as_mut_ptr() as *mut u8;
let ptr = seed.as_mut_ptr() as *mut u8;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess here too

unsafe {
let ptr = key.as_mut_ptr() as *mut u8;
let ptr = seed.as_mut_ptr() as *mut u8;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All four are now share the same code 😄

@pitdicker
Copy link
Author

pitdicker commented Dec 22, 2017

I am not sure what you mean with feature creep.

I know from_rng is often used with fresh entropy, but it can be used to seed from a deterministic master PRNG, so I think it should be portable. Agree?

I would argue that we should be able to rely on from_rng to get actual random numbers that are not reproducible, especially for cryptographic RNG's like here. And if someone wants reproducible results from_seed should be used, from_rng is the wrong API.

But making from_seed endian-independed should not be difficult, and we can just as well do it.

I played with from_rng calling from_seed. It would do some unnecessary (unaligned) copying, but that should not really have measurable impact. For ISAAC we use much more bytes as seed with from_rng than with from_seed, so there it would not work. Iterating over the seed with to_le in from_rng is maybe the easiest option.

@dhardy
Copy link
Owner

dhardy commented Dec 22, 2017

I suppose we can just document that from_rng is not portable, so SubRng::from_seed(master_rng.gen()) or similar should be used.

Of course both from_seed and from_rng can be used with fresh entropy to get non-reproducible numbers. As @burdges has already pointed out, the two are redundant, apart from portable reproducibility and optimisation, so I guess you are right.

@pitdicker
Copy link
Author

Thinking some more about it, a function that just works without pitfalls is worth more than a bit simpler implementation, or optimization on uncommon architectures (which are probably unmeasurable small for from_rng).

I will update the from_seedimplementations this evening or tomorrow, and add tests for it.

What did you mean with feature creep?

@dhardy
Copy link
Owner

dhardy commented Dec 22, 2017

My comments about requiring from_rng to be portable is feature creep; we didn't require this before.

If we make from_rng endian-fixed then I wonder why we should have it at all (it should be possible to write MyRng::from_seed(rng.gen()) eventually I think). The other point is something I mentioned before: making non-portable RNGs like StdRng implement SeedFromRng but not SeedableRng; if both methods are portable then this decision no longer makes sense. So maybe we should just drop from_rng entirely?

Currently PRNGs implement Rand. We could potentially implement this (or a distribution which then implements Rand) automatically for anything implementing SeedableRng (though this is a breaking change since existing Rand implementations must be removed).

@pitdicker
Copy link
Author

from_rng has a few advantages:

  • Xorshift and similar RNG's that don't work with some seeds can just ask for a new one.
  • Being portable does not make StdRng and other wrappers any more deterministic, it is still possible to swap the wrapped RNG. Not implementing from_seed helps make this clear a little more.
  • In try_fill we often have to do a copy from the output of an RNG to the potentially unaligned byte buffer. From this buffer it is copied (unaligned) into a seed in from_seed. from_rng avoids the extra copies. With a little effort LLVM may be able to optimize this out though.
  • from_rng can use a different seed length like it does now in ISAAC. And some optimizations are possible, like skipping an init function and filling the state directly. Although that may not be a good idea.

But I agree this are minor advantages.

I leave the call to you. Shall I leave from_rng as it is? Make it endianness-independend? Or merge it into SeedableRng and give it a default implementation?

@dhardy
Copy link
Owner

dhardy commented Dec 22, 2017

No, your arguments make sense. Aside from one thing: surely the seed buffer passed to try_fill will be aligned? If need be you could enforce the alignment yourself by creating it with word-sized elements then casting to a byte array.

By portable I mean produces the same results everywhere, including in future versions.

@pitdicker
Copy link
Author

surely the seed buffer passed to try_fill will be aligned

But is LLVM able to figure that out? If the functions are not inlined it has no choice but use slower unaligned copy functions. But that is something to try and play with, probably we can figure something out. Or just measure the impact, which may be tiny.

@pitdicker
Copy link
Author

I have added a commit that folds from_rng into the SeedableRng trait.

With some extra bounds on SeedRestriction it was possible to give from_seed a default implementation that uses from_seed. Performance optimizations for it is something I have not yet looked into, maybe not even worth it.

In Xorshift from_seed no longer panics, but replaces an all zero seed with 0xBAD5EED as discussed in #67. It still has a custom from_rng method to request new values if they are all zero (as unlikely as that is). That also now uses try_fill instead of next_u32 so it can pass on errors.

In Isaac{64} I kept the custom try_fill that fills the whole state, and made it endian-independent.

@dhardy
Copy link
Owner

dhardy commented Dec 23, 2017

Hang on, weren't you just arguing that from_rng should be kept separate?

@pitdicker
Copy link
Author

I though you were thinking in the opposite direction, and I saw these advantages only as 'minor' 😄. How do you think it turned out? We do not seem to lose much, because it is possible to override the default implementation.

@dhardy
Copy link
Owner

dhardy commented Dec 23, 2017

The most significant change is that StdRng must now implement SeedableRng which means there's no longer that "documentation" that the PRNG shouldn't be relied on to be reproducible. But both ways it's a minor thing really.

With specialisation we should be able to do a default implementation of from_rng even if the traits are separate, if we want to do that. I'm undecided.

@pitdicker
Copy link
Author

Yes, true. I don't think it will cause big problems though. There is good documentation, requiring deterministic numbers is (I think) not that common, and the fix is to just import the right RNG. And now we have one less trait for users of rand to think about.

@dhardy
Copy link
Owner

dhardy commented Dec 26, 2017

In that case, is there even much point in having from_rng at all? I guess it depends whether we can make X::from_seed(rng.gen()) work. I removed the code for generating fixed-size arrays with Rand early in my branch, but there's no reason it can't be added back (for distributions::Default), maybe for all array sizes up to 32 plus some more power-of-two sizes.

This brings up another point, the rationale for from_rng was initially to replace Rand implementations to prevent naive users from writing rng.gen() where rng is XorShiftRng or another weak RNG with potential accidental-copy issues. Assuming we actually replace XorShiftRng we don't need to worry so much about this. On the other hand, I suppose from_rng can now require a CryptoRng source while leaving it easy for users to seed from a non-crypto RNG if desired.

Note: almost all of this still needs merging into upstream, and needs some rationale.

@pitdicker
Copy link
Author

almost all of this still needs merging into upstream, and needs some rationale.

And from_rng is a new method, so be must be able to 'sell' it 😄.

I think we can say from_rng is not absolutely needed. But that you added it feels like the right call to me. It already gives minor advantages for Xorshift and ISAAC.

I am not sure if this is the right way to think about it, but to me the available methods in an API tell me how it is supposed to be used.

  • How do I initialize this PRNG? Seed from another RNG with from_rng.
  • How do I seed this PRNG in a reproducible way? Use from_seed.
  • Are reseeding frequently or custom reseeding algorithms important? We don't really consider it to be, so reseed is not part of the core API, but a wrapper.
  • How do I use this ergonomically? NewSeeded provides a new method around from_rng. A from_hash method helps to make from_seed easier to use.

Not checked yet, but X::from_seed(rng.gen()) does not really work well with our new error handling, right? Not unsolvable of course, just add a try_gen method. Is it fair to say gen is more suited or even intended for PRNG's, not really as an interface for things like OsRng?

@dhardy
Copy link
Owner

dhardy commented Dec 27, 2017

Good point about the error handling. All the distributions and Rand code assumes infallible RNGs; i.e. panic or best-effort on error, so X::from_seed(rng.gen()) should be workable (with a few changes) but can't do proper handling like XorShiftRng::from_rng(rng) might retry on bad seed.

Adding rng.try_gen()? would require fallible distributions — implementing it for byte-array types would be easy enough, but e.g. generating floating-points fallibly would require duplicating a lot of code (or abstracting over error handling), and generating integers would really need try_next_u32 (we never quite agreed we don't need that... grimace).

Ok, so we do want from_rng after all.

@dhardy dhardy merged commit 5d9acdc into dhardy:master Dec 27, 2017
@pitdicker pitdicker deleted the blockrng branch January 21, 2018 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants