Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CryptoRng in RngCore trait objects #1143

Closed
aticu opened this issue Jul 10, 2021 · 26 comments · Fixed by #1187
Closed

Support CryptoRng in RngCore trait objects #1143

aticu opened this issue Jul 10, 2021 · 26 comments · Fixed by #1187

Comments

@aticu
Copy link

aticu commented Jul 10, 2021

Summary

Change CryptoRng such that it can be used as a random number generator through a trait object, by making it a subtrait of RngCore.
This would break existing CryptoRng implementations, which were previously not implementing RngCore, but those were likely of limited use.

Details

Change the CryptoRng trait from this

pub trait CryptoRng {}

to this

pub trait CryptoRng: RngCore {}

Motivation

Currently the use of RngCore is supported in trait objects, however because of how trait objects are implemented, it is not possible to use CryptoRng in addition in the trait object, thus making it impossible to actually use a CryptoRng as a random number generator in trait objects.

This change is really small and likely has very little negative impact on users who don't need this change, but it can enable another way to use the API for users who do need it.

Alternatives

  • Add a CryptoRngCore trait to the rand_core crate:

    trait CryptoRngCore: RngCore + CryptoRng {}
    impl<T: RngCore + CryptoRng> CryptoRngCore for T {}

    This would allow users to use RngCore + CryptoRng types through trait objects as well, but at the cost of an additional trait in the (I presume intentionally) small rand_core crate.

  • Do nothing and let users implement the CryptoRngCore trait above if they need it.

    This allows them to use RngCore + CryptoRng types through trait objects, but since it isn't a unified solution, multiple equivalent traits with the exact same purpose and definition might pop up over time.

    Optionally this pattern could also be documented in the CryptoRng trait.

@dhardy
Copy link
Member

dhardy commented Jul 10, 2021

Thanks for the clear RFC. I agree that the initial proposal makes sense, though it will likely to a while to land (or at least to affect downstream crates like rand) since it is a breaking change.

Either way, we should allow some time for comments before making a decision.

@newpavlov
Copy link
Member

I wonder why we haven't done it initially... I vaguely remember there was kind of a reason for the current design, but I can't remember what was it exactly. If there are no significant obstacles to it, I also would prefer the subtrait approach.

@vks
Copy link
Collaborator

vks commented Jul 10, 2021

Would it be possible to cast a dyn RngCore to a dyn CryptoRng?

@aticu
Copy link
Author

aticu commented Jul 10, 2021

Would it be possible to cast a dyn RngCore to a dyn CryptoRng?

I don't think that would be possible even with unsafe code. At least not without undefined behavior, since you'd need have a valid dyn CryptoRng vtable somewhere, which is different from the vtable of dyn RngCore. Even if their layout is likely to be identical, this isn't guaranteed (unless I'm missing something here).

There would also not be any guarentee that the original RngCore object also implements the CryptoRng trait, making it UB to cast it without checking that some other way, even if such a cast was possible safely.

@burdges
Copy link
Contributor

burdges commented Jul 11, 2021

I'd expect casts from dyn RngCore to dyn CryptoRng would never work, but more importantly rust has no roadmap for casts from dyn CryptoRng to dyn RngCore, making them incompatible for the foreseeable future. ATCs make casting among trait objects possible but complex, ala https://internals.rust-lang.org/t/casting-families-for-fast-dynamic-trait-object-casts-and-multi-trait-objects/12195

We never knew if CryptoRng makes much sense: We've so many like ChaChaRng where security depends upon the seed, so usually one wants a random oracle here. We never knew if associated constants would eventually do this better or if more detailed information helps.

pub enum RngSecurity {
    Garbage,
    NumericalMethods,
    ???,
    RandomOracle,
    RandomnessSource,
}
pub trait RngCore {
    const SECURITY: RngSecurity = RngSecurity::RandomOracle;
    fn next_u32(&mut self) -> u32;
    fn next_u64(&mut self) -> u64;
    fn fill_bytes(&mut self, dest: &mut [u8]);
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error>;
}

I suppose one solution would be to deprecate CryptoRng but encourage people say what they want:

pub trait RngCore {
    fn information() { eprint!(::std::any::type_name()) }
    fn contract(msg: &str) {
        #[cfg(feature = "contractreport")]
        {  information();  eprintln!("RngCore contract: {}", msg);  }
    }
    fn next_u32(&mut self) -> u32;
    fn next_u64(&mut self) -> u64;
    fn fill_bytes(&mut self, dest: &mut [u8]);
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error>;
}

pub trait SeedableRng: Sized {
    fn information() { eprint!(::std::any::type_name()) }
    fn contract(msg: &str) {
        #[cfg(feature = "contractreport")]
        {  information();  eprintln!("SeedableRng contract: {}", msg);  }
    }
    type Seed: Sized + Default + AsMut<[u8]>;
    fn from_seed(seed: Self::Seed) -> Self;
    fn seed_from_u64(state: u64) -> Self { ... }
    fn from_rng<R: RngCore>(rng: R) -> Result<Self, Error> { ... }
    fn from_entropy() -> Self { ... }
}

so folks could insert any warnings the like via say

fn information() { eprint!("Non-RandomOracle {}", ::std::any::type_name()) }

and if you ask then you get messages like "Non-RandomOracle XorShiftRng contract: Random Oracle".

Ideally formal methods and design by contract folks would spend more time developing rust tooling outside rustc itself.

@dhardy
Copy link
Member

dhardy commented Jul 11, 2021

After sleeping on it I remember there is a reason we can't just implement this: CryptoRng is implemented on non-RNGs such as ChaChaXCore so that CryptoRng is implemented on the derived RNG.

So maybe the CryptoRngCore alternative is a better idea.

@newpavlov
Copy link
Member

CryptoRng is implemented on non-RNGs such as ChaChaXCore so that CryptoRng is implemented on the derived RNG.

Ah, yes! It was exactly that. I think that instead of CryptoRngCore a better alternative would be to introduce CryptoBlockRng:

trait CryptoBlockRng: BlockRngCore {}
trait CryptoRng: RngCore {}

impl<T: BlockRngCore> RngCore for BlockRng<T> { .. }
impl<T: CryptoBlockRng> CryptoRng for BlockRng<T> {}

@aticu
Copy link
Author

aticu commented Jul 11, 2021

I think that instead of CryptoRngCore a better alternative would be to introduce CryptoBlockRng:

While I do think that would be a better solution when designing a new API, it likely isn't feasible in this situation, since does have a much bigger impact on downstream users.

CryptoRng + BlockRngCore would go from semantically meaning "this is a block RNG suitable for cryptographic use" to meaning "this is an RNG suitable for cryptographic use that also happens to be a block RNG not necessarily suitable for cryptographic use".

Completely changing the meaning of traits (especially "marker" traits like CryptoRng) is a change that may even silently break things in some situations.

Seeing how CryptoRng is also used for BlockRngCore, I also think that a CryptoRngCore trait (and CryptoBlockRngCore) would be the better solution here.

@dhardy
Copy link
Member

dhardy commented Jul 11, 2021

@aticu I agree that the less breaking change is desirable. The naming of your suggestions in unfortunate though:

  • RngCore is what RNGs implement; Rng is the automatically impl'd user trait for all RngCore
  • CryptoRng is a pure marker trait; the proposed CryptoRngCore is that marker + an extension over RngCore

@aticu
Copy link
Author

aticu commented Jul 11, 2021

@dhardy Adding a new trait wouldn't be a breaking change at all.

Naming is (as always) tough to get right.
I think you parsed the name CryptoRngCore differently.
I read it as Crypto(RngCore), whereas I think you read it as (CryptoRng)Core.
I do agree that this ambiguity is a problem.

Here are some alternative naming ideas, thought I'm not a huge fan of any one of them:

  • CryptoCompatibleRngCore & CryptoCompatibleBlockRngCore
  • RngCoreWithCrypto & BlockRngCoreWithCrypto
  • RngCoreWithCryptoRng & BlockRngCoreWithCryptoRng
  • RngCryptoCore & BlockRngCryptoCore

I personally don't really care a lot what the name would be if this were added, but I still think a good name is important, so if you have any better ideas, I'd be open to them.

@burdges
Copy link
Contributor

burdges commented Jul 11, 2021

We should seriously consider deprecating CryptoRng entirely because honestly dyn RngCore vs dyn CryptoRngCore sounds like a Hobson's choice or worse. It's possible this problem simply lies out of scope for rust's type system.

Aside from the fn contract(msg: &str) and associated const tricks I mentioned upthread, we might ask @RalfJung if he foresees any interesting tricks for enforcing this sort of contract with miri runs not traits, and maybe the hacspec authors @denismerigoux and @franziskuskiefer if they see anything.

An RFC for permitting dyn RealTrait + CustomMarkerTrait may already exists too.

@newpavlov
Copy link
Member

newpavlov commented Jul 11, 2021

@aticu

CryptoRng + BlockRngCore would go from semantically meaning "this is a block RNG suitable for cryptographic use" to meaning "this is an RNG suitable for cryptographic use that also happens to be a block RNG not necessarily suitable for cryptographic use".

Completely changing the meaning of traits (especially "marker" traits like CryptoRng) is a change that may even silently break things in some situations.

I should've been more clear: the proposal was for the next breaking release of rand_core. Of course, in the described form it's a clear breaking change which should not be shipped in a patch release. Assuming it will be done in rand_core v0.7, in the worst case scenario developer of an implementation crate during migration to v0.7 may forget to change impl from CryptoRng to CryptoBlockRng for its block RNG. Users of this crate may then encounter a compilation error, which should be fixable quick enough by reporting a bug in the implementation crate. Yes, it could be somewhat annoying, but nothing close to the silent breakage mentioned in your post.

I think that the CryptoRngCore trait could be a fine stop-gap solution for rand_core v0.6. Another possible alternative is to introduce the two sub-traits without removing the existing CryptoRng:

// will be removed in rand_core v0.7
trait CryptoRng {}

trait CryptoBlockRngCore: BlockRngCore {}
trait CryptoRngCore: RngCore {}

impl<T: BlockRngCore> RngCore for BlockRng<T> { .. }
impl<T: CryptoBlockRngCore> CryptoRngCore for BlockRng<T> {}

impl<T: CryptoRng + BlockRngCore> CryptoBlockRngCore for T {}
impl<T: CryptoRng + RngCore> CryptoRngCore for T {}

// alternatively we can add impls in different direction:
impl<T: CryptoBlockRngCore> CryptoRng for T {}
impl<T: CryptoRngCore> CryptoRng for T {}

It will cause a bit of churn in downstream crates, but I think replacing impl CryptoRng + RngCore with impl CryptoRngCore.

@burdges

We should seriously consider deprecating CryptoRng entirely

As one of the RustCrypto maintainers I am strongly against that. I do not want for our users to accidentally plug clearly insecure RNGs into cryptographic code. Yes, the CryptoRng trait is not an absolute protection, but it's significantly better than nothing.

@dhardy
Copy link
Member

dhardy commented Jul 12, 2021

@burdges are you referring to Rust's (current) lack of trait-object upcast (i.e. inability to convert dyn CryptoRngCore to RngCore)? We can work around this by adding an fn as_rng_core(&mut self) -> &mut dyn RngCore method.

Overall I am in favour of @newpavlov's suggestion (CryptoBlockRng and change CryptoRng in the next breaking release).

@aticu is there any urgency to support CryptoRng trait objects within rand_core on your part? I presume not since you mentioned a work-around in downstream crates.

@burdges
Copy link
Contributor

burdges commented Jul 12, 2021

Yes fn as_rng_core(&mut self) -> &mut dyn RngCore works, but overall a naive marker trait does not fit the goals of CryptoRng terribly well. We should explore less naive alternatives, which capture more nuance, some of which I listed above.

If all you wanted was a CryptoRng-style marker then this achieves the same goals without all the crap

pub trait RngCore {
    /// Panic in test builds unless this `RngCore` is a CSPRNG
    ///
    /// Any `impl RngCore for ..` for a CSPRNG should add `fn assert_cryptorng(&self) { }` so that 
    /// this assertion always passes.  
    ///
    /// If you want a CSPRNG is desired then invoke `rng.assert_cryptorng("reasons")`, which does nothing in release builds, but non-CSPRNGs panic in tests. 
    fn assert_cryptorng(&self, msg: &str) {
        #[cfg(test)]
        panic!("{} is not a cryptographic secure PRNG!  {}", ::std::any::type_name(), msg);
    }
    fn next_u32(&mut self) -> u32;
    fn next_u64(&mut self) -> u64;
    fn fill_bytes(&mut self, dest: &mut [u8]);
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error>;
}

I suggest we do something similarly unorthodox but that captures CSPRG better than CryptoRng does.

@vks
Copy link
Collaborator

vks commented Jul 12, 2021

@burdges You are using strong words against CryptoRng, but I still don't understand your argument. What exactly is the problem with the extension to CryptoRng discussed here? As @dhardy mentioned, the lack of casting dyn CryptoRng to dyn RngCore can be worked around. As far as I can see, supporting upcasting like this in the language is desired, but not yet implemented.

The alternatives you suggested move the problem from compile time to run time. I don't understand why this would be preferable, or more nuanced.

@dhardy
Copy link
Member

dhardy commented Jul 12, 2021

To be fair to @burdges, adding a method for trait object upcast is not a pure compile time solution either.

@burdges
Copy link
Contributor

burdges commented Jul 12, 2021

I mentioned one solution that moves the problem to test time, not runtime. It avoids further trait gymnastics inside rand's already complex trait hierarchy. You need CryptoRng + Rng too, not just CryptoRng + RngCore btw.

Above I also mentioned a solution that becomes compile-time once const generics supports bounds, but currently requires test time test time asserts. I doubt up casting would land before const generics bounds.

pub trait RngCore {
    const fn is_crypto(&self) -> bool { false }     // or maybe const CRYPTO: bool = false; ??
    fn next_u32(&mut self) -> u32;
    fn next_u64(&mut self) -> u64;
    fn fill_bytes(&mut self, dest: &mut [u8]);
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error>;
}

debug_assert!(rng.is_crypto());

I've actually made no concrete proposals however. Instead I've raised four points:

  1. At first blush CryptoRngCore looks like fairly hefty churn and maybe harms usability long-term.
  2. CryptRng was never well defined and only addresses part of the problem.
  3. Afaik, trait bounds cannot express the real problem, which involves provenance.
  4. Rust's ecosystem increasingly includes people who do consider such problems more rigorously.

I've actually suggested

  1. we discuss solutions that support provenance, which yes wind up test time not compile time. Why does being compile time matter?
  2. We solicit opinions of people who build systems around rust for more rigorous code analysis.

I've not yet suggested we add some #[cfg(test)] is_crypto: bool into rand_chacha::guts::State etc. because imho doing this without discussing things with verification folk seems premature. Also we've multiple transformations folks consider, like merely being secret vs a random oracle.

@vks
Copy link
Collaborator

vks commented Jul 12, 2021

@burdges Thanks for the clarification!

@aticu
Copy link
Author

aticu commented Jul 12, 2021

@newpavlov

Here is what I meant by silent breakage:

Assume I maintain a library with the following API:

fn foo<T: BlockRngCore + CryptoRng>(rng: &mut T);

If these changes were made, my code wouldn't break, since I can still use BlockRngCore normally.
Now a user can use a BlockRngCore that is not cryptographically secure with my library, as long as it also implements RngCore and CryptoRng.
That means my library works under the wrong assumptions until someone reports it to me (I suspect users of the library would notice this, since not all types are likely to be both a BlockRngCore and an RngCore).
Not the worst kind of breakage, but the library author would not notice anything until users reported the issue.

@dhardy

There is no urgency at all.
I just thought it was an odd omission and wondered if anybody else thought this would be useful.

We can work around this by adding an fn as_rng_core(&mut self) -> &mut dyn RngCore method.

Unfortunately this method would have to be implemented by each type implementing CryptoRng.
A default implementation does not work in this case (unless there is a work around that I don't know about).

@burdges

You need CryptoRng + Rng too

CryptoRng + Rng would not be necessary, because CryptoRng + RngCore would already implement Rng through the blanket implementation for RngCore types.

CryptRng was never well defined and only addresses part of the problem.

This is true, but moving the problem to const generics bounds or any other mechanism changes nothing about that.
I don't think it was ever supposed to be well defined, but rather act as a sort of lint to catch undesirable combinations early enough.
This could of course be done through other methods, such as your assert_cryptorng method, but that would require users to know about it and to run tests, I'd argue makes it less effective.

Afaik, trait bounds cannot express the real problem, which involves provenance.

That is true, but I don't think that this problem is even solvable in a general way.
Suppose I implement a driver for a new hardware RNG, with which I then wish to seed a software RNG.
How would any automated tooling be able to differentiate between random data from a hardware RNG and predictable data from other peripheral devices?

Why does being compile time matter?

Because it is guaranteed to be enforced.
You mentioned that the Rust ecosystem increasingly includes people who do consider such problems more rigorously, but at the same time it also includes more careless users who don't run tests and don't care about these things.
Of course these are bad practices, but I think it would still be useful to support even these users in the best way possible.

@dhardy
Copy link
Member

dhardy commented Jul 12, 2021

but that would require users to know about it and to run tests

For many libraries, I imagine it would be preferable to assert_cryptorng at run-time during normal usage too than assume users will run the tests — though this is problematic in environments which must be panic-free.

we discuss solutions that support provenance, which yes wind up test time not compile time

We can certainly discuss such things, but this is significantly more complex than CryptoRng, and I imagine would always involve assumptions (e.g. that the driver for your hardware RNG doesn't have a serious flaw and that there is no exposure to a side-channel attack). I'm certainly not an expert here but I doubt such a method is sufficiently reliable to be useful, or that attaching a formal definition to the usage of CryptoRng would be useful in static analysis / verification.

It's not just provenance (by which I assume you mean the source of the seed), but also side-channel attacks and decisions about which algorithms are sufficiently robust to gain a CryptoRng impl.

@burdges
Copy link
Contributor

burdges commented Jul 12, 2021

Interesting..

We all agree that CryptoRng exists to prevent homebrew schemes using algorithms that'll never really provide good randomness. We've seen such mistake often historically, with another found just last week even https://twitter.com/matthew_d_green/status/1412412907028156420 Although not well defined, CryptoRng does this job, and that's important.

Because it is guaranteed to be enforced.

Yes I agree with this more now after some thought.. If someone miss-uses algorithms then they may not even write tests, which sounds archaic but yeah happens. I believe runtime panics in release builds goes too far here, so yeah #[cfg(test)] tricks cannot suffice if we're idiot proofing.

I spoke up forcefully about this because I've seen idiot proofing via traits go badly repeatedly. I saw dyn RngCore+CryptoRng, and immediately though "another one bites the dust", but yeah maybe CryptoRng survives somehow. We should figure out how much pain and churn CryptoRngCore actually brings..

I've seen discussion about supporting dyn Trait + CustomMarkerTrait where trait CustomMarkerTrait {} contains no methods, but maybe no RFC exists since I cannot locate the comment right now. If this happened, then CryptoRng works again.

Are there any indications about const generics bounds being a near future option? If so, then an associated const sounds kinda interesting.

pub trait RngCore {
    const CRYPTO: bool = false;
    fn next_u32(&mut self) -> u32;
    fn next_u64(&mut self) -> u64;
    fn fill_bytes(&mut self, dest: &mut [u8]);
    fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error>;
}

That is true, but I don't think that this problem is even solvable in a general way.

It's not solvable in rust's trait system. It's presumably not even solvable if fancy formal verification languages like F*. I do however think code could assist humans in explaining why what they do is secure, but increasingly I suspect this might prove orthogonal to idiot proofing.. I just dislike it being orthogonal. ;)

Alright maybe I derailed things somewhat, but at least a couple different rustc features would address this, if any would land soon enough.

@josephlr
Copy link
Member

josephlr commented Jul 13, 2021

Personally, I like the alternative solution proposed by @newpavlov in #1143 (comment). It's backwards compatible, and is the only proposed solution that allows us to support upcasting without relying on yet-to-be proposed features. For example (omitting the BlockRng stuff which is similar):

// Could optionally be removed in rand_core v0.7
trait CryptoRng {}

trait CryptoRngCore: RngCore {
  fn upcast(&mut self) -> &mut dyn RngCore;
}

impl<T: CryptoRng + RngCore> CryptoRngCore for T {
  fn upcast(&mut self) -> &mut dyn RngCore {
    self
  }
}

Then everything continues working without any of the silent breakage @aticu described. But this feature now allows users to start using &mut dyn CryptoRngCore in their API, and casting to &mut dyn RngCore if necessary, preventing @burdges's "Hobson's choice". An implementation can require a cryptographically secure rng, while still making use of helper libraries that just need a &mut dyn RngCore.

Then we could either:

  • Remove CrytoRng as a breaking change
    • This would require RNG users to switch from generic bounds like T: CryptoRng + RngCore to T: CryptoRngCore
    • This would require RNG implementers to switch from implementing just the CryptoRng marker trait to something like:
      impl CryptoRngCore for MyRng {
        fn upcast(&mut self) -> &mut dyn RngCore {
          self
        }
      }
  • Just keep the CryptoRng marker around, and have the documentation make clear which traits are the best to use.
    • This allows us to avoid churn for RNG users, and avoid having implementers need to have a dummy 3-line impl

Unfortunately, we can't make this process generic, so having:

trait Crypto<Tr>: Tr {
    fn upcast(&mut self) -> &mut dyn Tr;
}

won't work.

If so, then an associated const sounds kinda interesting.

@burdges the associated const (or const fn) makes the trait no longer object-safe (regardless of if associated generic bounds ever happen), so it's likely a non-starter.

@burdges
Copy link
Contributor

burdges commented Jul 13, 2021

Yes right now, I'd hope they become object safe in two ways eventually, by allowing a where Self: Sized bound on the associated const, which does not help us, and by being fully specified, which does help us. Anyways yes you're right this option would not work anytime soon.

@dhardy
Copy link
Member

dhardy commented Jul 13, 2021

@josephlr that's the best proposal yet, thanks.

Going back to the initial post:

Do nothing and let users implement the CryptoRngCore trait above if they need it.

This is also a viable choice IMO, and hopefully eventually Rust will support dyn (RngCore + CryptoRng). (As @burdges mentioned there is discussion on trait objects + marker trait bounds, but that doesn't mean it'll happen soon.)

@aticu
Copy link
Author

aticu commented Jul 13, 2021

So the way I see it, the CryptoRng trait is a marker trait, which changes its semantics based on which other traits are implemented.

Here is my current understanding of the semantics:

RngCore implemented? BlockRngCore implemented? Meaning of CryptoRng
No No Probably none?
Yes No RngCore impl is suitable as CSPRNG
No Yes BlockRngCore impl is suitable as CSPRNG
Yes Yes Probably both are suitable as CSPRNG?

I'd argue that these changing semantics aren't necessarily desirable, so I also think the best option would be what @newpavlov and @josephlr proposed: Deprecate CryptoRng entirely and add new traits which are subtraits of RngCore and BlockRngCore, which are empty except for a single method to allow upcasting.

@franziskuskiefer
Copy link

Aside from the fn contract(msg: &str) and associated const tricks I mentioned upthread, we might ask @RalfJung if he foresees any interesting tricks for enforcing this sort of contract with miri runs not traits, and maybe the hacspec authors @denismerigoux and @franziskuskiefer if they see anything.

I'm not sure hacspec is the right vehicle for something like this. Something along the lines of the contract crate is probably more suitable. But I'm not sure this is really solving the issue at hand.
Implementations that just plug in some RNG should rely on a marker trait like CryptoRng (even though as this discussion shows it needs improvements). Implementations that really want (need) to know what sort of randomness they get shouldn't rely on a marker trait or a contract.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants