Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pattern-matching related soundness concerns #42

Closed
3 tasks done
RalfJung opened this issue Apr 24, 2020 · 15 comments
Closed
3 tasks done

Pattern-matching related soundness concerns #42

RalfJung opened this issue Apr 24, 2020 · 15 comments

Comments

@RalfJung
Copy link
Member

RalfJung commented Apr 24, 2020

Recently it was pointed out to me that there is an entire category of const-related soundness concerns that I was unaware of, and that is not mentioned with a single word in this repository: pattern matching. I think for the most part, const soundness and "const FOO = expr is equivalent to inlining expr" suffice to ensure pattern-matching-soundness, but there is one exception: const-soundness as described and discussed so far stops at references, but when pattern-matching on consts, the content behind the reference can matter.

For example, the following is completely fine as far as "inlining-soundness" is concerned, but it is pattern-unsound (and let's ignore aliasing issues for now):

static mut BYTES: [u8; 16] = *b"1234567890abcdef";
const BYTES_REF: &[u8] = unsafe { &BYTES };

fn main() { unsafe {
  BYTES[0] = 2;
  assert!(matches!(b"1234567890abcdef", BYTES_REF));
} }

Inlining semantics are entirely preserved when a const just points to mutable data, but the same becomes a big problem with pattern matching.

Some things I think we should do:

  • Have a document describing pattern-matching-soundness concerns in this repository (Cc add document on consts in patterns #48). For this, determine how const-patterns are compiled and which of they participate in exhaustiveness checking.
  • Improve our dynamic checks for pattern-matching-soundness (aka "consts are immutable") to avoid relying on a global invariant.
  • Maybe restrict patterns to types with a good notion of equality? That would exclude function pointers and raw pointers.
@RalfJung
Copy link
Member Author

Get an exhaustive list of the kind of consts involving references that we can pattern-match on. Is it just &str and &[u8] (and we could get away with special hacks targeting them) or is there more?

@oli-obk @eddyb would be nice if you could fill me in here. :)

@oli-obk
Copy link
Contributor

oli-obk commented Apr 24, 2020

We can pattern match on all kind of references. It's just that right now only &str and &[u8] (and maybe &[u8; N]) are actually used for exhaustiveness. This means that at runtime, we fall back to == checking, treating the whole constant as opaque input, but you can still pattern match on a constant of type &&&Foo for any Foo that you can normally pattern match on.

@RalfJung
Copy link
Member Author

RalfJung commented Apr 28, 2020

Okay, wow, that's a big one. I am so glad we caught this before landing rust-lang/rust#71332. ^^

It's just that right now only &str and &[u8] (and maybe &[u8; N]) are actually used for exhaustiveness.

How can &str and &[u8] ever be exhaustive without a catch-all clause (_ pattern)?

@RalfJung
Copy link
Member Author

Ah, this compiles:

#![feature(exclusive_range_pattern)]
#![feature(half_open_range_patterns)]
const PAT: &[u8] = &[0];

pub fn test(x: &[u8]) -> bool {
    match x {
        PAT => true,
        &[] => false,
        &[1..] => false,
        &[_, _, ..] => false
    }
}

@comex
Copy link

comex commented Apr 28, 2020

Hmm... As far as I can tell, the only way to mutate data within a constant without aliasing-UB is with interior mutability. But that requires UnsafeCell, and you can't pattern match on UnsafeCell from outside libcore. So maybe this is sound?

@joshtriplett
Copy link
Member

I feel like I'm missing some aspect of const semantics, but this seems like something that we should be able to catch at compile time by noticing that we've taken a reference to the mutable value, so it shouldn't be directly accessible for mutation anymore because that reference has it borrowed.

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 30, 2020
…=oli-obk

Miri: better document and fix dynamic const pattern soundness checks

rust-lang/const-eval#42 got me thinking about soundness for consts being used in patterns, and I found a hole in our existing dynamic checks: a const referring to a mutable static *in a different crate* was not caught. This PR fixes that. It also adds some comments that explain which invariants are crucial for soundness of const-patterns.

Curiously, trying to weaponize this soundness hole failed: pattern matching compilation ICEd when encountering the cross-crate static, saying "expected allocation ID alloc0 to point to memory". I don't know why that would happen, statics *should* be entirely normal memory for pattern matching to access.

r? @oli-obk
Cc @rust-lang/wg-const-eval
@rpjohnst
Copy link

@joshtriplett The reference in the const does not exist "ambiently" the way the static does. It only exists at the points where the const is used- in one sense it is re-created each time it is evaluated. If consts were actually evaluated at runtime, this would be fine, but they are instead evaluated at compile time.

This makes the rules a bit more complex- their goal is to ensure that the result of evaluating the constant would be identical every time even if it happened at runtime. So references to mutable values need to be prevented in the definitions of consts themselves, rather than the more usual borrowck behavior of preventing mutations while those references are live.

@RalfJung
Copy link
Member Author

rust-lang/rust#71655 landed and documents the invariants a bit better that make sure this is sound. I think I finally understood why @oli-obk kept insisting on all these extra checks against consts pointing to statics.

So I think right now, everything is sound. However, it relies on a rather subtle global invariant that no pointer to a static (including pointers to "things inside statics") ever leaks to a non-static. I'd be more happy if we could check consts-dont-point-to-mutable more directly than via an invariant that encompasses all of tcx.alloc_map.

And the other problem is that we have a fundamental conflict here: consts pointing to interior mutable statics would, in principle, pose no problems for const-soundness and aliasing. However, that conflicts with using consts in patterns. It seems like we can have one or the other feature with consts, but not both -- and, worse, we already decided in favor of patterns, without (as far as I can tell) being aware that this closes the door on other const possibilities.

@comex
Copy link

comex commented May 2, 2020

Okay, after being confused for a while, I think I understand the situation better.

The example in the first post is not UB due to aliasing, assuming the const is inlined at the point of use, since there are no conflicting references to BYTES at the point BYTES_REF is used.

Also, contrary to my last comment, you can get the same effect with interior mutability:

use std::cell::UnsafeCell;
static BYTES: UnsafeCell<[u8; 16]> = UnsafeCell::new(*b"1234567890abcdef");
const BYTES_REF: &[u8] = unsafe { &*BYTES.get() };

fn main() { unsafe {
  (*BYTES.get())[0] = 2;
  assert!(matches!(b"1234567890abcdef" as &[u8], BYTES_REF));
} }

From a user's perspective, it seems like the simplest mental model is: the pointer value of BYTES_REF is a constant expression, but the result of dereferencing it is not a constant expression. Therefore, either you can't match against it, or such a match would never be treated as exhaustive. Similarly, you wouldn't be able to dereference it within const-eval. But you could still create it.

Apparently things get more complex with const generics because they currently treat all references that point to the same data as equivalent. I will post in that thread as well.

@RalfJung
Copy link
Member Author

RalfJung commented May 2, 2020

The example in the first post is not UB due to aliasing, assuming the const is inlined at the point of use, since there are no conflicting references to BYTES at the point BYTES_REF is used.

Well... I'd just say the aliasing model doesn't really comment on this example yet.

But as you said, the same effect can be achieved with interior mutabiltiy -- static mut just makes it a bit easier. That's why I said we can ignore the aliasing rules.

From a user's perspective, it seems like the simplest mental model is: the pointer value of BYTES_REF is a constant expression, but the result of dereferencing it is not a constant expression.

That's a good way to put it.

Therefore, either you can't match against it, or such a match would never be treated as exhaustive.

Probably we just shouldn't permit to match on it (assuming we ever allow such constants at all), everything else seems really surprising.

@RalfJung
Copy link
Member Author

I opened rust-lang/rust#74446 to better figure out our story around which types are allowed for consts in patterns. From what I understand, the soundness part of this is mostly about which of those take part in exhaustiveness checking.

@oli-obk is this still true?

It's just that right now only &str and &[u8] (and maybe &[u8; N]) are actually used for exhaustiveness.

Also, are we all in agreement that for no we should not expand the set of types that are considered for exhaustiveness, to avoid introducing further problems, until we figured out the whole match story a bit better?

Though given that we have some cases where exhaustiveness is based on const values, maybe it doesn't make much of a difference to add more of those cases...

@oli-obk
Copy link
Contributor

oli-obk commented Jul 24, 2020

@oli-obk is this still true?

My statement was meant for constants of reference type. All other constants have been doing exhaustive checking since quite some time.

@RalfJung
Copy link
Member Author

Quoting you from another issue:

integers do exhaustiveness checks. so do structs without private fields and enums.

@RalfJung
Copy link
Member Author

I think things are a lot better here now than they used to be, due to valtrees. But I don't know the exact rules according to which we currently determine exhaustiveness, so let's keep this issue open until we got those documented and sanity-checked.

@RalfJung
Copy link
Member Author

All computing of information for the exhaustiveness check goes through valtrees, and all reads that are performed during valtree construction go through the const-eval machine configured such that reads from anything mutable are rejected. So at runtime we'll never see values that are different from what exhaustiveness checking saw. Also I think we don't even access that memory, we are making a copy for our valtree anyway.

So we can close this issue. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants