-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do a basic sanity check for all constant values #51361
Conversation
if self.validate_ptr_target(ptr, ptr_align, field_layout).is_ok() { | ||
return Ok(()); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we make any validity assumptions for what is inside a union. But this is an open topic for discussion, and RFC-worthy. Saying "at least one field has to be valid" is probably conservative, but there should at least be a comment saying that no final decision has been made yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can turn this check into a lint that notes that we don't know if this is correct, pending an RFC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having an error is okay, as it's forwarding-compatible to allow this later (right?). But either the error or the code should say that this is still subject of discussion.
I don't understand the motivation. What is this solving? Where in http://play.rust-lang.org/?gist=99912c4c87d4d676817c8a7263289c9b&version=undefined&mode=undefined would you expect which kind of error to be raised? |
}, | ||
Scalar::Ptr(_) => { | ||
// we cannot reason about Ptr numeric ranges other than zero | ||
if bound.start_bound() == Bound::Included(&0) && bound.end_bound() == Bound::Included(&0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this says that the range has to be just (0,0) for the pointer to invalidate the range? When would such ranges even occur? I have no idea what scalar.valid_range
is for e.g. pointer types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I just did the most conservative thing i could think of. Usually you have 0..
or 1..
as the valid range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this means usually, this will never fire?
Also, raw pointers can be null. Does this check to make sure we're talking about a reference?
And finally, what about alignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this means usually, this will never fire?
Yea, unless you define a clike enum with 2^32-1 variants without it having one where the discriminant is 0
Also, raw pointers can be null. Does this check to make sure we're talking about a reference?
Oh that's not the issue, if you have a 0
you do not get into the Scalar::Ptr
arm, this is an abstract pointer. You'd be in the Scalar::Bits
arm.
Also we're not checking types here, we're checking layout. And the layout of a raw pointer allows having a 0
.
And finally, what about alignment?
That would require us to look at the type. I can add more checks, I just wanted to start with the very basic layout checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's the Scalar
miri value thing, not layout.
Yea, unless you define a clike enum with 2^32-1 variants without it having one where the discriminant is 0
Now I am confused. The only way this fires, from what I can see, is having an enum where 0
is the only variant, and then having a pointer value at that enum type.
That would require us to look at the type.
I see, makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
valid_range
is indeed 1..=0
for some full-range types.
You get 0..=0
for enum Foo { Bar = 0 }
Either way, it'd be simpler if you always created two complete RangeInclusive's rather than things like ..=hi and lo...
indeed, no generics means I can use a closure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0..=0
meaning exactly one value, 0
, makes sense, but at the same time, it means your check for pointers doesn't make sense, since they'd never have the 0..=0
range, would they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pointer check is checking the opposite. It checks if you have a type enum Foo { Bar = 0 }
but the value is a pointer. So if you transmuted a &1
to a Foo
, then this check triggers. Transmuting a &1
to an enum Foo { Bar = 42 };
is uncheckable at compile-time, because the pointer might actually point to address 42
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's confusing. Why not check that the range includes all the possible values of the pointer? Sadly, this doesn't work very well with splitting into two ranges, I don't think, so the splitting should probably only be done for Scalar::Bits
.
A Scalar::Pointer
could have any address, so assuming it's any specific one, is generally a bad idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true. I fixed this
layout::FieldPlacement::Union(0) => { | ||
match layout.abi { | ||
// nothing to do, whatever the pointer points to, it is never going to be read | ||
layout::Abi::Uninhabited => Ok(()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, shouldn't it be insta-UB to see a reference to something uninhabited?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be a *const !
? idk, I'll add some tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Behind a *const T
, nothing has to be valid, the pointer can be null or unaligned or dangling -- so none of this validation should ever run for these. Or did I misunderstand something?
trace!("variant layout: {:#?}", layout); | ||
} | ||
match layout.fields { | ||
// primitives are unions with zero fields, because that's totally obvious and documented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be worth pointing at documentation/reference? unless it's worth refactoring this weirdness out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it should check layout.abi
before looking at layout.fields
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
layout.fields
is for getting field offsets, and only Array
has any guarantees attached to it. If there are 0 fields, the choice doesn't matter and I didn't want to put another Option
around it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I documented it ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me this sounds like indeed, it should first check layout.abi
?
--> $DIR/union-const-eval-field.rs:38:5 | ||
| | ||
LL | const FIELD3: Field3 = unsafe { UNION.field3 }; //~ ERROR E0080 | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ attempted to read undefined bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, this is taking a strong stance in one particular camp of unsafe code guidelines! Do we really want this to error so early?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do 0u8 - 1
inside a constant you also get a lint about an unusable constant. You can still publish your crate with these broken constants, but actually using them anywhere will produce erorrs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what that statement has to do with my question... this is a hard error, is it not? It says ERROR in all-caps. How's this related to overflow lints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uh... indeed. I need to check that out... goot catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at the code again. Using a broken constant to obtain a runtime value will error, not lint + produce an undef. So maybe we cannot do this PR because it would be a breaking change.
Also, given that the goal seems to be to maintain some kind of invariant a la "values of type X are always satisfying property Y", I'd like to see some kind of high-level documentation what these invariants are, and where they are maintained. |
layout::FieldPlacement::Union(0) => { | ||
match layout.abi { | ||
// nothing to do, whatever the pointer points to, it is never going to be read | ||
layout::Abi::Uninhabited => Ok(()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this check is only done in this case, is that because it's the (only?) base case of recursive validate_ptr_target
?
if so is it only the inner-most types that have this property on the other cases? as in, is it valid to recurse down to here on nested types or would they themselves be Uninhabited in a way that would mean they aren't recursable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. I'll investigate. But I also do remember that there were some issues with aggregates containing uninhabited types not being uninhabited, because you could create a local variable of such an aggregate and then initialize the inhabited parts
I assumed that if you break any rules of the layouting stuff, then you are in guaranteed UB-land. They seemed like the absolute lowest bar for sanity checks.
I assumed it was bad to have constants that allow you to safely create Miri is already differing from runtime behaviour if undefined bytes are involved, because if one byte of a
The "error" is a lint, and it is raised at the definition of It is only an error if you are initialzing a All this PR does is move errors from use-sites to declaration-sites or emit lints at declaration-sites and keep producing undef at use-sites. |
What's safe about this? The code has
LLVMs semantics in this case are unclear. Also, how is this dangerous? This will just lead to more error messages, right?
The test case says "ERROR 0080: attempted to read undefined bytes", are we talking about the same thing? You didn't answer my question, though -- but I think I found part of the answer: The problem is that this reads 8 bytes, 4 of which are undef, into a My question about what problem you are even fixing here remains open. |
Agreed. But e.g. layouting stuff doesn't look behind pointer indirections (which OTOH, Also, that only answer the part of which invariant you intend to maintain, not when. I assume it's "all the time", i.e. whenever such a value is created in any possible way? |
if that were in a crate, the user of the constant doesn't know that. Now I know that with functions you have the same problem, but this is const eval, the value of a constant is part of the API.
I'm just pointing out that there's weird stuff going on, and I don't think const eval should randomly change what it's doing even in the presence of UB. Instead we should be aborting, like we already do for other kinds of UB.
The lint is deny-by-default.
Inside the function there's a constant. That constant's value is then returned.
I'm just trying to catch more bad stuff earlier.
Well... to check whether a value is a valid u64 you need to access the value, and that access is failing. It's not the actual checking code ;) We do not traverse through pointers, no, we'd need to act upon type information for that.
The layout is a scalar with
I'm fine with just checking the final initializer value of a static/constant. You're right, the current check is a little overeager and also checks the locals in const FOO: usize = {
let x = 42;
let y = x + 1;
y
}; |
You still seem to assume that I know way more than I do. :) Which part do you move to another create to make this a problematic example? Everything but
Well yes, and I am still asking why this doesn't happen already but you Remember, I am not familiar at all with how the outside world (aka rustc) interacts with miri.
Okay so the fact that this is a
Ah! That's nice. Could we have a test that transmutes
Now I feel we're getting there. To repeat my question from above, what is currently checked on the final initializer value? I stated above that I assume it must not be undef, you didn't comment on that. |
None => return err!(InvalidChar(c as u128)), | ||
self.memory.dump_alloc(ptr.alloc_id); | ||
trace!("validate_ptr_target: {:#?}", layout); | ||
if let layout::Variants::Tagged { .. } = layout.variants { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume there's an invariant that says we don't have to do while let
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly how we do it elsewhere. We'll have to change a lot of things if that invariant is invalidated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. :)
Oh, that's what this does. Could we get a rustdoc comment on |
just the constants, make them public, and then access them from the given functions
It is not, except that we know there's an undef here. It's just as if you wrote
It's solving the problem that you can accidentally do UB in constants and not notice until it becomes UB in your binary and starts to segfault.
because noone implemented this PR before ;)
No, the const fn is irrelevant indeed. I just took the original failing test and adjusted it.
transmute is not const fn, but I already added a test for union-converting 42 to a bool: https://github.com/rust-lang/rust/pull/51361/files#diff-457c99ee767b575ea31313a78d84dbceR36
yay!
Oh, so any constant has a backing |
The constants inside the functions? Okay. I'd never have guessed that.^^
👍
Ah, and the current version doesn't even complain when a And then what does trans do? Emit an LLVM undef? Or an error? And the goal is to make sure that the trans error/undef never happens (i.e., turn it into an ICE) by doing more checks when the However, calling I tried looking at master to see what happens, but right now, I cannot find a single caller for |
no, thus this PR ;)
before this PR it produced an undef, now it produces an error
great!
Just an oversight
I moved it to the
Yea, it was commented out. Must have happened during some PR/rebase/refactoring. |
--> $DIR/ub-enum-ptr.rs:23:1 | ||
| | ||
LL | const BAD_ENUM: Enum = unsafe { Foo { a: &1 }.b}; //~ ERROR this constant cannot be used | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ a raw memory access tried to access part of a pointer value as raw bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that the error we would expect? Shouldn't it be something like "out of bounds"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't report out of bounds before we know the bits, reading the bits will result in this error though ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, that error is not great though because nothing here tries to read the pointer as bytes, Also, I don't think I understand which line in validate_scalar
triggers this -- the Scalar::Ptr(_)
case doesn't do any reads.
IOW, how is this failure at all related to Enum
having 0
as its only value? If it's about the read, it'll probably happen with any enum, right? So, the comment about non-null in the test seems to have nothing to do with what the test actually tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the issue...
enum Foo {
Bar = 0,
}
has Tagged
variants, so reading the tag will bug out.
I'll do what you proposed all along and move the Abi
check up (just for Uninhabited
and Scalar
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah so it's a Scalar
but also Tagged
? Makes sense.
We'd still get that error though of it's an enum
with data. Could we get a more specific error message like "invalid: enum discriminant is a pointer" or "error determining enum discriminant" in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I can just run the scalar-checks on the discriminant before trying to read the discriminant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I thought of a simple .map_err
, but I have no idea what would work better.
Trans hasn't been touched though, so if something slips through the test it'll still emit undef. Can we safeguard against that? Or is undef actually legitimate e.g. if the constant is a struct with padding? Is it a problem that
Great, thanks. I'm not necessarily opposed to checking these things earlier, I just wanted to understand what is being checked and, more important, where -- How far does layout validation go? E.g., if the type is |
Yes undef bytes are legitimate. We could probably guard against all bits being undef and just bail out for that, too, but I'm not sure if that code wouldn't be dead anyway with the above checks in place
I'll try to cook up something sensible. I didn't want to go overboard with this PR.
Oh yea :D we read the discriminant with the regular miri-discriminant reader, which totally produces an error if that discriminant does not exist. If it does exist, then we change the layout to the variant that the discriminant selects and check whether that layout fits the bill. |
Fair enough.
Awesome! |
This comment has been minimized.
This comment has been minimized.
4d647cb
to
61c4741
Compare
@@ -1406,18 +1407,17 @@ impl<'a, 'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> EvalContext<'a, 'mir, 'tcx, M | |||
|
|||
/// This function checks the memory where `ptr` points to. | |||
/// It will error if the bits at the destination do not match the ones described by the layout. | |||
/// | |||
/// Note: this function does not transitively dereference other pointers. All that is checked | |||
/// for pointers is whether they are not allowed to be zero, but are. | |||
pub fn validate_ptr_target( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it still doesn't transitively do anything itself, it just fills seen
and todo
accordingly, right? The docs should say that.
// this would be nice-to-have, but we are inside the query that | ||
// evaluates said static, and const eval will try to run this query again | ||
// if we do this kind of validation | ||
// We could run the checks as an additional lint after evaluation is done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I'm thinking we should totally do this generally. If we accidentally make any checks too strict, users can opt out of them for specific constants until we fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "generally"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding new checks to const eval as lints instead of breaking the evaluation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deny-by-default lint? Yeah I guess that makes sense. We can continue compilation, after all. It just probably doesn't do what people want (as in, it raises UB...).
I see you now added some kind of logic to validate pointers transitively. What I am surprised by is that this needed some new kind of traversal logic; shouldn't similar logic exist already to "translate" CTFE results (transitively through pointers) into things that will later be passed to LLVM? |
🔒 Merge conflict This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again. How do I rebase?Assuming
You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial. Please avoid the "Resolve conflicts" button on GitHub. It uses Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Error message
|
34a85dd
to
86e59cc
Compare
Rebased. @bors r=nikomatsakis |
📌 Commit 86e59cc has been approved by |
Do a basic sanity check for all constant values ## Motivation and high level overview There has been some back and forth in this PR between @RalfJung and me in here about the motivation for this change and the stance it takes on unsafe coding guidelines. The initial implementation ran its checks on every value read (so `*x`, `y = x`, ...). In unsafe code that isn't reasonable, because we might be invalidating invariants for a short time in order to build up a proper value. The current implementation is a lint that runs its checks statics and constants. There is no need to check array lengths and enum variants, because it's a hard error to end up with anything but a number, and that one even has to have the required bits to be defined. ## What checks are done? * Some type related checks * `char` needs to be a correct unicode character as defined by `char::from_u32` * A reference to a ZST must have the correct alignment (and be nonzero) * A reference to anything is dereferenced and its value is checked * Layout checks use the information from `ty::Layout` to check * all fields of structs * all elements of arrays * enum discriminants * the fields of an enum variant (the variant is decided by the discriminant) * whether any union field succeeds in being checked (if none match the memory pattern, the check fails) * if the value is in the range described by the layout (e.g. for `NonZero*` types) Changing the layout of a type will thus automatically cause the checks to check for the new layout. fixes #51330 fixes #51471 cc @RalfJung r? @eddyb
☀️ Test successful - status-appveyor, status-travis |
Tested on commit rust-lang/rust@70cac59. Direct link to PR: <rust-lang/rust#51361> 🎉 miri on windows: build-fail → test-pass. 🎉 miri on linux: build-fail → test-pass.
drive by comment, this is a dramatic performance regression. perf |
Ok, sorry for the spam. I looked thru the end of this thread and did not see a comment or link. I am glad that it has been noticed. |
This breaks the unions with generic type that uses dropping: pub const fn transmute<S>(x: S) -> [usize; 64] {
#[allow(unions_with_drop_fields)]
union U<T> {
a: T,
b: [usize; 64]
}
unsafe { U { a: x }.b }
} |
I'm not sure I count this as breaking. Using your function is UB at runtime. It also is unrelated to drop. the issue is that when you use it for a type smaller than 64 usizes, you get usizes which are undefined, and reading those is UB, and if your type is bigger, you are throwing away part of it. Can you provide your use case so we can find the solution to your problem without invoking UB? |
Agreed -- erroring here is "working as intended", to avoid undefs floating around. |
Motivation and high level overview
There has been some back and forth in this PR between @RalfJung and me in here about the motivation for this change and the stance it takes on unsafe coding guidelines.
The initial implementation ran its checks on every value read (so
*x
,y = x
, ...). In unsafe code that isn't reasonable, because we might be invalidating invariants for a short time in order to build up a proper value.The current implementation is a lint that runs its checks statics and constants. There is no need to check array lengths and enum variants, because it's a hard error to end up with anything but a number, and that one even has to have the required bits to be defined.
What checks are done?
char
needs to be a correct unicode character as defined bychar::from_u32
ty::Layout
to checkNonZero*
types)Changing the layout of a type will thus automatically cause the checks to check for the new layout.
fixes #51330
fixes #51471
cc @RalfJung
r? @eddyb