Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic-on-uninit: adjust checks to 0x01-filling #101061

Merged
merged 1 commit into from
Oct 5, 2022

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Aug 26, 2022

Now that mem::uninitiailized actually fills memory with 0x01 (#99182), we can make it panic in a few less cases without risking a lot more UB -- which hopefully slightly improves compatibility with some old code, and which might increase the chance that we can check inside arrays in the future.

We detect almost all of these with our lint, so authors of such code should still be warned -- but if this happens deep inside a dependency, the panic can be quite interruptive, so it might be better not to do it when there is no risk of LLVM UB. Therefore, adjust the might_permit_raw_init logic to care primarily about LLVM UB. To my knowledge, it actually covers all cases of LLVM UB now.

Fixes #66151

Cc @5225225

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Aug 26, 2022
@rust-highfive
Copy link
Collaborator

r? @petrochenkov

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 26, 2022

// Some things that should be caught, but currently are not.
let _val: OneFruitNonzero = mem::zeroed();
let _val: OneFruitNonzero = mem::uninitialized();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to improve the lint to detect these as well... @5225225 any good ideas for how to do that? I think for non-is_multi_variant enums, we probably can compute the variant that does exist, and then we can recursively check if that one is valid? We'd also need a smarter is_multi_variant that can handle some dataful variants (ensuring they are definitely inhabited).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look at making the lint better a while back, and didn't get very far (though, granted, I didn't look too hard). I'll have a look again tomorrow, we definitely should be able to detect this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this into a separate issue: #102043

@5225225
Copy link
Contributor

5225225 commented Aug 26, 2022

I wonder how many crates would be "saved" by this more forgiving form of the check.

As far as I know, the last time we made the checks here stricter was in 2020, with the aggregate checks (#71274). So someone would have needed to have not recompiled with an up to date rustc in 2 years, and the code needs to not be doing any other crimes.

Which, yeah, can definitely happen, but I'm unsure if the panics in their current form are causing enough problems to warrant turning them off. Though yes, the 0x01 filling means that let x: bool = mem::uninit() is not UB by what we give LLVM, but it is UB by the text of the implementation (though then again, so is basically every call to mem::uninit() :) )

I don't think decreasing the amount of panics here helps us check inside arrays in the future, since we can always use less-strict rules inside arrays (As I did in the PR). Like, "treat values inside arrays as if they are 0x01 filled".

I agree that it's unlikely that LLVM will find more ways to Do A Bad if we panic less here, but I'd not bet on it, and a panic is a lot better of an experience for the user than an SIGILL or something stranger.

In the end, I'm somewhat against going against our ratcheting up of the checks, especially since it's been so long since we last made them stronger. Even if, yes, it would be more inline of what we tell LLVM is UB.

@petrochenkov
Copy link
Contributor

r? @scottmcm (who approved #99182)

@scottmcm scottmcm added the I-libs-nominated Nominated for discussion during a libs team meeting. label Aug 26, 2022
@scottmcm
Copy link
Member

Hello libs folks! I'm nominating this for input on whether you think removing panics here is the way to go.

I agree with Ralf that we certainly could, since these aren't actually UB as currently implemented, but they're still unsound as documented, so I'm not certain what the best thing to do is.

Do we have any idea whether this would unbreak some projects in practice?

@scottmcm scottmcm added the S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). label Aug 26, 2022
@RalfJung
Copy link
Member Author

RalfJung commented Aug 27, 2022

I don't think decreasing the amount of panics here helps us check inside arrays in the future, since we can always use less-strict rules inside arrays (As I did in the PR). Like, "treat values inside arrays as if they are 0x01 filled".

"the PR" = #99389?

Yeah we can, though I feel like that leads to very ad-hoc rules that are hard to explain or predict. Panics in mem::uninitiailized are a last resort, e.g. they sometimes lead to segfaults because of code that -- quite reasonably -- assumes that mem::uninitiailized will never unwind. So I think it makes sense to avoid panicking when we don't have to, in particular given the -Z flag you added which still lets people detect these bad mem::uninitiailized if they need to.

Hyper should work with this PR even if we recurse below arrays. I'm thinking that's material for a follow-up PR though.

@RalfJung
Copy link
Member Author

I agree with Ralf that we certainly could, since these aren't actually UB as currently implemented

Note that this is only almost true. This PR stops panicking on &u8, for example, but 0x0101010101010101 is not dereferenceable for 1 byte, so this causes LLVM UB due to the dereferenceable(1) that we will add.

But I have never heard of that UB actually leading to issue for variables which are truly unused -- dereferenceable(1) is useful for LLVM to speculatively move accesses out of loops or conditionals, but if there are no accesses, it is unlikely to lead to speculation.

@5225225
Copy link
Contributor

5225225 commented Aug 27, 2022

Yeah we can, though I feel like that leads to very ad-hoc rules that are hard to explain or predict.

True, but we're not explaining these rules. I don't think teaching the rules here are very important, since no new code should ever run into this, we're making decisions on what to panic on purely based on what existing code happens to do.

We don't even mention these panics, let alone mention when they happen. We hint at the 0x01 filling by saying

It also might be slower than using MaybeUninit due to mitigations that were put in place to limit the potential harm caused by incorrect use of this function in legacy code

but we don't mention the panics.


Panics in mem::uninitiailized are a last resort, e.g. they sometimes lead to segfaults because of code that -- quite reasonably -- assumes that mem::uninitiailized will never unwind.

Yeah, that is an issue. It would be good if we could print a backtrace and immediately abort.

Wait, we can do that, even in core, can't we?

struct DropBomb;

impl Drop for DropBomb {
    fn drop(&mut self) {
        panic!("double panicking to abort");
    }
}

fn panic_abort() {
    let _guard = DropBomb;
    panic!("attempted to leave ! uninitialized")
}

This forces a stack backtrace and will abort, and presumably not do any more unwinding. So you'd make a guard at the top of uninitialized, then std::mem::forget it at the end.

(It doesn't skip executing any panic hooks, but I don't think the existing panic hooks are likely to cause problems here, and users likely don't add their own very often).

That would at least fix the "i panicked in a place they weren't expecting, and now i deadlocked the library", which was an issue in #100423 (comment) .

As a downside, it makes testing when std::mem::uninitialized panics an utter pain, since you can't catch aborts. We'd probably end up needing to just use the assert_uninit_valid/assert_zero_valid intrinsics directly, which isn't awful.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 27, 2022

For me it's a question of, what is the goal of these panics? They started out as being a way to tell people about Rust UB in their (or their dependency's code). But my impression is the goal has shifted -- your -Z flag now covers that goal (together with the lint, and Miri), and the goal of the "default" panics is merely to prevent LLVM UB. They are part of a mitigation strategy that also includes #99182. So the behavior of these panics should be re-evaluated under their new goal. And under this goal, there just is no good reason to panic on mem::uninitialized::<bool>() -- those panics don't mitigate any LLVM UB, they just gratuitously break code that used to work fine, code that was following the official recommendations at its time, but those recommendations turned out to be wrong. Therefore we should get rid of the panics and make that code work again. (Even if it is unlikely that there still much code out there that does this -- there's always the chance of someone coming back to their code 3 years later.)

That trick for the abort-with-backtrace sounds neat though, it is worth exploring!

@5225225
Copy link
Contributor

5225225 commented Aug 27, 2022

Hm... Yes, I suppose if we aim to just remove LLVM UB, then the weaker checks here are fine. We don't have to do any checks or even do the filling, but it's a polite thing to do, especially since we previously were wrong about the safety preconditions.

After all, the job of pushing mem::uninit out of dependency trees isn't really the job of the panics, it's more so the job of a FCW... once we get that.

@Noratrieb
Copy link
Member

In #100423, I want to use these checks to panic on invalid uses of MaybeUninit, which does not do 0x01 filling. How do we keep them for this case but not for mem::uninit?

Also, overall I don't think that it's a good idea to make them less strict. Just because it doesn't have LLVM UB now doesn't mean that it's ok to call mem::uninit. I think the panics are still good for guiding people away from that function and towards proper MaybeUninit usage.

@5225225
Copy link
Contributor

5225225 commented Aug 27, 2022

How do we keep them for this case but not for mem::uninit?

One thought would be to introduce new intrinsics that use the current const eval checks. (i.e. what you get by using -Zstrict-init-checks).

Those checks will panic if creating an uninit/zeroed const compile errors, so it's the strictest form of the checks we can do.

The justification there being "Look, this is obviously not old code, because you're using a post-1.36 API, and you're still doing it wrong", so the panic there is more justifiable. Though I'd be interested to see just how many people just use MaybeUninit::uninit().assume_init() in a UB way. I'd presume less than use mem::uninitialized, but it still definitely happens.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 27, 2022

In #100423, I want to use these checks to panic on invalid uses of MaybeUninit, which does not do 0x01 filling. How do we keep them for this case but not for mem::uninit?

I think a lint is a better tool to help people find and fix such code, than a run-time panic implemented via a MIR transform.

But if we do truly want to have such a pass, then I agree with @5225225 that it should just use a different intrinsic (or the intrinsic should have a flag).

@RalfJung
Copy link
Member Author

RalfJung commented Aug 27, 2022

Just because it doesn't have LLVM UB now doesn't mean that it's ok to call mem::uninit.

That is not what this PR is saying. It's definitely not okay to call that function. Please don't use strawman arguments.

But given that we (as in, the rustc team at large) used to explicitly recommend calling that function in ways that we now agree are clearly wrong, we have to tread very careful here, and not punish programmers for our mistakes. That's why I think we should focus on mitigation. And from a mitigation perspective, these panics are unnecessarily disruptive.

(It's different for assume_init which was always documented appropriately.)

@CAD97
Copy link
Contributor

CAD97 commented Aug 28, 2022

That trick for the abort-with-backtrace

This is fairly standard, but core now directly has the functionality for triggering a panic which aborts instead of unwinds even when panic=unwind (used for e.g. #[rustc_allocator_nounwind]), it's just not exposed quite yet (#99032 contains the glue to make it usable in core among other things).

Imho we should prefer using this functionality to using double panics where possible in std crates.

@bors
Copy link
Contributor

bors commented Aug 30, 2022

☔ The latest upstream changes (presumably #98919) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@cuviper cuviper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It worries me to relax this when there are still so many gaps in might_permit_raw_init.

@@ -1495,9 +1495,11 @@ impl<'a, Ty> TyAndLayout<'a, Ty> {
///
/// This code is intentionally conservative, and will not detect
/// * zero init of an enum whose 0 variant does not allow zero initialization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, how about an uninit enum which does have a 0x1-filled discriminant (or a discontiguous range spanning that), and then said variant (if any) may not allow 0x1-filling.

I also wonder how niches are handled with either 0 or 0x1-filling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are aggregate types for LLVM, I don't think we ever trigger LLVM UB in any of these cases. We do check that 0x01-filling is a valid discriminant (based on the validity range of the discriminant field), and we don't tell LLVM about the variants until one is actually accessed.

@@ -1495,9 +1495,11 @@ impl<'a, Ty> TyAndLayout<'a, Ty> {
///
/// This code is intentionally conservative, and will not detect
/// * zero init of an enum whose 0 variant does not allow zero initialization
/// * making uninitialized types who have a full valid range (ints, floats, raw pointers)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we note references here? 0x1-filling is non-null, but still not dereferenceable (as mentioned) and also not aligned.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I forgot about alignment... it would be handled seamlessly here if we used 0x00..align as the niche, but right now we only have 0x00..0x01 so we'd not complain about e.g. &i32.

IMO the best way to fix that is to make the niche bigger...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me why the niche matters, when a 0x1-filled reference means it has pointer 0x0101_0101_0101_0101.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to extend the niche of references from the current 0x00..0x01 to 0x00..align in the future, which would produce invalid values even with 0x01 filling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might_permit_raw_init uses the same information that is used for niches to determine whether a given value is valid for a given type.

Currently, that information says that 0x0101_0101_0101_0101 is valid for &i32, which leads to might_permit_raw_init allowing &i32 to be left uninit with 0x01-filling.

If we extend the niche to take into account alignment, that information will say that 0x0101_0101_0101_0101 is not valid for &i32, and hence trying to create a 0x01-filled &i32 would be rejected (as it should).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but those extended niche values are still far from the filled value we're talking about, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we had a way to represent a niche that is "all the unaligned values", then that would be sufficient for this check -- 0x0101_0101_0101_0101 would be in the niche and hence disallowed for &i32.

But indeed we cannot even represent such niches, so contrary to what I said earlier we cannot rely on niches to detect the trouble around references.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, my last reply was having only seen what @Nilstrieb wrote - I guess GitHub failed to dynamically load @RalfJung's replies - but I think we're on the same page now.

So, is this something we should still worry about for LLVM UB in this context? We do tell LLVM about align, and their docs say, "If the pointer value does not have the specified alignment, poison value is returned or passed instead." But AIUI that's not UB until it's used in a particular way, so is it still a problem here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either @Nilstrieb or @5225225 already had a version of might_permit_raw_init that takes into account reference type information somewhere.

InitKind::Uninit => {
if let ty::Ref(_, inner, _) = this.ty.kind() {
if let ty::Slice(slice_inner) = inner.kind() {
let penv = ty::ParamEnv::reveal_all().and(*slice_inner);
if let Ok(l) = cx.tcx().layout_of(penv) {
return l.layout.align().abi == Align::ONE;
}
}
if ty::Str == *inner.kind() {
return true;
}
}
}

This? It's in my #99389 PR where we allowed 1-aligned references with no dereferencable (inside arrays only)

Copy link
Member Author

@RalfJung RalfJung Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cuviper poison value basically means uninit memory, which the 0x01-filling is intended to avoid, so I am leaning towards saying that this is LLVM UB we should avoid. (In particular, align noundef would make it insta-UB -- and that is the combination of attributes we want.)

@5225225 yes, that one, thanks!

@RalfJung
Copy link
Member Author

RalfJung commented Sep 8, 2022

It worries me to relax this when there are still so many gaps in might_permit_raw_init.

I don't follow -- we also want to strengthen it, sure, but we surely shouldn't do both strengthening and weakening in the same PR?

@cuviper
Copy link
Member

cuviper commented Sep 8, 2022

It worries me to relax this when there are still so many gaps in might_permit_raw_init.

I don't follow -- we also want to strengthen it, sure, but we surely shouldn't do both strengthening and weakening in the same PR?

I don't know, it's a weird situation. If we had only false positives or false negatives, then we could strictly improve, but right now we have both, so changes may end up in a see-saw. A given type can have fields that were rejected before and are now accepted by this PR, while also having fields that should be rejected but aren't because we don't check everything.

@craterbot
Copy link
Collaborator

🎉 Experiment pr-101061-1 is completed!
📊 25 regressed and 12 fixed (130 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Sep 22, 2022
@craterbot
Copy link
Collaborator

👌 Experiment pr-101061-2 created and queued.
🤖 Automatically detected try build 9ce6b522e9d830b1461df3e1b001b5831bd32346
⚠️ Try build based on commit 0a539cabce7e4fa09c1b403318abb9e13725e72b, but latest commit is 136ea3e57f7aed53b894436f6b0c301335987026. Did you forget to make a new try build?
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 22, 2022
@craterbot
Copy link
Collaborator

🚧 Experiment pr-101061-2 is now running

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot
Copy link
Collaborator

🎉 Experiment pr-101061-2 is completed!
📊 10 regressed and 1 fixed (25 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Sep 22, 2022
@RalfJung
Copy link
Member Author

The rustc segfault remains but the other two are gone.

@RalfJung
Copy link
Member Author

@scottmcm this is ready for review, I think.

@saethlin
Copy link
Member

The rustc segfault remains

Looks to me like this is stack exhaustion due to recursion in a proc macro. I can repro locally, but it's not entirely consistent.

@RalfJung
Copy link
Member Author

Okay so unlikely to be caused by this PR, and probably just a coincidence to be called out as regression.

@bors

This comment was marked as resolved.

@oli-obk
Copy link
Contributor

oli-obk commented Oct 5, 2022

Crater looks good, T-libs removed their nomination, impl lgtm

@bors r+

@bors
Copy link
Contributor

bors commented Oct 5, 2022

📌 Commit a0131f0 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 5, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 5, 2022
Rollup of 5 pull requests

Successful merges:

 - rust-lang#100986 (Stop suggesting adding generic args for turbofish)
 - rust-lang#101061 (panic-on-uninit: adjust checks to 0x01-filling)
 - rust-lang#102440 (Only export `__tls_*` on wasm32-unknown-unknown.)
 - rust-lang#102496 (Suggest `.into()` when all other coercion suggestions fail)
 - rust-lang#102699 (Fix hamburger button color)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit ab88c19 into rust-lang:master Oct 5, 2022
@rustbot rustbot added this to the 1.66.0 milestone Oct 5, 2022
@RalfJung RalfJung deleted the panic-on-uninit branch October 7, 2022 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tracking issue for panics in mem::uninitialized/zeroed