Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

box syntax creates Box<T> where T is not yet initialized #97270

Open
matthiaskrgr opened this issue May 22, 2022 · 16 comments
Open

box syntax creates Box<T> where T is not yet initialized #97270

matthiaskrgr opened this issue May 22, 2022 · 16 comments
Labels
A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html A-miri Area: The miri tool C-bug Category: This is a bug.

Comments

@matthiaskrgr
Copy link
Member

I tried this code:

#![feature(box_syntax)]

fn main() {
    let _x = box return;
}

miri reports UB on this code:

rning: unreachable expression
 --> src/main.rs:4:14
  |
4 |     let _x = box return;
  |              ^^^^------
  |              |   |
  |              |   any code following this expression is unreachable
  |              unreachable expression
  |
  = note: `#[warn(unreachable_code)]` on by default

error: Undefined Behavior: type validation failed: encountered a box pointing to uninhabited type !
 --> src/main.rs:4:14
  |
4 |     let _x = box return;
  |              ^^^^^^^^^^ type validation failed: encountered a box pointing to uninhabited type !
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information

  = note: inside `main` at src/main.rs:4:14

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error; 1 warning emitted

Meta

rustc --version --verbose:

rustc 1.63.0-nightly (9257f5aad 2022-05-21)
binary: rustc
commit-hash: 9257f5aad02b65665a6e23e5b92938548302e129
commit-date: 2022-05-21
host: x86_64-unknown-linux-gnu
release: 1.63.0-nightly
LLVM version: 14.0.4
Backtrace

error: Undefined Behavior: type validation failed: encountered a box pointing to uninhabited type !
 --> src/main.rs:4:14
  |
4 |     let _x = box return;
  |              ^^^^^^^^^^ type validation failed: encountered a box pointing to uninhabited type !
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information

  = note: inside `main` at src/main.rs:4:14
  = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:248:5
  = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:122:18
  = note: inside closure at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:145:18
  = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:280:13
  = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:492:40
  = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:456:19
  = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:137:14
  = note: inside closure at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:128:48
  = note: inside `std::panicking::r#try::do_call::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:492:40
  = note: inside `std::panicking::r#try::<isize, [closure@std::rt::lang_start_internal::{closure#2}]>` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:456:19
  = note: inside `std::panic::catch_unwind::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:137:14
  = note: inside `std::rt::lang_start_internal` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:128:20
  = note: inside `std::rt::lang_start::<()>` at /home/matthias/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:144:17

error: aborting due to previous error; 1 warning emitted

/tmp/f on  ma

@matthiaskrgr matthiaskrgr added C-bug Category: This is a bug. A-miri Area: The miri tool labels May 22, 2022
@RalfJung
Copy link
Member

RalfJung commented Jun 3, 2022

Ah fun times, we are actually generating a Box<!> here?

Maybe Box should just be less special in terms of validity etc than it is now...

@RalfJung
Copy link
Member

This is the same problem that also led me to exclude Box in #128531: box foo compiles to

  • Create an uninitialized box
  • Then initialize it

However, the uninitialized box already has type Box<T>, and that's fundamentally incompatible with any kind of recursive validity checking.

@RalfJung RalfJung changed the title miri: box_syntax: type validation failed: encountered a box pointing to uninhabited type ! box syntax creates Box<T> where T is not yet initialized Aug 18, 2024
@RalfJung RalfJung added the A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html label Aug 18, 2024
@RalfJung
Copy link
Member

@rust-lang/wg-mir-opt (is that the right ping group for MIR building questions?)

Currently, this

#![feature(rustc_attrs)]

pub fn new<T>(x: T) -> Box<T> {
    #[rustc_box]
    Box::new(x)
}

creates MIR roughly like this

fn new(_1: T) -> Box<T> {
  let mut _0: std::boxed::Box<T>;
  let mut _4: *mut u8;

  _4 = alloc::alloc::exchange_malloc(...)
  _0 = ShallowInitBox(move _4, T);
  _5 = (((_0.0: std::ptr::Unique<T>).0: std::ptr::NonNull<T>).0: *const T);
  (*_5) = move _1;
  return;
}

The problem with this is that we call ShallowInitBox before the box has been initialized. Unsurprisingly, if we have Miri ensure that Box<T> actually refers to an initialized T, this explodes.

Is it possible for MIR generation to only call ShallowInitBox when the box contents are actually initialized, or alternatively for ShallowInitBox to not "lie" in its return type where it currently claims that we have a Box<T>?

This has probably been a problem since #89030, maybe longer than that.
Cc @nbdd0121

@nbdd0121
Copy link
Contributor

This issue is certainly not introduced by ShallowInitBox. Before ShallowInitBox, box creation happens by a NullaryOp which just gives an uninitialized box, and ShallowInitBox changes the process to be alloc and then turn the pointer to a shallow inited box.

We marked the rvalue returned by ShallowInitBox as shallowly initialized so the box is valid but not its contents. This mechanism is introduced in #43772

@RalfJung
Copy link
Member

We marked the rvalue returned by ShallowInitBox as shallowly initialized so the box is valid but not its contents. This mechanism is introduced in #43772

That's an initialization tracker / drop elaboration thing, though, if I understand correctly. It is not reflected in the type and therefore the operational semantics of the program.

@RalfJung
Copy link
Member

I guess another possible fix would be to say that ShallowInitBox does an untyped copy of the argument into the destination place. But that's inconsistent with how all the other Rvalue work -- they generally compute values, hence their name, and values are typed. So if we go that route we should represent this differently, e.g. as a statement or a NonDivergingIntrinsic.

@nbdd0121
Copy link
Contributor

The shallow initialized box is not very different from a Box where its content have been moved. I guess the issue here is with uninhabited types there's no notion of a moved-out box of that type?

Also, could you explain why a value of Box<!> could not exist? Why couldn't we treat it similar to other custom smart pointers where their validity is shallow, so their mere existence is fine if they are not dereferenced?

@RalfJung
Copy link
Member

RalfJung commented Aug 18, 2024

The shallow initialized box is not very different from a Box where its content have been moved. I guess the issue here is with uninhabited types there's no notion of a moved-out box of that type?

The problem is that operationally speaking, we are moving the shallow initialized box: _0 = ShallowInitBox(move _4, T); moves it from _4 to _0. We never allow moving boxes whose contents have been moved out so there it's not a problem.

Also, could you explain why a value of Box<!> could not exist? Why couldn't we treat it similar to other custom smart pointers where their validity is shallow, so their mere existence is fine if they are not dereferenced?

See rust-lang/unsafe-code-guidelines#413 (and note that Box is a primitive pointer type and thus a lot like a reference; it's not just a library smart pointer).

@nbdd0121
Copy link
Contributor

nbdd0121 commented Aug 18, 2024

The shallow initialized box is not very different from a Box where its content have been moved. I guess the issue here is with uninhabited types there's no notion of a moved-out box of that type?

The problem is that operationally speaking, we are moving the shallow initialized box: _0 = ShallowInitBox(move _4, T); moves it from _4 to _1. We never allow moving boxes whose contents have been moved out so there it's not a problem.

I see. In this case I think making this a statement / non-diverging intrinsic probably makes sense.

@est31
Copy link
Member

est31 commented Aug 22, 2024

What about having ShallowInitBox returning a type that can actually point to uninitialized memory? Say NonNull<T>? NonNull<!> is allowed, right? I'd like to avoid moving around the content pointed to, say from a MaybeUninit<T> to a Box<T>: ideally we just cast pointers.

@nbdd0121
Copy link
Contributor

The point of ShallowInitBox is to deallocate the box when the boxed expression fails, so it should return a box type.

We could codegen as allocating a Box<MaybeUninit<T>> and then derive a pointer from it, cast to *mut T, use it's deref as place for the boxed expression, then assume_init, although we would be generating much more MIR in that case and I don't know what the perf impact would be like.

@RalfJung
Copy link
Member

We could also have a ShallowBox type representing that state.

@RalfJung
Copy link
Member

RalfJung commented Aug 23, 2024

To be clear, this is not something that must be fixed. But if we keep the status quo, that has a few consequences (none of them terrible):

  • ShallowInitBox has a precondition that the type is inhabited (which indeed the #[rustc_box] attribute already ensures), or Box<!> is an inhabited type.
  • Validity of Box<T> does not recursively require validity of the inner T.

The latter is my preferred resolution of rust-lang/unsafe-code-guidelines#412 (which applies to references and Box alike, though in principle we could decide to treat them differently), but that question is generally considered "not resolved".

@RalfJung
Copy link
Member

Yet another alternative would be to use Box::new_in(val, Global) instead of making Box::new a lang item... but I guess there is a reason that we don't already do that?

@nbdd0121
Copy link
Contributor

nbdd0121 commented Sep 12, 2024

It's difficult to ensure in-place codegen that way, IIRC there were a few attempts to remove special handling of Box::new but performance degradation was quite significant.

@RalfJung
Copy link
Member

RalfJung commented Sep 12, 2024

Hm... @scottmcm did a bunch of work recently on making our MIR nicer before it is shipped to LLVM, I wonder if that could be applied to Box::new_in to overcome this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html A-miri Area: The miri tool C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

4 participants