Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri interning: what to do with dangling pointers? #72215

Closed
RalfJung opened this issue May 14, 2020 · 9 comments
Closed

Miri interning: what to do with dangling pointers? #72215

RalfJung opened this issue May 14, 2020 · 9 comments
Labels
A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) C-cleanup Category: PRs that clean code up or issues documenting cleanup.

Comments

@RalfJung
Copy link
Member

What should the const interner do when encountering a pointer that doesn't point anywhere? Right now it does throw_ub_format!, but inside a const that is actually a lint that can be allowed. Also the error is not very informative as it doesn't say where in the const the dangling pointer is. With #71665 it does tcx.sess.span_err, at least guaranteeing to hard error, but still with an unspecific error and actually in case of a dangling reference with a duplicate error as validation also complains (with a much better error message).

Ideally we wouldn't have an error at all. We could either (1) make Miri+codegen not ICE on dangling AllocId, or (2) make the AllocId not dangling by creating "fake" allocations, or (3) adjust the interned allocation to not have a dangling pointer any more. There might be more options.

Is (1) realistic? I am not sure. It might also not be great to put that burden on the rest of the compiler. (Note though that since we "just" show a hard error diagnostic in dangling AllocId, all parts of the compiler that still run after this must anyway be prepared to handle them properly.)

(2) is honestly my personal favorite. The "fake" allocations would have the right alignment but size 1, and undefined content, and we could send them off to LLVM like that. However we'd have to make sure validation does not consider these "fake" allocations dereferencable (not even for size 0). So in some sense it might be easiest to create them only during codegen when getting None for a given AllocId... which kind of makes this a variant of (1), maybe? However we would have to ensure to still have alignment information in codegen.

For (3), the question is what exactly to replace them with. The easiest thing to do is to turn a pointer allocN+x into integer align_of(allocN)+x, which is at least a possible value for that pointer. @oli-obk expressed a preference for this option. However, I do not think this is a great idea. Consider the following program, which might actually run one day once we alllow more raw pointer stuff (currently it shouldn't run even with Miri unleashed):

const NASTY: (bool, *const i32, *const i32) {
  let x = 0; let y = 1;
  let x = &x as *const _; let y = &y as *const _;
  (x == y, x, y)
}

If we follow this variant of (3), the result will be (false, 4, 4). IOW, it looks as if CTFE concluded that 4 != 4... this is why I prefer (2).

Cc @rust-lang/wg-const-eval

@RalfJung RalfJung added C-cleanup Category: PRs that clean code up or issues documenting cleanup. A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) labels May 14, 2020
@oli-obk
Copy link
Contributor

oli-obk commented May 15, 2020

I have a 4th suggestion, which is both 1, 2 and 3: we create a fourth GlobalAlloc variant: Dangling(Align), which codegen uses to just generate an integer from that pointer as alternative 3 would have done in const eval. The advantage of this is that const eval keeps around an abstract pointer, while we don't have to make codegen generate actual allocations.

@RalfJung
Copy link
Member Author

RalfJung commented May 16, 2020

Codegen should make sure to generate different integers for different allocations though... we basically need a ptr-to-int scheme (and track the Size as well, not just the Align) to make sure we are fully consistent. And then it could still overlap with other actual allocations.

Maybe rejecting these pointers really is the best thing to do...

@oli-obk
Copy link
Contributor

oli-obk commented May 19, 2020

why would codegen need to generate different integers for different allocations? LLVM already doesn't guarantee us anything about addresses/duplication/deduplication except for the address of statics. It is totally possible for two different unnamed llvm const allocations to get deduplicated. At least that's how interpret

If the local_unnamed_addr attribute is given, the address is known to not be significant within the module.

from https://llvm.org/docs/LangRef.html

@RalfJung
Copy link
Member Author

why would codegen need to generate different integers for different allocations?

Because of the example from the OP:

const NASTY: (bool, *const i32, *const i32) {
  let x = 0; let y = 1;
  let x = &x as *const _; let y = &y as *const _;
  (x == y, x, y)
}

Even if we don't guarantee different allocations get different addresses, we IMO have to guarantee that whether or not two allocations have the same address does not change.

It is totally possible for two different unnamed llvm const allocations to get deduplicated.

I think that is only permissible if they have the same content.

@oli-obk
Copy link
Contributor

oli-obk commented May 20, 2020

Ah, I didn't get the subtle semantics involved in that example, even though now that I reread your post it's entirely clear.

So yea, let's do the GlobalAlloc::Dangling scheme and make codegen generate dummy allocations.

@RalfJung
Copy link
Member Author

I think the question that remains open is if that's worth it -- or if we should rather just continue failing early on dangling pointers.

@oli-obk
Copy link
Contributor

oli-obk commented May 27, 2020

We can just keep failing early and wait until someone has an actual use case

@RalfJung
Copy link
Member Author

RalfJung commented Jan 3, 2021

In #80418, quite a few additional test cases fail only because of this check -- and accepting them would likely just accept buggy programs.

@RalfJung
Copy link
Member Author

RalfJung commented May 8, 2024

We've had a hard error on dangling pointers for a while now, and I don't recall anyone ever asking for this to be allowed. People sometimes want &*(0x1200 as *const Type) to be allowed, but we haven't yet seen anyone who wants pointers to allocations that existed during const-eval but then were removed.

So, I'll close this issue. Having a hard error here is fine, IMO.

@RalfJung RalfJung closed this as completed May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) C-cleanup Category: PRs that clean code up or issues documenting cleanup.
Projects
None yet
Development

No branches or pull requests

2 participants