Skip to content
This repository has been archived by the owner on Apr 5, 2024. It is now read-only.

Clarify which inputs are permitted for the offset intrinsic (and the MIR binop) #38

Closed
RalfJung opened this issue Jun 6, 2017 · 10 comments

Comments

@RalfJung
Copy link
Member

RalfJung commented Jun 6, 2017

The offset intrinsic compiles to LLVM getelementpointer with inbounds attribute, so the offset must stay in-bounds (or one-past-the-end). However, rustc currently also relies on "offset by 0" to always be legal, even if the pointer passed as input is dangling:

  • In the IntoIterator impl for slices. If the slice is empty, the pointer will be (void*) alignment, but the iterator will still add the length (0) to it.
  • The MIR emitted as slice drop glue does the same, but uses the MIR binop instead of the intrinsic.

So the rules for offset currently seem to be:

  • The resulting pointer has to be a valid, in-bounds pointer or one-past-the-end, OR
  • the offset is 0.
@strega-nil
Copy link

strega-nil commented Jun 6, 2017

Note that, in the model I'm thinking about, any non-null pointer with the correct alignment should be valid for zero sized types.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 6, 2017

That's fair enough. We also have to be careful to emit the right code for LLVM. One consequence of this could be that in codegen, when we emit GEP, we don't set "inbounds" when we are working on a ZST.

@strega-nil
Copy link

ZSTs are weird, and I'm not sure what llvm would want from them. Having inbounds on a ZST pointer shouldn't be an issue though, because the point of inbounds (afaik) is to keep aliasing data, and ZSTs are impossible to alias.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 6, 2017

We should just better be sure LLVM doesn't derive UB from "you said this was inbounds, but I can prove it is not".

@strega-nil
Copy link

yeah, agreed.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 21, 2017

It turns out rustc stretches the rules even more: Below https://github.com/rust-lang/rust/blob/445077963c55297ef1e196a3525723090fe80b22/src/libcore/slice/mod.rs#L777, the get methods defer to .offset to compute the address of an element of the slice. This offsets a dangling-pointer-to-ZST by an arbitrary user-controlled amount.

No idea if LLVM permits this. Maybe offset is fine anyway if either the offset or the size is 0 (they are multiplied, after all).

@arielb1
Copy link
Contributor

arielb1 commented Jun 21, 2017

offset on what is llterally a ZST is a nop. We don't even have to lower it to LLVM if it turns out there's a problem.

@RalfJung
Copy link
Member Author

Also see @arielb1's comment at rust-lang/rust#42789 (comment)

@RalfJung
Copy link
Member Author

And another one: RawTable::raw_bucket_at will effectively call (0 as *mut u8).offset(0). The LLVM docs make it sound like that should be legal:

The only in bounds address for a null pointer in the default address-space is the null pointer itself.

@RalfJung
Copy link
Member Author

Closing in favor of younger issues: rust-lang/unsafe-code-guidelines#93, rust-lang/unsafe-code-guidelines#299

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants