Iterators over slices of ZST behave strange #42789

RalfJung · 2017-06-21T03:26:36Z

The following test passes for slices over non-ZST, but fails for ZST:

fn slice_test<T>(x: &[T]) {
    let mut i = x.iter();
    i.next();
    let x1 = i.next().unwrap();
    let x2 = &x[1];
    let x3 = x.iter().nth(1).unwrap();
    assert_eq!(x1 as *const _, x2 as *const _);
    assert_eq!(x2 as *const _, x3 as *const _);
}

fn main() {
    slice_test(&[1,2,3]);
    slice_test(&[(),(),()]);
}

Playpen

While pointers to ZSTs don't really matter as they will never be dereferenced, I think providing consistent addresses for the same element would still make sense.
Yet, x2 turns out to be 0x55e742ee85dc, no idea where that address is coming from. The LLVM IR contains some undef, but I am not sure if that always indicates a problem.

Generally, the iterator's treatment of ZST's seems a little fishy. Below

rust/src/libcore/slice/mod.rs

Line 777 in 4450779

impl<T> SliceIndex<[T]> for usize {

, the get methods defer to .offset to compute the address of an element of the slice. offset is only well-defined for in-bounds accesses, yet it is used here on a dangling pointer-to-ZST, with some non-0 offset.

The text was updated successfully, but these errors were encountered:

arielb1 · 2017-06-21T08:27:33Z

To clarify, the way the test fails is that the slice iterator returns 0x1 instead of an actual pointer to an element.

arielb1 · 2017-06-21T08:35:19Z

Generally, the iterator's treatment of ZST's seems a little fishy. Below

rust/src/libcore/slice/mod.rs

Line 777 in 4450779

impl<T> SliceIndex<[T]> for usize {

, the get methods defer to .offset to compute the address of an element of the slice. offset is only well-defined for in-bounds accesses, yet it is sued here on a dangling pointer-to-ZST, with some non-0 offset.

By "dangling element" you mean 0x8: &mut Zst? By what I was able to read from LLVM's rules, a "valid pointer range" for the purpose of GEPi is just a range that:
A) does not start or contain null or ~0.
B) does not overlap with any "LLVM-known allocation" - aka globals, stack, malloc, etc.

In particular, LLVM does not try to know about all allocas in address space, it only cares about the ones it knows about, so 0x8 is a valid pointer.

RalfJung · 2017-06-21T17:03:37Z

To clarify, the way the test fails is that the slice iterator returns 0x1 instead of an actual pointer to an element.

IMHO that would be fine, if the get methods did the same. Things just should be consistent.

By "dangling element" you mean 0x8: &mut Zst?

0x1 here, but yeah.

oli-obk · 2017-06-21T17:38:40Z

We could treat any cast to a raw pointer to a zst as a constant producing an integer. Should be trivial to do as a Mir pass and has clear semantics imo.

RalfJung · 2017-06-21T21:02:06Z

This can't be done on polymorphic code though, can it?

strega-nil · 2017-06-21T22:58:27Z

@arielb1 Could you show where LLVM talks about treating ~0 specially?

bluss · 2018-01-07T20:49:22Z

I was going to fix this for the slice iterator in #46223, but I have yet to pick it up again (feel free to do that anyone).

RalfJung · 2018-05-30T07:10:12Z

Thanks to @rkruppe for pointing out that this is actually a soundness issue. The following module should never panic:

mod unique_token {
    use std::sync::atomic::{AtomicBool, Ordering};

    // With a private field so it can only be constructed inside this module
    pub struct Token(());
    
    // Stores whether we have already created a token
    static GUARD : AtomicBool = AtomicBool::new(false);
    
    impl Token {
        pub fn new() -> Option<Token> {
            if GUARD.compare_and_swap(false, true, Ordering::AcqRel) == false {
                Some(Token(()))
            } else {
                None
            }
        }
        
        pub fn take_two(a: &Token, b: &Token) {
            // There's only ever one token so these have to be the same
            assert_eq!(a as *const _, b as *const _,
                "How can there be two tokes?!?");
        }
    }
}

and yet here is how we can make it panic:

use unique_token::Token;

fn main() {
    let t = Token::new().expect("Couldn't even get a single token");
    // make sure we can't get a second
    assert!(Token::new().is_none());
    
    let s = &[t];
    let t1 = &s[0];
    let t2 = s.iter().next().expect("Slice can't be empty");
    Token::take_two(t1, t2);
}

Playground

arielb1 · 2018-07-07T21:58:08Z

@RalfJung

You can also get elements of slices through slice patterns, which don't squash ZST pointers to 0x1. You have a good argument for having slice iterators use the "real" address of the ZST instead of squashing it.

cc @rust-lang/libs - what's the obstacle to not squashing pointers in iterators?

arielb1 · 2018-07-07T21:58:36Z

Not sure whether we should tag this as I-unsound

RalfJung · 2018-07-09T08:15:38Z

The iterators also play fast-and-loose with the alignment rules. For example, Iter::as_slice will call from_raw_parts with an unaligned pointer when used with a type like [u32; 0].

RalfJung · 2018-07-12T08:31:22Z

I have a proposed fix at #52206.

slices: fix ZST slice iterators making up pointers; debug_assert alignment in from_raw_parts This fixes the problem that we are fabricating pointers out of thin air. I also managed to share more code between the mutable and shared iterators, while reducing the amount of macros. I am not sure how useful it really is to add a `debug_assert!` in libcore. Everybody gets a release version of that anyway, right? Is there at least a CI job that runs the test suite with a debug version? Fixes #42789

@eddyb

debug_assert to ensure that from_raw_parts is only used properly aligned This does not help nearly as much as I would hope because everybody uses the distributed libstd which is compiled without debug assertions. For this reason, I am not sure if this is even worth it. OTOH, this would have caught the misalignment fixed by #42789 *if* there had been any tests actually using ZSTs with alignment >1 (we have a CI runner which has debug assertions in libstd enabled), and it seems to currently [fail in the rg testsuite](https://ci.appveyor.com/project/rust-lang/rust/build/1.0.8403/job/v7dfdcgn8ay5j6sb). So maybe it is worth it, after all. I have seen the attribute `#[rustc_inherit_overflow_checks]` in some places, does that make it so that the *caller's* debug status is relevant? Is there a similar attribute for `debug_assert!`? That could even subsume `rustc_inherit_overflow_checks`: Something like `rustc_inherit_debug_flag` could affect *all* places that change the generated code depending on whether we are in debug or release mode. In fact, given that we have to keep around the MIR for generic functions anyway, is there ever a reason *not* to handle the debug flag that way? I guess currently we apply debug flags like `cfg` so this is dropped early during the MIR pipeline? EDIT: I learned from @eddyb that because of how `debug_assert!` works, this is not realistic. Well, we could still have it for the rustc CI runs and then maybe, eventually, when libstd gets compiled client-side and there is both a debug and a release build... then this will also benefit users.^^

@eddyb

debug_assert to ensure that from_raw_parts is only used properly aligned This does not help nearly as much as I would hope because everybody uses the distributed libstd which is compiled without debug assertions. For this reason, I am not sure if this is even worth it. OTOH, this would have caught the misalignment fixed by rust-lang/rust#42789 *if* there had been any tests actually using ZSTs with alignment >1 (we have a CI runner which has debug assertions in libstd enabled), and it seems to currently [fail in the rg testsuite](https://ci.appveyor.com/project/rust-lang/rust/build/1.0.8403/job/v7dfdcgn8ay5j6sb). So maybe it is worth it, after all. I have seen the attribute `#[rustc_inherit_overflow_checks]` in some places, does that make it so that the *caller's* debug status is relevant? Is there a similar attribute for `debug_assert!`? That could even subsume `rustc_inherit_overflow_checks`: Something like `rustc_inherit_debug_flag` could affect *all* places that change the generated code depending on whether we are in debug or release mode. In fact, given that we have to keep around the MIR for generic functions anyway, is there ever a reason *not* to handle the debug flag that way? I guess currently we apply debug flags like `cfg` so this is dropped early during the MIR pipeline? EDIT: I learned from @eddyb that because of how `debug_assert!` works, this is not realistic. Well, we could still have it for the rustc CI runs and then maybe, eventually, when libstd gets compiled client-side and there is both a debug and a release build... then this will also benefit users.^^

…r=the8472 Simplify the implementation of iterators over slices of ZSTs Currently, slice iterators over ZSTs store `end = start.wrapping_byte_add(len)`. That's slightly convenient for `is_empty`, but kinda annoying for pretty much everything else -- see bugs like rust-lang#42789, for example. This PR instead changes it to just `end = ptr::invalid(len)` instead. That's easier to think about (IMHO, at least) as well as easier to represent. `next` is still to big to get inlined into the mir-opt/pre-codegen/ tests, but if I bump the inline threshold to force it to show the whole thing, this implementation is also less MIR: ``` > git diff --numstat 241 370 tests/mir-opt/pre-codegen/slice_iter.forward_loop.PreCodegen.after.mir 255 329 tests/mir-opt/pre-codegen/slice_iter.reverse_loop.PreCodegen.after.mir 184 216 tests/mir-opt/pre-codegen/slice_iter.slice_iter_mut_next_back.PreCodegen.after.mir 182 254 tests/mir-opt/pre-codegen/slice_iter.slice_iter_next.PreCodegen.after.mir ``` (That's ≈70 lines less for `Iter::next`, for example.) r? `@ghost` ~~Built atop rust-lang#111282, so draft until that lands.~~

RalfJung mentioned this issue Jun 21, 2017

Clarify which inputs are permitted for the offset intrinsic (and the MIR binop) rust-lang/rust-memory-model#38

Closed

Mark-Simulacrum added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Jun 23, 2017

Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 28, 2017

arielb1 added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Jul 7, 2018

RalfJung mentioned this issue Jul 10, 2018

slices: fix ZST slice iterators making up pointers; debug_assert alignment in from_raw_parts #52206

Merged

bors closed this as completed in #52206 Aug 2, 2018

RalfJung mentioned this issue Aug 2, 2018

debug_assert to ensure that from_raw_parts is only used properly aligned #52972

Merged

scottmcm mentioned this issue May 9, 2023

Simplify the implementation of iterators over slices of ZSTs #111395

Merged

celinval mentioned this issue Apr 8, 2024

Do not assume that ZST-typed symbols refer to unique objects model-checking/kani#3134

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Iterators over slices of ZST behave strange #42789

Iterators over slices of ZST behave strange #42789

RalfJung commented Jun 21, 2017 •

edited

Loading

arielb1 commented Jun 21, 2017

arielb1 commented Jun 21, 2017

RalfJung commented Jun 21, 2017

oli-obk commented Jun 21, 2017

RalfJung commented Jun 21, 2017

strega-nil commented Jun 21, 2017

bluss commented Jan 7, 2018

RalfJung commented May 30, 2018

arielb1 commented Jul 7, 2018

arielb1 commented Jul 7, 2018

RalfJung commented Jul 9, 2018

RalfJung commented Jul 12, 2018

Iterators over slices of ZST behave strange #42789

Iterators over slices of ZST behave strange #42789

Comments

RalfJung commented Jun 21, 2017 • edited Loading

arielb1 commented Jun 21, 2017

arielb1 commented Jun 21, 2017

RalfJung commented Jun 21, 2017

oli-obk commented Jun 21, 2017

RalfJung commented Jun 21, 2017

strega-nil commented Jun 21, 2017

bluss commented Jan 7, 2018

RalfJung commented May 30, 2018

arielb1 commented Jul 7, 2018

arielb1 commented Jul 7, 2018

RalfJung commented Jul 9, 2018

RalfJung commented Jul 12, 2018

RalfJung commented Jun 21, 2017 •

edited

Loading