Various fixes and improvements to the MIR Dataflow framework #1

gereeter · 2016-05-16T12:05:20Z

Since you mentioned that you were too busy to work much on your PR, I thought I would do some work myself.

…ures

…value instead of by type

…ltiple statements

…tching the rest of rustc

nagisa · 2016-05-16T12:11:14Z

src/librustc/mir/transform/dataflow.rs


    /// The transfer function which given a terminator and a fact produces a fact for each
    /// successor of the terminator.
    ///
    /// Corectness precondtition:
    /// * The list of facts produced should only contain the facts for blocks which are successors
    /// of the terminator being transfered.
-    fn term(&mir::Terminator<'tcx>, L) -> Self::TerminatorOut;
+    fn term(&self, &mir::Terminator<'tcx>, Self::Lattice) -> Vec<Self::Lattice>;


This change makes the trait not very reusable for backward analysis, where terminators only have a single edge regardless of how many edges the terminator has in the forward direction. That is the primary motivation for TerminatorOut associated type.

True, but I'm not sure that the same trait should be reused for the forward and backwards cases. Regardless, even if the two directions share a trait, I'd prefer a clearer way of denoting which direction a pass goes than using Vec<Lattice> for forward and Lattice for backwards. Transfer could have another argument (Direction) which itself has an associated type denoting what output term should have.

I got rid of the associated TerminatorOut because it felt clumsy, because it caused recursive trait bounds when I moved in the associated lattice type, and because it wasn't necessary yet. When backwards transformation gets implemented, I'd be happy for this to be generalized.
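A rough sketch of the `Direction` idea, assuming Rust 1.65 or later for generic associated types. `Direction`, `Forward`, `Backward`, the simplified `Transfer`, the toy passes, and the stand-in `Terminator` are all hypothetical names for illustration, not the actual rustc types:

```rust
/// Stand-in for `mir::Terminator<'tcx>`; only the successor count matters here.
struct Terminator {
    successors: usize,
}

/// Hypothetical marker trait: each direction picks the shape of the
/// facts produced when transferring over a terminator.
trait Direction {
    type TerminatorOut<L>;
}

struct Forward;
struct Backward;

impl Direction for Forward {
    /// Forward: one fact per successor edge.
    type TerminatorOut<L> = Vec<L>;
}

impl Direction for Backward {
    /// Backward: a single fact, however many successors exist.
    type TerminatorOut<L> = L;
}

/// Simplified transfer trait, parameterised over the direction.
trait Transfer<D: Direction> {
    type Lattice;

    fn term(
        &self,
        terminator: &Terminator,
        fact: Self::Lattice,
    ) -> D::TerminatorOut<Self::Lattice>;
}

/// Toy forward pass: copies the incoming fact to every successor.
struct CopyForward;

impl Transfer<Forward> for CopyForward {
    type Lattice = u32;

    fn term(&self, terminator: &Terminator, fact: u32) -> Vec<u32> {
        vec![fact; terminator.successors]
    }
}

/// Toy backward pass: passes the fact through unchanged.
struct CopyBackward;

impl Transfer<Backward> for CopyBackward {
    type Lattice = u32;

    fn term(&self, _terminator: &Terminator, fact: u32) -> u32 {
        fact
    }
}
```

With this shape the direction is visible in the `impl` header (`Transfer<Forward>` vs `Transfer<Backward>`) rather than being implied by the return type alone.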

nagisa · 2016-05-16T12:14:42Z

I will review and see what I can merge from this thing in more detail (likely, much) later.

Thanks for the PR!

…te to a projected lvalue, or through the destination of a call. Projected lvalues are now entirely not tracked. With --enable-orbit, everything bootstraps except for denied warnings.

…t warning

gereeter · 2016-06-06T21:33:43Z

I fixed a bunch of correctness bugs, particularly in the alias side of things. However, I'm still not getting a successful bootstrap including AliasRewrite, and anyway I believe it could benefit from very different data structures, so I think it could use a clean rewrite.

Additionally, enough MIR changes have gone in that I'm planning to rebase my branch on master. Would it be easier if I just made my own rust-lang/rust PR, especially if I got it to bootstrap?

nagisa · 2016-06-06T22:08:04Z

Sure, but I still have a bunch of things I want to do with it, which is why the original PR was [WIP].

nagisa · 2016-06-06T22:12:24Z

src/librustc_mir/transform/acs_propagate.rs

-    fn bottom() -> Self { unimplemented!() }
+#[derive(Debug, Clone)]
+enum AcsLattice<'tcx> {
+    Bottom,


The bottom element in this lattice is an empty HashMap. It is fine to have it this way, because HashMap::new() does not allocate, to my knowledge.

Similarly, I’m confused as to why the Top was removed from Either. I do not see how you could merge two different constants together and produce anything other than a Top.

/me shrugs

I changed the lattice so that HashMap::new() is actually the top element of the lattice - I treat empty elements as top instead of bottom. You'll notice that join now does intersection instead of union. This is primarily because when we encounter an unknown function call, we need to mark everything as Top, because the function call could change anything. This situation can be improved with alias analysis and information about what the function might do, but the default needs to be Top. Similarly, when we enter the function, everything needs to be Top. Consider the following case:

    fn foo(mut x: u32) {
        if /* random unoptimizable condition */ {
            x = 5;
        }
        println!("{}", x);
    }

Previously, since the value stored for x was bottom, we would merge the fact bottom and the fact x = 5 at the end of the if statement, concluding that x was always 5. We would optimize to:

    fn foo(mut x: u32) {
        if /* random unoptimizable condition */ {
            x = 5;
        }
        println!("{}", 5);
    }

which is just wrong.
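To illustrate the intersection-based join, here is a minimal sketch. It assumes facts are stored as a map from a local's name to a known constant, with a missing key meaning Top (unknown). The `Facts` alias and the `join` function are illustrative only and are not the code from this PR:

```rust
use std::collections::HashMap;

/// Facts of the form "local `x` is known to hold constant `c`".
/// A missing key means nothing is known about that local (Top).
type Facts = HashMap<&'static str, u32>;

/// Join by intersection: a fact survives only if both incoming
/// edges agree on it. Anything else falls back to Top (absent).
fn join(a: &Facts, b: &Facts) -> Facts {
    a.iter()
        .filter(|(name, value)| b.get(*name) == Some(*value))
        .map(|(name, value)| (*name, *value))
        .collect()
}
```

For the `foo` example, the edge that skips the `if` body carries no fact about `x`, while the other edge carries `x = 5`. Their intersection is empty, so `x` stays unknown after the `if` and the `println!` argument is not rewritten.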

Note that, in terms of inspiration, while the original paper used a union-based lattice for constant propagation, GHC actually uses an intersection-based lattice (see here).

Also, I realized after I wrote that code that WBottom existed and that it would have been better to use that. However, I'm still planning to remove the requirement that lattices have bottom, again inspired by https://ghc.haskell.org/trac/ghc/wiki/Hoopl/Cleanup, and so I didn't bother to switch to a cleaner version.

gereeter · 2016-06-06T22:15:09Z

Yeah, I figured so - it just seemed like a simpler way of keeping track of changes.

rustc_trans: don't Assert(Overflow(Neg)) when overflow checks are off. Generic functions using `Neg` on primitive types would panic even in release mode, with MIR trans. The solution is a bit hacky, as I'm checking the message, since there's no dedicated `CheckedUnOp`. Blocks Servo rustup ([failure #1](http://build.servo.org/builders/linux-rel/builds/2477/steps/test_3/logs/stdio), [failure rust-lang#2](http://build.servo.org/builders/mac-rel-css/builds/2364/steps/test/logs/stdio)) - this should be the last hurdle, it affects only one test.


E0034: provide disambiguated syntax for candidates For a given file ```rust trait A { fn foo(&self) {} } trait B : A { fn foo(&self) {} } fn bar<T: B>(a: &T) { a.foo() } ``` provide the following output ``` error[E0034]: multiple applicable items in scope --> file.rs:6:5 | 6 | a.foo(1) | ^^^ multiple `foo` found | note: candidate #1 is defined in the trait `A` --> file.rs:2:11 | 2 | trait A { fn foo(&self, a: usize) {} } | ^^^^^^^^^^^^^^^^^^^^^^^^^^ help: to use it here write `A::foo(&a, 1)` instead --> file.rs:6:5 | 6 | a.foo(1) | ^^^ note: candidate #2 is defined in the trait `B` --> file.rs:3:15 | 3 | trait B : A { fn foo(&self, a: usize) {} } | ^^^^^^^^^^^^^^^^^^^^^^^^^^ help: to use it here write `B::foo(&a, 1)` instead --> file.rs:6:5 | 6 | a.foo(1) | ^^^ ``` Fix rust-lang#37767.

LeakSanitizer, ThreadSanitizer, AddressSanitizer and MemorySanitizer support ``` $ cargo new --bin leak && cd $_ $ edit Cargo.toml && tail -n3 $_ ``` ``` toml [profile.dev] opt-level = 1 ``` ``` $ edit src/main.rs && cat $_ ``` ``` rust use std::mem; fn main() { let xs = vec![0, 1, 2, 3]; mem::forget(xs); } ``` ``` $ RUSTFLAGS="-Z sanitizer=leak" cargo run --target x86_64-unknown-linux-gnu; echo $? Finished dev [optimized + debuginfo] target(s) in 0.0 secs Running `target/debug/leak` ================================================================= ==10848==ERROR: LeakSanitizer: detected memory leaks Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x557c3488db1f in __interceptor_malloc /shared/rust/checkouts/lsan/src/compiler-rt/lib/lsan/lsan_interceptors.cc:55 #1 0x557c34888aaa in alloc::heap::exchange_malloc::h68f3f8b376a0da42 /shared/rust/checkouts/lsan/src/liballoc/heap.rs:138 rust-lang#2 0x557c34888afc in leak::main::hc56ab767de6d653a $PWD/src/main.rs:4 rust-lang#3 0x557c348c0806 in __rust_maybe_catch_panic ($PWD/target/debug/leak+0x3d806) SUMMARY: LeakSanitizer: 16 byte(s) leaked in 1 allocation(s). 23 ``` ``` $ cargo new --bin racy && cd $_ $ edit src/main.rs && cat $_ ``` ``` rust use std::thread; static mut ANSWER: i32 = 0; fn main() { let t1 = thread::spawn(|| unsafe { ANSWER = 42 }); unsafe { ANSWER = 24; } t1.join().ok(); } ``` ``` $ RUSTFLAGS="-Z sanitizer=thread" cargo run --target x86_64-unknown-linux-gnu; echo $? 
================== WARNING: ThreadSanitizer: data race (pid=12019) Write of size 4 at 0x562105989bb4 by thread T1: #0 racy::main::_$u7b$$u7b$closure$u7d$$u7d$::hbe13ea9e8ac73f7e $PWD/src/main.rs:6 (racy+0x000000010e3f) #1 _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h2e466a92accacc78 /shared/rust/checkouts/lsan/src/libstd/panic.rs:296 (racy+0x000000010cc5) rust-lang#2 std::panicking::try::do_call::h7f4d2b38069e4042 /shared/rust/checkouts/lsan/src/libstd/panicking.rs:460 (racy+0x00000000c8f2) rust-lang#3 __rust_maybe_catch_panic <null> (racy+0x0000000b4e56) rust-lang#4 std::panic::catch_unwind::h31ca45621ad66d5a /shared/rust/checkouts/lsan/src/libstd/panic.rs:361 (racy+0x00000000b517) rust-lang#5 std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::hccfc37175dea0b01 /shared/rust/checkouts/lsan/src/libstd/thread/mod.rs:357 (racy+0x00000000c226) rust-lang#6 _$LT$F$u20$as$u20$alloc..boxed..FnBox$LT$A$GT$$GT$::call_box::hd880bbf91561e033 /shared/rust/checkouts/lsan/src/liballoc/boxed.rs:605 (racy+0x00000000f27e) rust-lang#7 std::sys::imp::thread::Thread::new::thread_start::hebdfc4b3d17afc85 <null> (racy+0x0000000abd40) Previous write of size 4 at 0x562105989bb4 by main thread: #0 racy::main::h23e6e5ca46d085c3 $PWD/src/main.rs:8 (racy+0x000000010d7c) #1 __rust_maybe_catch_panic <null> (racy+0x0000000b4e56) rust-lang#2 __libc_start_main <null> (libc.so.6+0x000000020290) Location is global 'racy::ANSWER::h543d2b139f819b19' of size 4 at 0x562105989bb4 (racy+0x0000002f8bb4) Thread T1 (tid=12028, running) created by main thread at: #0 pthread_create /shared/rust/checkouts/lsan/src/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:902 (racy+0x00000001aedb) #1 std::sys::imp::thread::Thread::new::hce44187bf4a36222 <null> (racy+0x0000000ab9ae) rust-lang#2 std::thread::spawn::he382608373eb667e /shared/rust/checkouts/lsan/src/libstd/thread/mod.rs:412 (racy+0x00000000b5aa) rust-lang#3 racy::main::h23e6e5ca46d085c3 
$PWD/src/main.rs:6 (racy+0x000000010d5c) rust-lang#4 __rust_maybe_catch_panic <null> (racy+0x0000000b4e56) rust-lang#5 __libc_start_main <null> (libc.so.6+0x000000020290) SUMMARY: ThreadSanitizer: data race $PWD/src/main.rs:6 in racy::main::_$u7b$$u7b$closure$u7d$$u7d$::hbe13ea9e8ac73f7e ================== ThreadSanitizer: reported 1 warnings 66 ``` ``` $ cargo new --bin oob && cd $_ $ edit src/main.rs && cat $_ ``` ``` rust fn main() { let xs = [0, 1, 2, 3]; let y = unsafe { *xs.as_ptr().offset(4) }; } ``` ``` $ RUSTFLAGS="-Z sanitizer=address" cargo run --target x86_64-unknown-linux-gnu; echo $? ================================================================= ==13328==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff29f3ecd0 at pc 0x55802dc6bf7e bp 0x7fff29f3ec90 sp 0x7fff29f3ec88 READ of size 4 at 0x7fff29f3ecd0 thread T0 #0 0x55802dc6bf7d in oob::main::h0adc7b67e5feb2e7 $PWD/src/main.rs:3 #1 0x55802dd60426 in __rust_maybe_catch_panic ($PWD/target/debug/oob+0xfe426) rust-lang#2 0x55802dd58dd9 in std::rt::lang_start::hb2951fc8a59d62a7 ($PWD/target/debug/oob+0xf6dd9) rust-lang#3 0x55802dc6c002 in main ($PWD/target/debug/oob+0xa002) rust-lang#4 0x7fad8c3b3290 in __libc_start_main (/usr/lib/libc.so.6+0x20290) rust-lang#5 0x55802dc6b719 in _start ($PWD/target/debug/oob+0x9719) Address 0x7fff29f3ecd0 is located in stack of thread T0 at offset 48 in frame #0 0x55802dc6bd5f in oob::main::h0adc7b67e5feb2e7 $PWD/src/main.rs:1 This frame has 1 object(s): [32, 48) 'xs' <== Memory access at offset 48 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: stack-buffer-overflow $PWD/src/main.rs:3 in oob::main::h0adc7b67e5feb2e7 Shadow bytes around the buggy address: 0x1000653dfd40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x1000653dfd50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x1000653dfd60: 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 0x1000653dfd70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x1000653dfd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x1000653dfd90: 00 00 00 00 f1 f1 f1 f1 00 00[f3]f3 00 00 00 00 0x1000653dfda0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x1000653dfdb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x1000653dfdc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x1000653dfdd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x1000653dfde0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==13328==ABORTING 1 ``` ``` $ cargo new --bin uninit && cd $_ $ edit src/main.rs && cat $_ ``` ``` rust use std::mem; fn main() { let xs: [u8; 4] = unsafe { mem::uninitialized() }; let y = xs[0] + xs[1]; } ``` ``` $ RUSTFLAGS="-Z sanitizer=memory" cargo run; echo $? 
==30198==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x563f4b6867da in uninit::main::hc2731cd4f2ed48f8 $PWD/src/main.rs:5 #1 0x563f4b7033b6 in __rust_maybe_catch_panic ($PWD/target/debug/uninit+0x873b6) rust-lang#2 0x563f4b6fbd69 in std::rt::lang_start::hb2951fc8a59d62a7 ($PWD/target/debug/uninit+0x7fd69) rust-lang#3 0x563f4b6868a9 in main ($PWD/target/debug/uninit+0xa8a9) rust-lang#4 0x7fe844354290 in __libc_start_main (/usr/lib/libc.so.6+0x20290) rust-lang#5 0x563f4b6864f9 in _start ($PWD/target/debug/uninit+0xa4f9) SUMMARY: MemorySanitizer: use-of-uninitialized-value $PWD/src/main.rs:5 in uninit::main::hc2731cd4f2ed48f8 Exiting 77 ```

Group "missing variable bind" spans in `or` matches and clarify wording for the two possible cases: when a variable from the first pattern is not in any of the subsequent patterns, and when a variable in any of the other patterns is not in the first one.
Before:
```
error[E0408]: variable `a` from pattern #1 is not bound in pattern #2 --> file.rs:10:23 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^^^^ pattern doesn't bind `a`
error[E0408]: variable `b` from pattern #2 is not bound in pattern #1 --> file.rs:10:32 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^ pattern doesn't bind `b`
error[E0408]: variable `a` from pattern #1 is not bound in pattern #3 --> file.rs:10:37 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^ pattern doesn't bind `a`
error[E0408]: variable `d` from pattern #1 is not bound in pattern #3 --> file.rs:10:37 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^ pattern doesn't bind `d`
error[E0408]: variable `c` from pattern #3 is not bound in pattern #1 --> file.rs:10:43 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^ pattern doesn't bind `c`
error[E0408]: variable `d` from pattern #1 is not bound in pattern #4 --> file.rs:10:48 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^ pattern doesn't bind `d`
error: aborting due to 6 previous errors
```
After:
```
error[E0408]: variable `a` is not bound in all patterns --> file.rs:20:37 | 20 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | - ^^^^^^^^^^^ ^^^^^^^^ - variable not in all patterns | | | | | | | pattern doesn't bind `a` | | pattern doesn't bind `a` | variable not in all patterns
error[E0408]: variable `d` is not bound in all patterns --> file.rs:20:37 | 20 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | - - ^^^^^^^^ ^^^^^^^^ pattern doesn't bind `d` | | | | | | | pattern doesn't bind `d` | | variable not in all patterns | variable not in all patterns
error[E0408]: variable `b` is not bound in all patterns --> file.rs:20:37 | 20 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^^^^ - ^^^^^^^^ ^^^^^^^^ pattern doesn't bind `b` | | | | | | | pattern doesn't bind `b` | | variable not in all patterns | pattern doesn't bind `b`
error[E0408]: variable `c` is not bound in all patterns --> file.rs:20:48 | 20 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^^^^ ^^^^^^^^^^^ - ^^^^^^^^ pattern doesn't bind `c` | | | | | | | variable not in all patterns | | pattern doesn't bind `c` | pattern doesn't bind `c`
error: aborting due to 4 previous errors
```
* Have only one presentation for binding consistency errors
* Point to same binding in multiple patterns when possible
* Check inconsistent bindings in all arms
* Simplify wording of diagnostic message
* Sort emission and spans of binding errors for deterministic output

Clean up "pattern doesn't bind x" messages Group "missing variable bind" spans in `or` matches and clarify wording for the two possible cases: when a variable from the first pattern is not in any of the subsequent patterns, and when a variable in any of the other patterns is not in the first one. Before: ```rust error[E0408]: variable `a` from pattern #1 is not bound in pattern rust-lang#2 --> file.rs:10:23 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^^^^ pattern doesn't bind `a` error[E0408]: variable `b` from pattern rust-lang#2 is not bound in pattern #1 --> file.rs:10:32 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^ pattern doesn't bind `b` error[E0408]: variable `a` from pattern #1 is not bound in pattern rust-lang#3 --> file.rs:10:37 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^ pattern doesn't bind `a` error[E0408]: variable `d` from pattern #1 is not bound in pattern rust-lang#3 --> file.rs:10:37 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^ pattern doesn't bind `d` error[E0408]: variable `c` from pattern rust-lang#3 is not bound in pattern #1 --> file.rs:10:43 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^ pattern doesn't bind `c` error[E0408]: variable `d` from pattern #1 is not bound in pattern rust-lang#4 --> file.rs:10:48 | 10 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^ pattern doesn't bind `d` error: aborting due to 6 previous errors ``` After: ```rust error[E0408]: variable `d` is not bound in all patterns --> $DIR/issue-39698.rs:20:37 | 20 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | - - ^^^^^^^^ ^^^^^^^^ pattern doesn't bind `d` | | | | | | | pattern doesn't bind `d` | | variable not in all patterns | variable not in all patterns error[E0408]: 
variable `c` is not bound in all patterns --> $DIR/issue-39698.rs:20:48 | 20 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^^^^ ^^^^^^^^^^^ - ^^^^^^^^ pattern doesn't bind `c` | | | | | | | variable not in all patterns | | pattern doesn't bind `c` | pattern doesn't bind `c` error[E0408]: variable `a` is not bound in all patterns --> $DIR/issue-39698.rs:20:37 | 20 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | - ^^^^^^^^^^^ ^^^^^^^^ - variable not in all patterns | | | | | | | pattern doesn't bind `a` | | pattern doesn't bind `a` | variable not in all patterns error[E0408]: variable `b` is not bound in all patterns --> $DIR/issue-39698.rs:20:37 | 20 | T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); } | ^^^^^^^^^^^ - ^^^^^^^^ ^^^^^^^^ pattern doesn't bind `b` | | | | | | | pattern doesn't bind `b` | | variable not in all patterns | pattern doesn't bind `b` error: aborting due to 4 previous errors ``` Fixes rust-lang#39698.


ARMv5 needs +strict-align
Without that flag, LLVM generates unaligned memory access instructions, which are not allowed on ARMv5. For example, the 'hello world' example from `cargo --new` failed with:
```
$ ./hello
Hello, world!
thread 'main' panicked at 'assertion failed: end <= len', src/libcollections/vec.rs:1113
note: Run with `RUST_BACKTRACE=1` for a backtrace.
```
I traced this error back to the following assembler code in `BufWriter::flush_buf`:
```
6f44: e28d0018 add r0, sp, #24
[...]
6f54: e280b005 add fp, r0, #5
[...]
7018: e5cd001c strb r0, [sp, #28]
701c: e1a0082a lsr r0, sl, #16
7020: 03a01001 moveq r1, #1
7024: e5cb0002 strb r0, [fp, #2]
7028: e1cba0b0 strh sl, [fp]
```
Note that `fp` points to `sp + 29`, so the three `str*`-instructions should fill up a 32-bit value at `sp + 28`, which is later used as the value `n` in `Ok(n) => written += n`. This doesn't work on ARMv5 as the `strh` can't write to the unaligned contents of `fp`, so the upper bits of `n` won't get cleared, leading to the assertion failure in Vec::drain. With `+strict-align`, the code works as expected.

Merge upstream changes


Add --enable-debug flag to musl CI build script
Building for x86_64-unknown-linux-musl currently results in an executable lacking debug information for musl libc itself. If you request a backtrace in GDB while control flow is within musl – including syscalls made by musl – the result looks like:
```
#0 0x0000000000434b46 in __cp_end ()
#1 0x0000000000432dbd in __syscall_cp_c ()
#2 0x0000000000000000 in ?? ()
```
i.e. not very helpful. Adding --enable-debug resolves this, and --enable-optimize re-enables optimisations which default to off given the previous flag.

There is a hot path through `opt_normalize_projection_type`:
- `try_start` does a cache lookup (#1).
- The result is a `NormalizedTy`.
- There are no unresolved type vars, so we call `complete`.
- `complete` does *another* cache lookup (#2), then calls `SnapshotMap::insert`.
- `insert` does *another* cache lookup (#3), inserting the same value that's already in the cache.
This patch optimizes this hot path by introducing `complete_normalized`, for use when the value is known in advance to be a `NormalizedTy`. It always avoids lookup #2. Furthermore, if the `NormalizedTy`'s obligations are empty (the common case), we know that lookup #3 would be a no-op, so we avoid it, while inserting a Noop into the `SnapshotMap`'s undo log.

When encountering a nonexistent method for a given trait where an associated function has the same name, suggest using the appropriate syntax, instead of using `help` text. When only one candidate is found, do not call it "candidate #1", just call it "the candidate".

Tweak output on E0599 for assoc fn used as method
- Use suggestion instead of `help` when possible
- Add primary span label
- Remove incorrect `help` suggestion using incorrect syntax
- Do not refer to only one possible candidate as `candidate #1`, refer to it as `the candidate`

Suggestion for 'static impl Trait return When encountering a named or anonymous sup requirement (for example, `&'a self`) and a `'static` impl Trait return type, suggest adding the `'_` lifetime constraint to the return type. Fix rust-lang#43719, rust-lang#51282. ``` error: cannot infer an appropriate lifetime --> $DIR/static-return-lifetime-infered.rs:17:16 | LL | fn iter_values_anon(&self) -> impl Iterator<Item=u32> { | ----------------------- this return type evaluates to the `'static` lifetime... LL | self.x.iter().map(|a| a.0) | ------ ^^^^ | | | ...but this borrow... | note: ...can't outlive the anonymous lifetime #1 defined on the method body at 16:5 --> $DIR/static-return-lifetime-infered.rs:16:5 | LL | / fn iter_values_anon(&self) -> impl Iterator<Item=u32> { LL | | self.x.iter().map(|a| a.0) LL | | } | |_____^ help: you can add a constraint to the return type to make it last less than `'static` and match the anonymous lifetime #1 defined on the method body at 16:5 | LL | fn iter_values_anon(&self) -> impl Iterator<Item=u32> + '_ { | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ```

@matthewjasper

…-box, r=eddyb [NLL] Dangly paths for box Special-case `Box` in `rustc_mir::borrow_check`. Since we know dropping a box will not access any `&mut` or `&` references, it is safe to model its destructor as only touching the contents *owned* by the box. ---- There are three main things going on here: 1. The first main thing, this PR is fixing a bug in NLL where `rustc` previously would issue a diagnostic error in a case like this: ```rust fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x } ``` such code was accepted by the AST-borrowck in the past, but NLL was rejecting it with the following message ([playground](https://play.rust-lang.org/?gist=13c5560f73bfb16d6dab3ceaad44c0f8&version=nightly&mode=release&edition=2015)) ``` error[E0597]: `**x` does not live long enough --> src/main.rs:3:40 | 3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x } | ^^^^^^^^ - `**x` dropped here while still borrowed | | | borrowed value does not live long enough | note: borrowed value must be valid for the anonymous lifetime #1 defined on the function body at 3:1... --> src/main.rs:3:1 | 3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x } | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: aborting due to previous error ``` 2. The second main thing: The reason such code was previously rejected was because NLL (MIR-borrowck) incorporates a fix for issue rust-lang#31567, where it models a destructor's execution as potentially accessing any borrows held by the thing being destructed. The tests with `Scribble` model this, showing that the compiler now catches such unsoundness. However, that fix for issue rust-lang#31567 is too strong, in that NLL (MIR-borrowck) includes `Box` as one of the types with a destructor that potentially accesses any borrows held by the box. 
This thus was the cause of the main remaining discrepancy between AST-borrowck and MIR-borrowck, as documented in issue rust-lang#45696, specifically in [the last example of this comment](rust-lang#45696 (comment)), which I have adapted into the `fn foo` shown above. We did close issue rust-lang#45696 back in December of 2017, but AFAICT that example was not fixed by PR rust-lang#46268. (And we did not include a test, etc etc.) This PR fixes that case, by trying to model the so-called `DerefPure` semantics of `Box<T>` when we traverse the type of the input to `visit_terminator_drop`. 3. The third main thing is that during a review of the first draft of this PR, @matthewjasper pointed out that the new traversal of `Box<T>` could cause the compiler to loop infinitely. I have adjusted the PR to avoid this (by tracking what types we have previously seen), and added a much needed test of this somewhat odd scenario. (It's an odd scenario because the particular case only arises for things like `struct A(Box<A>);`, something which cannot be constructed in practice.) Fix rust-lang#45696.

fix typos

sync fork with upstream (master)

Explain safety for `vec.set_len(0)`

Use arenas to avoid Lrc in queries #1 Based on rust-lang#59536.

update from upstream

`stride == 1` case can be computed more efficiently through `-p (mod a)`. That, then translates to a nice and short sequence of LLVM instructions: %address = ptrtoint i8* %p to i64 %negptr = sub i64 0, %address %offset = and i64 %negptr, %a_minus_one And produces pretty much ideal code-gen when this function is used in isolation. Typical use of this function will, however, involve use of the result to offset a pointer, i.e. %aligned = getelementptr inbounds i8, i8* %p, i64 %offset This still looks very good, but LLVM does not really translate that to what would be considered ideal machine code (on any target). For example that's the codegen we obtain for an unknown alignment: ; x86_64 dec rsi mov rax, rdi neg rax and rax, rsi add rax, rdi In particular negating a pointer is not something that’s going to be optimised for in the design of CISC architectures like x86_64. They are much better at offsetting pointers. And so we’d love to utilize this ability and produce code that's more like this: ; x86_64 lea rax, [rsi + rdi - 1] neg rsi and rax, rsi To achieve this we need to give LLVM an opportunity to apply its various peep-hole optimisations that it does during DAG selection. In particular, the `and` instruction appears to be a major inhibitor here. We cannot, sadly, get rid of this load-bearing operation, but we can reorder operations such that LLVM has more to work with around this instruction. One such ordering is proposed in rust-lang#75579 and results in LLVM IR that looks broadly like this: ; using add enables `lea` and similar CISCisms %offset_ptr = add i64 %address, %a_minus_one %mask = sub i64 0, %a %masked = and i64 %offset_ptr, %mask ; can be folded with `gepi` that may follow %offset = sub i64 %masked, %address …and generates the intended x86_64 machine code. One might also wonder how the increased amount of code would impact a RISC target. 
Turns out not much: ; aarch64 previous ; aarch64 new sub x8, x1, #1 add x8, x1, x0 neg x9, x0 sub x8, x8, #1 and x8, x9, x8 neg x9, x1 add x0, x0, x8 and x0, x8, x9 (and similarly for ppc, sparc, mips, riscv, etc) The only target that seems to do worse is… wasm32. Onto actual measurements – the best way to evaluate snippets like these is to use llvm-mca. Much like Aarch64 assembly would allow to suspect, there isn’t any performance difference to be found. Both snippets execute in same number of cycles for the CPUs I tried. On x86_64, we get throughput improvement of >50%, however!
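To make the two orderings concrete, here is a minimal sketch on plain `usize` addresses. The function names `offset_neg` and `offset_add` are made up for illustration and this is not the libcore implementation. It only shows that both formulas give the same byte offset when the alignment is a power of two:

```rust
/// First ordering: `-p (mod a)`, i.e. negate the address and mask
/// with `a - 1`. `align` is assumed to be a power of two.
fn offset_neg(addr: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    addr.wrapping_neg() & (align - 1)
}

/// Reordered variant: round the address up with an add and a mask,
/// then subtract the original address to recover the offset.
fn offset_add(addr: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    let a_minus_one = align - 1;
    let mask = align.wrapping_neg(); // same as `0 - a`
    (addr.wrapping_add(a_minus_one) & mask).wrapping_sub(addr)
}
```

For example, with an address of 13 and an alignment of 8, both functions return 3, since 13 + 3 = 16 is the next multiple of 8. The wrapping operations mirror the two's-complement arithmetic of the LLVM IR quoted above.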

HWAddressSanitizer support

# Motivation

Compared to regular ASan, HWASan has a [smaller overhead](https://source.android.com/devices/tech/debug/hwasan). The difference in practice is that HWASan'ed code is more usable, e.g. Android device compiled with HWASan can be used as a daily driver.

# Example

```
fn main() {
    let xs = vec![0, 1, 2, 3];
    let _y = unsafe { *xs.as_ptr().offset(4) };
}
```

```
==223==ERROR: HWAddressSanitizer: tag-mismatch on address 0xefdeffff0050 at pc 0xaaaad00b3468
READ of size 4 at 0xefdeffff0050 tags: e5/00 (ptr/mem) in thread T0
    #0 0xaaaad00b3464  (/root/main+0x53464)
    #1 0xaaaad00b39b4  (/root/main+0x539b4)
    #2 0xaaaad00b3dd0  (/root/main+0x53dd0)
    #3 0xaaaad00b61dc  (/root/main+0x561dc)
    #4 0xaaaad00c0574  (/root/main+0x60574)
    #5 0xaaaad00b6290  (/root/main+0x56290)
    #6 0xaaaad00b6170  (/root/main+0x56170)
    #7 0xaaaad00b3578  (/root/main+0x53578)
    #8 0xffff81345e70  (/lib64/libc.so.6+0x20e70)
    #9 0xaaaad0096310  (/root/main+0x36310)

[0xefdeffff0040,0xefdeffff0060) is a small allocated heap chunk; size: 32 offset: 16
0xefdeffff0050 is located 0 bytes to the right of 16-byte region [0xefdeffff0040,0xefdeffff0050)
allocated here:
    #0 0xaaaad009bcdc  (/root/main+0x3bcdc)
    #1 0xaaaad00b1eb0  (/root/main+0x51eb0)
    #2 0xaaaad00b20d4  (/root/main+0x520d4)
    #3 0xaaaad00b2800  (/root/main+0x52800)
    #4 0xaaaad00b1cf4  (/root/main+0x51cf4)
    #5 0xaaaad00b33d4  (/root/main+0x533d4)
    #6 0xaaaad00b39b4  (/root/main+0x539b4)
    #7 0xaaaad00b61dc  (/root/main+0x561dc)
    #8 0xaaaad00b3578  (/root/main+0x53578)
    #9 0xaaaad0096310  (/root/main+0x36310)

Thread: T0 0xeffe00002000 stack: [0xffffc0590000,0xffffc0d90000) sz: 8388608 tls: [0xffff81521020,0xffff815217d0)
Memory tags around the buggy address (one tag corresponds to 16 bytes):
  0xfefcefffef80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefcefffef90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefcefffefa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefcefffefb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefcefffefc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefcefffefd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefcefffefe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefcefffeff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0xfefceffff000: a2 a2 05 00 e5 [00] 00 00 00 00 00 00 00 00 00 00
  0xfefceffff010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefceffff020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefceffff030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefceffff040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefceffff050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefceffff060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefceffff070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0xfefceffff080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Tags for short granules around the buggy address (one tag corresponds to 16 bytes):
  0xfefcefffeff0: .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
=>0xfefceffff000: .. .. c5 .. .. [..] .. .. .. .. .. .. .. .. .. ..
  0xfefceffff010: .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..
See https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#short-granules for a description of short granule tags
Registers where the failure occurred (pc 0xaaaad00b3468):
    x0  e500efdeffff0050  x1  0000000000000004  x2  0000ffffc0d8f5a0  x3  0200efff00000000
    x4  0000ffffc0d8f4c0  x5  000000000000004f  x6  00000ffffc0d8f36  x7  0000efff00000000
    x8  e500efdeffff0050  x9  0200efff00000000  x10 0000000000000000  x11 0200efff00000000
    x12 0200effe000006b0  x13 0200effe000006b0  x14 0000000000000008  x15 00000000c00000cf
    x16 0000aaaad00a0afc  x17 0000000000000003  x18 0000000000000001  x19 0000ffffc0d8f718
    x20 ba00ffffc0d8f7a0  x21 0000aaaad00962e0  x22 0000000000000000  x23 0000000000000000
    x24 0000000000000000  x25 0000000000000000  x26 0000000000000000  x27 0000000000000000
    x28 0000000000000000  x29 0000ffffc0d8f650  x30 0000aaaad00b3468
```

# Comments/Caveats

* HWASan is only supported on arm64.
* I'm not sure if I should add a feature gate or piggyback on the existing one for sanitizers.
* HWASan requires `-C target-feature=+tagged-globals`. That flag should probably be set transparently to the user. Not sure how to go about that.

# TODO

* Need more tests.
* Update documentation.
* Fix symbolization.
* Integrate with CI
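To make the `tags: e5/00 (ptr/mem)` line in the report easier to read, here is a minimal sketch of the check it describes. It assumes the usual AArch64 layout where the pointer tag sits in the top byte of the address, uses the 16-byte granule size stated in the report, and ignores short granules. The function names are hypothetical and are not part of the HWASan runtime.

```rust
/// Bit position of the tag byte in a 64-bit pointer (top byte; an assumption
/// matching the `e500efdeffff0050` value of register x0 in the report).
const TAG_SHIFT: u32 = 56;
/// log2 of the granule size: one memory tag covers 16 bytes.
const GRANULE_SHIFT: u32 = 4;

/// Extracts the tag stored in the top byte of a tagged pointer.
fn pointer_tag(addr: u64) -> u8 {
    (addr >> TAG_SHIFT) as u8
}

/// Clears the tag byte, leaving the plain address.
fn untagged(addr: u64) -> u64 {
    addr & ((1u64 << TAG_SHIFT) - 1)
}

/// Index of the 16-byte granule that the address falls into.
fn granule_index(addr: u64) -> u64 {
    untagged(addr) >> GRANULE_SHIFT
}

/// An access is allowed only when the pointer tag equals the memory tag
/// (short granules are deliberately not modelled here).
fn tags_match(ptr_tag: u8, mem_tag: u8) -> bool {
    ptr_tag == mem_tag
}
```

With the values from the report, the pointer in `x0` carries tag `e5`, while the granule one past the end of the 16-byte allocation carries tag `00`, so the read is flagged as a tag mismatch.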

bypass auto_da_alloc for metadata files

This saves about 0.7% when rerunning the UI test suite. I.e. when the metadata files exist and will be overwritten. No improvements expected for a clean build. So it might show up in incr-patched perf results.

```
regular rename:

Benchmark #1: touch src/tools/compiletest/src/main.rs ; RUSTC_WRAPPER="" schedtool -B -e ./x.py test src/test/ui
  Time (mean ± σ):     47.305 s ±  0.170 s    [User: 1631.540 s, System: 412.648 s]
  Range (min … max):   47.125 s … 47.856 s    20 runs

non-durable rename:

Benchmark #1: touch src/tools/compiletest/src/main.rs ; RUSTC_WRAPPER="" schedtool -B -e ./x.py test src/test/ui
  Time (mean ± σ):     46.930 s ±  0.064 s    [User: 1634.344 s, System: 396.038 s]
  Range (min … max):   46.759 s … 47.043 s    20 runs
```

There are more places that trigger auto_da_alloc behavior by overwriting existing files with O_TRUNC, but those are much harder to locate because `O_TRUNC` is set on `open()` but the writeback is triggered on `close()`. The latter is the part which shows up in profiles.
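As an illustration of what a "non-durable rename" could look like, here is a minimal sketch. It assumes that the ext4 `auto_da_alloc` heuristic is triggered when a rename replaces an existing file, so that unlinking the destination first avoids the forced writeback. The function name and the file names in the usage note are hypothetical; the commit's actual implementation is not shown in this message.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Moves `src` to `dst` without asking the filesystem for durability.
///
/// Assumption: removing `dst` first means the rename no longer replaces an
/// existing file, so the replace-via-rename heuristic does not apply.
fn non_durable_rename(src: &Path, dst: &Path) -> io::Result<()> {
    // Ignore the error: `dst` may simply not exist yet (clean build).
    let _ = fs::remove_file(dst);
    fs::rename(src, dst)
}
```

A caller would write the new metadata to a temporary file and then call `non_durable_rename(&tmp, &out)`. The trade-off is that the replacement is no longer atomic: for a short window `dst` does not exist, and after a crash the file may be missing or incomplete, which is acceptable only for outputs that can be regenerated.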

Otherwise, we can get into a situation where you have a subtype obligation `#1 <: #2` pending, `#1` is constrained by `check_casts`, but `#2` is unaffected.

Co-authored-by: Niko Matsakis <niko@alum.mit.edu>

gereeter added 6 commits May 16, 2016 05:46

Remove use of specialization in Lattice to avoid use of unstable features

d4b132a

Remove DataflowPass

8fb7a1c

Add &self arguments to Transfer and Rewrite and start taking them by value instead of by type

9a46cd7

Remove unused and buggy support for dataflow passes that introduce multiple statements

ca08fdf

Let ar_forward generate its own queue

c36dc32

Change the fully capitalized ACS to the partially capitalized Acs, matching the rest of rustc

ca52586

nagisa reviewed May 16, 2016

nagisa mentioned this pull request May 16, 2016

[WIP][MIR] Generic lattice-based DF framework rust-lang/rust#33628

Closed

gereeter added 8 commits May 22, 2016 10:29

Fix infinite loop in SimplifyCfg and re-enable it

a1cf39a

Fix various nits in MIR Dataflow

e2d48c5

Actually rewrite constants in AcsPropagate

50697a5

Remove some unnecessary pubs in AcsPropagate

a40d9c1

Invalidate values in AcsLattice that are overwritten, either by a write to a projected lvalue, or through the destination of a call. Projected lvalues are now entirely not tracked. With --enable-orbit, everything bootstraps except for denied warnings.

52d150f

Temporarily completely disable Backward dataflow to fix unused variant warning

92628f7

Rewrite AcsLattice to be more correct

ab1d316

Further correctness improvements to AcsRewrite

d5c74f8

nagisa reviewed Jun 6, 2016

nagisa pushed a commit that referenced this pull request Feb 28, 2017

Fix tests on Windows (#1)

227eab0

nagisa pushed a commit that referenced this pull request Apr 18, 2018

Merge pull request #1 from rust-lang/master

5e94d54

Merge upstream changes

nagisa pushed a commit that referenced this pull request Sep 28, 2018

Merge pull request #1 from ubsan/ptr-docs

c44e88c

fix typos

nagisa pushed a commit that referenced this pull request Nov 3, 2018

Merge pull request #1 from rust-lang/master

fa76d42

sync fork with upstream (master)

nagisa pushed a commit that referenced this pull request Jan 18, 2019

Merge pull request #1 from Centril/redo-vec-set_len-docs-adjust

986e49d

Explain safety for `vec.set_len(0)`

nagisa pushed a commit that referenced this pull request May 2, 2019

Auto merge of rust-lang#59540 - Zoxc:the-arena-2, r=michaelwoerister

cfeb917

Use arenas to avoid Lrc in queries #1 Based on rust-lang#59536.

nagisa pushed a commit that referenced this pull request Jun 26, 2020

Merge pull request #1 from rust-lang/master

a6127e3

update from upstream
