Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various fixes and improvements to the MIR Dataflow framework #1

Open
wants to merge 14 commits into
base: mir-dflow
Choose a base branch
from

Conversation

gereeter
Copy link

Since you mentioned that you were too busy to work much on your PR, I thought I would do some work myself.


/// The transfer function which given a terminator and a fact produces a fact for each
/// successor of the terminator.
///
/// Corectness precondtition:
/// * The list of facts produced should only contain the facts for blocks which are successors
/// of the terminator being transfered.
fn term(&mir::Terminator<'tcx>, L) -> Self::TerminatorOut;
fn term(&self, &mir::Terminator<'tcx>, Self::Lattice) -> Vec<Self::Lattice>;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change makes the trait not very reusable for backward analysis, where terminators only have a single edge regardless of how many edges the terminator has in the forward direction. That is the primary motivation for TerminatorOut associated type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but I'm not sure that the same trait should be reused for the forward and backwards cases. Regardless, even if the two directions share a trait, I'd prefer a clearer way of denoting which direction a pass goes than using Vec<Lattice> for forward and Lattice for backwards. Transfer could have another argument (Direction) which itself has an associated type denoting what output term should have.

I got rid of the associated TerminatorOut because it felt clumsy, it cause recursive trait bounds when I moved in the associated lattice type, and because it wasn't necessary yet. When backwards transformation gets implemented, I'd be happy for this to be generalized.

@nagisa
Copy link
Owner

nagisa commented May 16, 2016

I will review and see what I can merge from this thing in more detail (likely, much) later.

Thanks for the PR!

@gereeter
Copy link
Author

gereeter commented Jun 6, 2016

I fixed a bunch of correctness bugs, particularly in the alias side of things. However, I'm still not getting a successful bootstrap including AliasRewrite, and anyway I believe it could benefit from very different data structures, so I think it could use a clean rewrite.

Additionally, enough MIR changes have gone in that I'm planning to rebase my branch on master. Would it be easier if I just made my own rust-lang/rust PR, especially if I got it to bootstrap?

@nagisa
Copy link
Owner

nagisa commented Jun 6, 2016

Sure, but I still have a bunch of things I want to do with it, which is why the original PR was [WIP].

fn bottom() -> Self { unimplemented!() }
#[derive(Debug, Clone)]
enum AcsLattice<'tcx> {
Bottom,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bottom element in this lattice is an empty HashMap. It is fine to have it this way, because HashMap::new() does not allocate, to my knowledge.

Similarly, I’m confused as to why the Top was removed from Either. I do not see how you could merge two different constants together and produce anything other than a Top.

/me shrugs

Copy link
Author

@gereeter gereeter Jun 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the lattice so that HashMap::new() is actually the top element of the lattice - I treat empty elements as top instead of bottom. You'll notice that join now does intersection instead of union. This is primarily because when we encounter an unknown function call, we need to mark everything as Top, because the function call could change anything. This situation can be improved with alias analysis and information about what the function might do, but the default needs to be Top. Similarly, when we enter the function, everything needs to be Top. Consider the following case:

fn foo(mut x: u32) {
    if /* random unoptimizable condition */ {
        x = 5;
    }
    println!("{}", x);
}

Previously, since the value stored for x was bottom, we would merge the fact bottom and the fact x = 5 at the end of the if statement, concluding that x was always 5. We would optimize to:

fn foo(mut x: u32) {
    if /* random unoptimizable condition */ {
        x = 5;
    }
    println!("{}", 5);
}

which is just wrong.

Note that, in terms of inspiration, while the original paper used a union-based lattice for constant propogation, GHC actually uses an intersection based lattice (see here).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I realized after I wrote that code that WBottom existed and that it would have been better to use that. However, I'm still planning to remove the requirement that lattices have bottom, again inspired by https://ghc.haskell.org/trac/ghc/wiki/Hoopl/Cleanup, and so I didn't bother to switch to a cleaner version.

@gereeter
Copy link
Author

gereeter commented Jun 6, 2016

Yeah, I figured so - it just seemed like a simpler way of keeping track of changes.

nagisa pushed a commit that referenced this pull request Aug 9, 2016
rustc_trans: don't Assert(Overflow(Neg)) when overflow checks are off.

Generic functions using `Neg` on primitive types would panic even in release mode, with MIR trans.
The solution is a bit hacky, as I'm checking the message, since there's no dedicated `CheckedUnOp`.

Blocks Servo rustup ([failure #1](http://build.servo.org/builders/linux-rel/builds/2477/steps/test_3/logs/stdio), [failure rust-lang#2](http://build.servo.org/builders/mac-rel-css/builds/2364/steps/test/logs/stdio)) - this should be the last hurdle, it affects only one test.
nagisa pushed a commit that referenced this pull request Jan 26, 2017
For a given file

```rust
trait A { fn foo(&self) {} }
trait B : A { fn foo(&self) {} }

fn bar<T: B>(a: &T) {
  a.foo()
}
```

provide the following output

```
error[E0034]: multiple applicable items in scope
 --> file.rs:6:5
  |
6 |   a.foo(1)
  |     ^^^ multiple `foo` found
  |
note: candidate #1 is defined in the trait `A`
 --> file.rs:2:11
  |
2 | trait A { fn foo(&self, a: usize) {} }
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to use it here write `A::foo(&a, 1)` instead
 --> file.rs:6:5
  |
6 |   a.foo(1)
  |     ^^^
note: candidate rust-lang#2 is defined in the trait `B`
 --> file.rs:3:15
  |
3 | trait B : A { fn foo(&self, a: usize) {} }
  |               ^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to use it here write `B::foo(&a, 1)` instead
 --> file.rs:6:5
  |
6 |   a.foo(1)
  |     ^^^
```
nagisa pushed a commit that referenced this pull request Jan 26, 2017
E0034: provide disambiguated syntax for candidates

For a given file

```rust
trait A { fn foo(&self) {} }
trait B : A { fn foo(&self) {} }

fn bar<T: B>(a: &T) {
  a.foo()
}
```

provide the following output

```
error[E0034]: multiple applicable items in scope
 --> file.rs:6:5
  |
6 |   a.foo(1)
  |     ^^^ multiple `foo` found
  |
note: candidate #1 is defined in the trait `A`
 --> file.rs:2:11
  |
2 | trait A { fn foo(&self, a: usize) {} }
  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to use it here write `A::foo(&a, 1)` instead
 --> file.rs:6:5
  |
6 |   a.foo(1)
  |     ^^^
note: candidate rust-lang#2 is defined in the trait `B`
 --> file.rs:3:15
  |
3 | trait B : A { fn foo(&self, a: usize) {} }
  |               ^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to use it here write `B::foo(&a, 1)` instead
 --> file.rs:6:5
  |
6 |   a.foo(1)
  |     ^^^
```

Fix rust-lang#37767.
nagisa pushed a commit that referenced this pull request Feb 10, 2017
LeakSanitizer, ThreadSanitizer, AddressSanitizer and MemorySanitizer support

```
$ cargo new --bin leak && cd $_

$ edit Cargo.toml && tail -n3 $_
```

``` toml
[profile.dev]
opt-level = 1
```

```
$ edit src/main.rs && cat $_
```

``` rust
use std::mem;

fn main() {
    let xs = vec![0, 1, 2, 3];
    mem::forget(xs);
}
```

```
$ RUSTFLAGS="-Z sanitizer=leak" cargo run --target x86_64-unknown-linux-gnu; echo $?
    Finished dev [optimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/leak`

=================================================================
==10848==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x557c3488db1f in __interceptor_malloc /shared/rust/checkouts/lsan/src/compiler-rt/lib/lsan/lsan_interceptors.cc:55
    #1 0x557c34888aaa in alloc::heap::exchange_malloc::h68f3f8b376a0da42 /shared/rust/checkouts/lsan/src/liballoc/heap.rs:138
    rust-lang#2 0x557c34888afc in leak::main::hc56ab767de6d653a $PWD/src/main.rs:4
    rust-lang#3 0x557c348c0806 in __rust_maybe_catch_panic ($PWD/target/debug/leak+0x3d806)

SUMMARY: LeakSanitizer: 16 byte(s) leaked in 1 allocation(s).
23
```

```
$ cargo new --bin racy && cd $_

$ edit src/main.rs && cat $_
```

``` rust
use std::thread;

static mut ANSWER: i32 = 0;

fn main() {
    let t1 = thread::spawn(|| unsafe { ANSWER = 42 });
    unsafe {
        ANSWER = 24;
    }
    t1.join().ok();
}
```

```
$ RUSTFLAGS="-Z sanitizer=thread" cargo run --target x86_64-unknown-linux-gnu; echo $?
==================
WARNING: ThreadSanitizer: data race (pid=12019)
  Write of size 4 at 0x562105989bb4 by thread T1:
    #0 racy::main::_$u7b$$u7b$closure$u7d$$u7d$::hbe13ea9e8ac73f7e $PWD/src/main.rs:6 (racy+0x000000010e3f)
    #1 _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h2e466a92accacc78 /shared/rust/checkouts/lsan/src/libstd/panic.rs:296 (racy+0x000000010cc5)
    rust-lang#2 std::panicking::try::do_call::h7f4d2b38069e4042 /shared/rust/checkouts/lsan/src/libstd/panicking.rs:460 (racy+0x00000000c8f2)
    rust-lang#3 __rust_maybe_catch_panic <null> (racy+0x0000000b4e56)
    rust-lang#4 std::panic::catch_unwind::h31ca45621ad66d5a /shared/rust/checkouts/lsan/src/libstd/panic.rs:361 (racy+0x00000000b517)
    rust-lang#5 std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::hccfc37175dea0b01 /shared/rust/checkouts/lsan/src/libstd/thread/mod.rs:357 (racy+0x00000000c226)
    rust-lang#6 _$LT$F$u20$as$u20$alloc..boxed..FnBox$LT$A$GT$$GT$::call_box::hd880bbf91561e033 /shared/rust/checkouts/lsan/src/liballoc/boxed.rs:605 (racy+0x00000000f27e)
    rust-lang#7 std::sys::imp::thread::Thread::new::thread_start::hebdfc4b3d17afc85 <null> (racy+0x0000000abd40)

  Previous write of size 4 at 0x562105989bb4 by main thread:
    #0 racy::main::h23e6e5ca46d085c3 $PWD/src/main.rs:8 (racy+0x000000010d7c)
    #1 __rust_maybe_catch_panic <null> (racy+0x0000000b4e56)
    rust-lang#2 __libc_start_main <null> (libc.so.6+0x000000020290)

  Location is global 'racy::ANSWER::h543d2b139f819b19' of size 4 at 0x562105989bb4 (racy+0x0000002f8bb4)

  Thread T1 (tid=12028, running) created by main thread at:
    #0 pthread_create /shared/rust/checkouts/lsan/src/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:902 (racy+0x00000001aedb)
    #1 std::sys::imp::thread::Thread::new::hce44187bf4a36222 <null> (racy+0x0000000ab9ae)
    rust-lang#2 std::thread::spawn::he382608373eb667e /shared/rust/checkouts/lsan/src/libstd/thread/mod.rs:412 (racy+0x00000000b5aa)
    rust-lang#3 racy::main::h23e6e5ca46d085c3 $PWD/src/main.rs:6 (racy+0x000000010d5c)
    rust-lang#4 __rust_maybe_catch_panic <null> (racy+0x0000000b4e56)
    rust-lang#5 __libc_start_main <null> (libc.so.6+0x000000020290)

SUMMARY: ThreadSanitizer: data race $PWD/src/main.rs:6 in racy::main::_$u7b$$u7b$closure$u7d$$u7d$::hbe13ea9e8ac73f7e
==================
ThreadSanitizer: reported 1 warnings
66
```

```
$ cargo new --bin oob && cd $_

$ edit src/main.rs && cat $_
```

``` rust
fn main() {
    let xs = [0, 1, 2, 3];
    let y = unsafe { *xs.as_ptr().offset(4) };
}
```

```
$ RUSTFLAGS="-Z sanitizer=address" cargo run --target x86_64-unknown-linux-gnu; echo $?
=================================================================
==13328==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff29f3ecd0 at pc 0x55802dc6bf7e bp 0x7fff29f3ec90 sp 0x7fff29f3ec88
READ of size 4 at 0x7fff29f3ecd0 thread T0
    #0 0x55802dc6bf7d in oob::main::h0adc7b67e5feb2e7 $PWD/src/main.rs:3
    #1 0x55802dd60426 in __rust_maybe_catch_panic ($PWD/target/debug/oob+0xfe426)
    rust-lang#2 0x55802dd58dd9 in std::rt::lang_start::hb2951fc8a59d62a7 ($PWD/target/debug/oob+0xf6dd9)
    rust-lang#3 0x55802dc6c002 in main ($PWD/target/debug/oob+0xa002)
    rust-lang#4 0x7fad8c3b3290 in __libc_start_main (/usr/lib/libc.so.6+0x20290)
    rust-lang#5 0x55802dc6b719 in _start ($PWD/target/debug/oob+0x9719)

Address 0x7fff29f3ecd0 is located in stack of thread T0 at offset 48 in frame
    #0 0x55802dc6bd5f in oob::main::h0adc7b67e5feb2e7 $PWD/src/main.rs:1

  This frame has 1 object(s):
    [32, 48) 'xs' <== Memory access at offset 48 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow $PWD/src/main.rs:3 in oob::main::h0adc7b67e5feb2e7
Shadow bytes around the buggy address:
  0x1000653dfd40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000653dfd50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000653dfd60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000653dfd70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000653dfd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x1000653dfd90: 00 00 00 00 f1 f1 f1 f1 00 00[f3]f3 00 00 00 00
  0x1000653dfda0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000653dfdb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000653dfdc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000653dfdd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000653dfde0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==13328==ABORTING
1
```

```
$ cargo new --bin uninit && cd $_

$ edit src/main.rs && cat $_
```

``` rust
use std::mem;

fn main() {
    let xs: [u8; 4] = unsafe { mem::uninitialized() };
    let y = xs[0] + xs[1];
}
```

```
$ RUSTFLAGS="-Z sanitizer=memory" cargo run; echo $?
==30198==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x563f4b6867da in uninit::main::hc2731cd4f2ed48f8 $PWD/src/main.rs:5
    #1 0x563f4b7033b6 in __rust_maybe_catch_panic ($PWD/target/debug/uninit+0x873b6)
    rust-lang#2 0x563f4b6fbd69 in std::rt::lang_start::hb2951fc8a59d62a7 ($PWD/target/debug/uninit+0x7fd69)
    rust-lang#3 0x563f4b6868a9 in main ($PWD/target/debug/uninit+0xa8a9)
    rust-lang#4 0x7fe844354290 in __libc_start_main (/usr/lib/libc.so.6+0x20290)
    rust-lang#5 0x563f4b6864f9 in _start ($PWD/target/debug/uninit+0xa4f9)

SUMMARY: MemorySanitizer: use-of-uninitialized-value $PWD/src/main.rs:5 in uninit::main::hc2731cd4f2ed48f8
Exiting
77
```
nagisa pushed a commit that referenced this pull request Feb 28, 2017
nagisa pushed a commit that referenced this pull request Mar 8, 2017
Group "missing variable bind" spans in `or` matches and clarify wording
for the two possible cases: when a variable from the first pattern is
not in any of the subsequent patterns, and when a variable in any of the
other patterns is not in the first one.

Before:

```
error[E0408]: variable `a` from pattern #1 is not bound in pattern rust-lang#2
  --> file.rs:10:23
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                       ^^^^^^^^^^^ pattern doesn't bind `a`

error[E0408]: variable `b` from pattern rust-lang#2 is not bound in pattern #1
  --> file.rs:10:32
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                ^ pattern doesn't bind `b`

error[E0408]: variable `a` from pattern #1 is not bound in pattern rust-lang#3
  --> file.rs:10:37
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                     ^^^^^^^^ pattern doesn't bind `a`

error[E0408]: variable `d` from pattern #1 is not bound in pattern rust-lang#3
  --> file.rs:10:37
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                     ^^^^^^^^ pattern doesn't bind `d`

error[E0408]: variable `c` from pattern rust-lang#3 is not bound in pattern #1
  --> file.rs:10:43
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                           ^ pattern doesn't bind `c`

error[E0408]: variable `d` from pattern #1 is not bound in pattern rust-lang#4
  --> file.rs:10:48
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                                ^^^^^^^^ pattern doesn't bind `d`

error: aborting due to 6 previous errors
```

After:

```
error[E0408]: variable `a` is not bound in all patterns
  --> file.rs:20:37
   |
20 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => {
intln!("{:?}", a); }
   |               -       ^^^^^^^^^^^   ^^^^^^^^         - variable
t in all patterns
   |               |       |             |
   |               |       |             pattern doesn't bind `a`
   |               |       pattern doesn't bind `a`
   |               variable not in all patterns

error[E0408]: variable `d` is not bound in all patterns
  --> file.rs:20:37
   |
20 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => {
intln!("{:?}", a); }
   |                  -          -       ^^^^^^^^   ^^^^^^^^ pattern
esn't bind `d`
   |                  |          |       |
   |                  |          |       pattern doesn't bind `d`
   |                  |          variable not in all patterns
   |                  variable not in all patterns

error[E0408]: variable `b` is not bound in all patterns
  --> file.rs:20:37
   |
20 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => {
intln!("{:?}", a); }
   |         ^^^^^^^^^^^            -    ^^^^^^^^   ^^^^^^^^ pattern
esn't bind `b`
   |         |                      |    |
   |         |                      |    pattern doesn't bind `b`
   |         |                      variable not in all patterns
   |         pattern doesn't bind `b`

error[E0408]: variable `c` is not bound in all patterns
  --> file.rs:20:48
   |
20 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => {
intln!("{:?}", a); }
   |         ^^^^^^^^^^^   ^^^^^^^^^^^         -    ^^^^^^^^ pattern
esn't bind `c`
   |         |             |                   |
   |         |             |                   variable not in all
tterns
   |         |             pattern doesn't bind `c`
   |         pattern doesn't bind `c`

error: aborting due to 4 previous errors
```

* Have only one presentation for binding consistency errors
* Point to same binding in multiple patterns when possible
* Check inconsistent bindings in all arms
* Simplify wording of diagnostic message
* Sort emition and spans of binding errors for deterministic output
nagisa pushed a commit that referenced this pull request Mar 8, 2017
Clean up "pattern doesn't bind x" messages

Group "missing variable bind" spans in `or` matches and clarify wording
for the two possible cases: when a variable from the first pattern is
not in any of the subsequent patterns, and when a variable in any of the
other patterns is not in the first one.

Before:

```rust
error[E0408]: variable `a` from pattern #1 is not bound in pattern rust-lang#2
  --> file.rs:10:23
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                       ^^^^^^^^^^^ pattern doesn't bind `a`

error[E0408]: variable `b` from pattern rust-lang#2 is not bound in pattern #1
  --> file.rs:10:32
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                ^ pattern doesn't bind `b`

error[E0408]: variable `a` from pattern #1 is not bound in pattern rust-lang#3
  --> file.rs:10:37
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                     ^^^^^^^^ pattern doesn't bind `a`

error[E0408]: variable `d` from pattern #1 is not bound in pattern rust-lang#3
  --> file.rs:10:37
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                     ^^^^^^^^ pattern doesn't bind `d`

error[E0408]: variable `c` from pattern rust-lang#3 is not bound in pattern #1
  --> file.rs:10:43
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                           ^ pattern doesn't bind `c`

error[E0408]: variable `d` from pattern #1 is not bound in pattern rust-lang#4
  --> file.rs:10:48
   |
10 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                                                ^^^^^^^^ pattern doesn't bind `d`

error: aborting due to 6 previous errors
```

After:

```rust
error[E0408]: variable `d` is not bound in all patterns
  --> $DIR/issue-39698.rs:20:37
   |
20 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |                  -          -       ^^^^^^^^   ^^^^^^^^ pattern doesn't bind `d`
   |                  |          |       |
   |                  |          |       pattern doesn't bind `d`
   |                  |          variable not in all patterns
   |                  variable not in all patterns

error[E0408]: variable `c` is not bound in all patterns
  --> $DIR/issue-39698.rs:20:48
   |
20 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |         ^^^^^^^^^^^   ^^^^^^^^^^^         -    ^^^^^^^^ pattern doesn't bind `c`
   |         |             |                   |
   |         |             |                   variable not in all patterns
   |         |             pattern doesn't bind `c`
   |         pattern doesn't bind `c`

error[E0408]: variable `a` is not bound in all patterns
  --> $DIR/issue-39698.rs:20:37
   |
20 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |               -       ^^^^^^^^^^^   ^^^^^^^^         - variable not in all patterns
   |               |       |             |
   |               |       |             pattern doesn't bind `a`
   |               |       pattern doesn't bind `a`
   |               variable not in all patterns

error[E0408]: variable `b` is not bound in all patterns
  --> $DIR/issue-39698.rs:20:37
   |
20 |         T::T1(a, d) | T::T2(d, b) | T::T3(c) | T::T4(a) => { println!("{:?}", a); }
   |         ^^^^^^^^^^^            -    ^^^^^^^^   ^^^^^^^^ pattern doesn't bind `b`
   |         |                      |    |
   |         |                      |    pattern doesn't bind `b`
   |         |                      variable not in all patterns
   |         pattern doesn't bind `b`

error: aborting due to 4 previous errors
```

Fixes rust-lang#39698.
nagisa pushed a commit that referenced this pull request Jun 3, 2017
Without that flag, LLVM generates unaligned memory access instructions, which are not allowed on ARMv5.

For example, the 'hello world' example from `cargo --new` failed with:
```
$ ./hello
Hello, world!
thread 'main' panicked at 'assertion failed: end <= len', src/libcollections/vec.rs:1113
note: Run with `RUST_BACKTRACE=1` for a backtrace.
```

I traced this error back to the following assembler code in `BufWriter::flush_buf`:
```
    6f44:       e28d0018        add     r0, sp, rust-lang#24
[...]
    6f54:       e280b005        add     fp, r0, rust-lang#5
[...]
    7018:       e5cd001c        strb    r0, [sp, rust-lang#28]
    701c:       e1a0082a        lsr     r0, sl, rust-lang#16
    7020:       03a01001        moveq   r1, #1
    7024:       e5cb0002        strb    r0, [fp, rust-lang#2]
    7028:       e1cba0b0        strh    sl, [fp]
```

Note that `fp` points to `sp + 29`, so the three `str*`-instructions should fill up a 32bit - value at `sp + 28`, which is later used as the value `n` in `Ok(n) => written += n`. This doesn't work on ARMv5 as the `strh` can't write to the unaligned contents of `fp`, so the upper bits of `n` won't get cleared, leading to the assertion failure in Vec::drain.

With `+strict-align`, the code works as expected.
nagisa pushed a commit that referenced this pull request Jun 3, 2017
ARMv5 needs +strict-align

Without that flag, LLVM generates unaligned memory access instructions, which are not allowed on ARMv5.

For example, the 'hello world' example from `cargo --new` failed with:
```
$ ./hello
Hello, world!
thread 'main' panicked at 'assertion failed: end <= len', src/libcollections/vec.rs:1113
note: Run with `RUST_BACKTRACE=1` for a backtrace.
```

I traced this error back to the following assembler code in `BufWriter::flush_buf`:
```
    6f44:       e28d0018        add     r0, sp, rust-lang#24
[...]
    6f54:       e280b005        add     fp, r0, rust-lang#5
[...]
    7018:       e5cd001c        strb    r0, [sp, rust-lang#28]
    701c:       e1a0082a        lsr     r0, sl, rust-lang#16
    7020:       03a01001        moveq   r1, #1
    7024:       e5cb0002        strb    r0, [fp, rust-lang#2]
    7028:       e1cba0b0        strh    sl, [fp]
```

Note that `fp` points to `sp + 29`, so the three `str*`-instructions should fill up a 32bit - value at `sp + 28`, which is later used as the value `n` in `Ok(n) => written += n`. This doesn't work on ARMv5 as the `strh` can't write to the unaligned contents of `fp`, so the upper bits of `n` won't get cleared, leading to the assertion failure in Vec::drain.

With `+strict-align`, the code works as expected.
nagisa pushed a commit that referenced this pull request Apr 18, 2018
nagisa pushed a commit that referenced this pull request Apr 18, 2018
Building for x86_64-unknown-linux-musl currently results in an executable lacking debug information for musl libc itself. If you request a backtrace in GDB while control flow is within musl – including sycalls made by musl – the result looks like:

#0  0x0000000000434b46 in __cp_end ()
#1  0x0000000000432dbd in __syscall_cp_c ()
rust-lang#2  0x0000000000000000 in ?? ()

i.e. not very helpful. Adding --enable-debug resolves this, and --enable-optimize re-enables optimisations which default to off given the previous flag.
nagisa pushed a commit that referenced this pull request Apr 18, 2018
Add --enable-debug flag to musl CI build script

Building for x86_64-unknown-linux-musl currently results in an executable lacking debug information for musl libc itself. If you request a backtrace in GDB while control flow is within musl – including sycalls made by musl – the result looks like:

```
#0  0x0000000000434b46 in __cp_end ()
#1  0x0000000000432dbd in __syscall_cp_c ()
rust-lang#2  0x0000000000000000 in ?? ()
```

i.e. not very helpful. Adding --enable-debug resolves this, and --enable-optimize re-enables optimisations which default to off given the previous flag.
nagisa pushed a commit that referenced this pull request Jun 2, 2018
There is a hot path through `opt_normalize_projection_type`:
- `try_start` does a cache lookup (#1).
- The result is a `NormalizedTy`.
- There are no unresolved type vars, so we call `complete`.
- `complete` does *another* cache lookup (rust-lang#2), then calls
  `SnapshotMap::insert`.
- `insert` does *another* cache lookup (rust-lang#3), inserting the same value
  that's already in the cache.

This patch optimizes this hot path by introducing `complete_normalized`,
for use when the value is known in advance to be a `NormalizedTy`. It
always avoids lookup rust-lang#2. Furthermore, if the `NormalizedTy`'s
obligations are empty (the common case), we know that lookup rust-lang#3 would be
a no-op, so we avoid it, while inserting a Noop into the `SnapshotMap`'s
undo log.
nagisa pushed a commit that referenced this pull request Jun 2, 2018
When encountering an unexisting method for a given trait where an
associated function has the same name, suggest using the appropriate
syntax, instead of using `help` text.

When only one candidate is found, do not call it "candidate #1", just
call it "the candidate".
nagisa pushed a commit that referenced this pull request Jun 2, 2018
Tweak output on E0599 for assoc fn used as method

 - Use suggestion instead of `help` when possible
 - Add primary span label
 - Remove incorrect `help` suggestion using incorrect syntax
 - Do not refer to only one possible candidate as `candidate #1`, refer to it as `the candidate`
nagisa pushed a commit that referenced this pull request Jul 2, 2018
Suggestion for 'static impl Trait return

When encountering a named or anonymous sup requirement (for example,
`&'a self`) and a `'static` impl Trait return type, suggest adding the
`'_` lifetime constraing to the return type.

Fix rust-lang#43719, rust-lang#51282.

```
error: cannot infer an appropriate lifetime
  --> $DIR/static-return-lifetime-infered.rs:17:16
   |
LL |     fn iter_values_anon(&self) -> impl Iterator<Item=u32> {
   |                                   ----------------------- this return type evaluates to the `'static` lifetime...
LL |         self.x.iter().map(|a| a.0)
   |         ------ ^^^^
   |         |
   |         ...but this borrow...
   |
note: ...can't outlive the anonymous lifetime #1 defined on the method body at 16:5
  --> $DIR/static-return-lifetime-infered.rs:16:5
   |
LL | /     fn iter_values_anon(&self) -> impl Iterator<Item=u32> {
LL | |         self.x.iter().map(|a| a.0)
LL | |     }
   | |_____^
help: you can add a constraint to the return type to make it last less than `'static` and match the anonymous lifetime #1 defined on the method body at 16:5
   |
LL |     fn iter_values_anon(&self) -> impl Iterator<Item=u32> + '_ {
   |                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
nagisa pushed a commit that referenced this pull request Aug 12, 2018
…-box, r=eddyb

[NLL] Dangly paths for box

Special-case `Box` in `rustc_mir::borrow_check`.

Since we know dropping a box will not access any `&mut` or `&` references, it is safe to model its destructor as only touching the contents *owned* by the box.

----

There are three main things going on here:

1. The first main thing, this PR is fixing a bug in NLL where `rustc` previously would issue a diagnostic error in a case like this:
```rust
fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
```

such code was accepted by the AST-borrowck in the past, but NLL was rejecting it with the following message ([playground](https://play.rust-lang.org/?gist=13c5560f73bfb16d6dab3ceaad44c0f8&version=nightly&mode=release&edition=2015))
```
error[E0597]: `**x` does not live long enough
 --> src/main.rs:3:40
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  |                                        ^^^^^^^^ - `**x` dropped here while still borrowed
  |                                        |
  |                                        borrowed value does not live long enough
  |
note: borrowed value must be valid for the anonymous lifetime #1 defined on the function body at 3:1...
 --> src/main.rs:3:1
  |
3 | fn foo(x: Box<&mut i32>) -> &mut i32 { &mut **x }
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error
```

2. The second main thing: The reason such code was previously rejected was because NLL (MIR-borrowck) incorporates a fix for issue rust-lang#31567, where it models a destructor's execution as potentially accessing any borrows held by the thing being destructed. The tests with `Scribble` model this, showing that the compiler now catches such unsoundness.

However, that fix for issue rust-lang#31567 is too strong, in that NLL (MIR-borrowck) includes `Box` as one of the types with a destructor that potentially accesses any borrows held by the box. This thus was the cause of the main remaining discrepancy between AST-borrowck and MIR-borrowck, as documented in issue rust-lang#45696, specifically in [the last example of this comment](rust-lang#45696 (comment)), which I have adapted into the `fn foo` shown above.

We did close issue rust-lang#45696 back in December of 2017, but AFAICT that example was not fixed by PR rust-lang#46268. (And we did not include a test, etc etc.)

This PR fixes that case, by trying to model the so-called `DerefPure` semantics of `Box<T>` when we traverse the type of the input to `visit_terminator_drop`.

3. The third main thing is that during a review of the first draft of this PR, @matthewjasper pointed out that the new traversal of `Box<T>` could cause the compiler to infinite loop. I have adjusted the PR to avoid this (by tracking what types we have previously seen), and added a much needed test of this somewhat odd scenario. (Its an odd scenario because the particular case only arises for things like `struct A(Box<A>);`, something which cannot be constructed in practice.)

Fix rust-lang#45696.
nagisa pushed a commit that referenced this pull request Sep 28, 2018
nagisa pushed a commit that referenced this pull request Nov 3, 2018
sync fork with upstream (master)
nagisa pushed a commit that referenced this pull request Jan 18, 2019
nagisa pushed a commit that referenced this pull request May 2, 2019
nagisa pushed a commit that referenced this pull request Jun 26, 2020
nagisa added a commit that referenced this pull request Aug 20, 2020
`stride == 1` case can be computed more efficiently through `-p (mod
a)`. That, then translates to a nice and short sequence of LLVM
instructions:

    %address = ptrtoint i8* %p to i64
    %negptr = sub i64 0, %address
    %offset = and i64 %negptr, %a_minus_one

And produces pretty much ideal code-gen when this function is used in
isolation.

Typical use of this function will, however, involve use of
the result to offset a pointer, i.e.

    %aligned = getelementptr inbounds i8, i8* %p, i64 %offset

This still looks very good, but LLVM does not really translate that to
what would be considered ideal machine code (on any target). For example
that's the codegen we obtain for an unknown alignment:

    ; x86_64
    dec     rsi
    mov     rax, rdi
    neg     rax
    and     rax, rsi
    add     rax, rdi

In particular negating a pointer is not something that’s going to be
optimised for in the design of CISC architectures like x86_64. They
are much better at offsetting pointers. And so we’d love to utilize this
ability and produce code that's more like this:

    ; x86_64
    lea     rax, [rsi + rdi - 1]
    neg     rsi
    and     rax, rsi

To achieve this we need to give LLVM an opportunity to apply its
various peep-hole optimisations that it does during DAG selection. In
particular, the `and` instruction appears to be a major inhibitor here.
We cannot, sadly, get rid of this load-bearing operation, but we can
reorder operations such that LLVM has more to work with around this
instruction.

One such ordering is proposed in rust-lang#75579 and results in LLVM IR that
looks broadly like this:

    ; using add enables `lea` and similar CISCisms
    %offset_ptr = add i64 %address, %a_minus_one
    %mask = sub i64 0, %a
    %masked = and i64 %offset_ptr, %mask
    ; can be folded with `gepi` that may follow
    %offset = sub i64 %masked, %address

…and generates the intended x86_64 machine code. One might also wonder
how the increased amount of code would impact a RISC target. Turns out
not much:

    ; aarch64 previous                 ; aarch64 new
    sub     x8, x1, #1                 add     x8, x1, x0
    neg     x9, x0                     sub     x8, x8, #1
    and     x8, x9, x8                 neg     x9, x1
    add     x0, x0, x8                 and     x0, x8, x9

    (and similarly for ppc, sparc, mips, riscv, etc)

The only target that seems to do worse is… wasm32.

Onto actual measurements – the best way to evaluate snippets like these
is to use llvm-mca. Much like Aarch64 assembly would allow to suspect,
there isn’t any performance difference to be found. Both snippets
execute in same number of cycles for the CPUs I tried. On x86_64,
we get throughput improvement of >50%, however!
nagisa pushed a commit that referenced this pull request Feb 13, 2021
HWAddressSanitizer support

#  Motivation
Compared to regular ASan, HWASan has a [smaller overhead](https://source.android.com/devices/tech/debug/hwasan). The difference in practice is that HWASan'ed code is more usable, e.g. Android device compiled with HWASan can be used as a daily driver.

# Example
```
fn main() {
    let xs = vec![0, 1, 2, 3];
    let _y = unsafe { *xs.as_ptr().offset(4) };
}
```
```
==223==ERROR: HWAddressSanitizer: tag-mismatch on address 0xefdeffff0050 at pc 0xaaaad00b3468
READ of size 4 at 0xefdeffff0050 tags: e5/00 (ptr/mem) in thread T0
    #0 0xaaaad00b3464  (/root/main+0x53464)
    #1 0xaaaad00b39b4  (/root/main+0x539b4)
    rust-lang#2 0xaaaad00b3dd0  (/root/main+0x53dd0)
    rust-lang#3 0xaaaad00b61dc  (/root/main+0x561dc)
    rust-lang#4 0xaaaad00c0574  (/root/main+0x60574)
    rust-lang#5 0xaaaad00b6290  (/root/main+0x56290)
    rust-lang#6 0xaaaad00b6170  (/root/main+0x56170)
    rust-lang#7 0xaaaad00b3578  (/root/main+0x53578)
    rust-lang#8 0xffff81345e70  (/lib64/libc.so.6+0x20e70)
    rust-lang#9 0xaaaad0096310  (/root/main+0x36310)

[0xefdeffff0040,0xefdeffff0060) is a small allocated heap chunk; size: 32 offset: 16
0xefdeffff0050 is located 0 bytes to the right of 16-byte region [0xefdeffff0040,0xefdeffff0050)
allocated here:
    #0 0xaaaad009bcdc  (/root/main+0x3bcdc)
    #1 0xaaaad00b1eb0  (/root/main+0x51eb0)
    rust-lang#2 0xaaaad00b20d4  (/root/main+0x520d4)
    rust-lang#3 0xaaaad00b2800  (/root/main+0x52800)
    rust-lang#4 0xaaaad00b1cf4  (/root/main+0x51cf4)
    rust-lang#5 0xaaaad00b33d4  (/root/main+0x533d4)
    rust-lang#6 0xaaaad00b39b4  (/root/main+0x539b4)
    rust-lang#7 0xaaaad00b61dc  (/root/main+0x561dc)
    rust-lang#8 0xaaaad00b3578  (/root/main+0x53578)
    rust-lang#9 0xaaaad0096310  (/root/main+0x36310)

Thread: T0 0xeffe00002000 stack: [0xffffc0590000,0xffffc0d90000) sz: 8388608 tls: [0xffff81521020,0xffff815217d0)
Memory tags around the buggy address (one tag corresponds to 16 bytes):
  0xfefcefffef80: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffef90: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefa0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefb0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefc0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefd0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffefe0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefcefffeff0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
=>0xfefceffff000: a2  a2  05  00  e5 [00] 00  00  00  00  00  00  00  00  00  00
  0xfefceffff010: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff020: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff030: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff040: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff050: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff060: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff070: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
  0xfefceffff080: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00
Tags for short granules around the buggy address (one tag corresponds to 16 bytes):
  0xfefcefffeff0: ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..
=>0xfefceffff000: ..  ..  c5  ..  .. [..] ..  ..  ..  ..  ..  ..  ..  ..  ..  ..
  0xfefceffff010: ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..
See https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#short-granules for a description of short granule tags
Registers where the failure occurred (pc 0xaaaad00b3468):
    x0  e500efdeffff0050  x1  0000000000000004  x2  0000ffffc0d8f5a0  x3  0200efff00000000
    x4  0000ffffc0d8f4c0  x5  000000000000004f  x6  00000ffffc0d8f36  x7  0000efff00000000
    x8  e500efdeffff0050  x9  0200efff00000000  x10 0000000000000000  x11 0200efff00000000
    x12 0200effe000006b0  x13 0200effe000006b0  x14 0000000000000008  x15 00000000c00000cf
    x16 0000aaaad00a0afc  x17 0000000000000003  x18 0000000000000001  x19 0000ffffc0d8f718
    x20 ba00ffffc0d8f7a0  x21 0000aaaad00962e0  x22 0000000000000000  x23 0000000000000000
    x24 0000000000000000  x25 0000000000000000  x26 0000000000000000  x27 0000000000000000
    x28 0000000000000000  x29 0000ffffc0d8f650  x30 0000aaaad00b3468
```

# Comments/Caveats
* HWASan is only supported on arm64.
* I'm not sure if I should add a feature gate or piggyback on the existing one for sanitizers.
* HWASan requires `-C target-feature=+tagged-globals`. That flag should probably be set transparently to the user. Not sure how to go about that.

# TODO
* Need more tests.
* Update documentation.
* Fix symbolization.
* Integrate with CI
nagisa pushed a commit that referenced this pull request Mar 10, 2021
bypass auto_da_alloc for metadata files

This saves about 0.7% when rerunning the UI test suite. I.e. when the metadata files exist and will be overwritten. No improvements expected for a clean build. So it might show up in incr-patched perf results.
```
regular rename:

Benchmark #1: touch src/tools/compiletest/src/main.rs ; RUSTC_WRAPPER="" schedtool -B -e ./x.py test src/test/ui
  Time (mean ± σ):     47.305 s ±  0.170 s    [User: 1631.540 s, System: 412.648 s]
  Range (min … max):   47.125 s … 47.856 s    20 runs

non-durable rename:

Benchmark #1: touch src/tools/compiletest/src/main.rs ; RUSTC_WRAPPER="" schedtool -B -e ./x.py test src/test/ui
  Time (mean ± σ):     46.930 s ±  0.064 s    [User: 1634.344 s, System: 396.038 s]
  Range (min … max):   46.759 s … 47.043 s    20 runs
```

There are more places that trigger auto_da_alloc behavior by overwriting existing files with O_TRUNC, but those are much harder to locate because `O_TRUNC` is set on `open()` but the writeback is triggered on `close()`. The latter is the part which shows up in profiles.
nagisa pushed a commit that referenced this pull request Aug 27, 2021
Otherwise, we can get into a situation where you have
a subtype obligation `#1 <: rust-lang#2` pending, #1 is constrained
by `check_casts`, but rust-lang#2` is unaffected.

Co-authored-by: Niko Matsakis <niko@alum.mit.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants