Panic during unit tests (coerce_match) #20055

Closed
frewsxcv opened this issue Dec 20, 2014 · 26 comments · Fixed by #21692
Labels
I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics.

Comments

@frewsxcv
Member

Currently on bd90b93

Running on OSX

cc'ing @kballard and @barosl since I saw both of them also had this issue in #rust-internals on IRC

...
test [run-pass] run-pass/yield2.rs ... ok
test [run-pass] run-pass/zero-size-type-destructors.rs ... ok
test [run-pass] run-pass/yield1.rs ... ok
test [run-pass] run-pass/vector-sort-panic-safe.rs ... ok
test [run-pass] run-pass/yield.rs ... ok

using metrics ratchet: tmp/check-stage2-T-x86_64-apple-darwin-H-x86_64-apple-darwin-rpass-metrics.json
result of ratchet: 0 metrics added, 0 removed, 0 improved, 0 regressed, 0 noise
updated ratchet file

failures:

---- [run-pass] run-pass/coerce-match.rs stdout ----

error: test run failed!
status: signal: 4
command: x86_64-apple-darwin/test/run-pass/coerce-match.stage2-x86_64-apple-darwin
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------

------------------------------------------

thread '[run-pass] run-pass/coerce-match.rs' panicked at 'explicit panic', /Users/coreyf/Development/rust/src/compiletest/runtest.rs:1487



failures:
    [run-pass] run-pass/coerce-match.rs

test result: FAILED. 1745 passed; 1 failed; 27 ignored; 0 measured

thread '<main>' panicked at 'Some tests failed', /Users/coreyf/Development/rust/src/compiletest/compiletest.rs:267
make: *** [tmp/check-stage2-T-x86_64-apple-darwin-H-x86_64-apple-darwin-rpass.ok] Error 101
@barosl
Contributor

barosl commented Dec 20, 2014

I should also note that the test code SIGILLs only when being compiled using an optimization flag. Without the flag, it runs well.

pub fn main() {
    let _: Box<[int]> = if true { box [1i, 2, 3] } else { box [1i] };

    let _: Box<[int]> = match true { true => box [1i, 2, 3], false => box [1i] };

    // Check we don't get over-keen at propagating coercions in the case of casts.
    let x = if true { 42 } else { 42u8 } as u16;
    let x = match true { true => 42, false => 42u8 } as u16;
}

@frewsxcv
Member Author

@jroesch is also having this issue. He says that tests stopped passing for him sometime after the night of December 16th.

@jroesch
Member

jroesch commented Dec 20, 2014

Yeah, both of my open PRs were exhibiting this on Mac OS X Yosemite. You can find them at #20002 and #20067 for reference.

@lilyball
Contributor

Here's the backtrace from a crash log:

Exception Type:        EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes:       0x0000000000000001, 0x0000000000000000

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   librustrt-4e7c5e5c.dylib        0x0000000103941f6b stack_overflow::imp::signal_handler::term::h4e1f7a7875fabf2cTLa + 59
1   librustrt-4e7c5e5c.dylib        0x0000000103941ee5 stack_overflow::imp::signal_handler::h49baa5350251116fALa + 101
2   libsystem_platform.dylib        0x00007fff91953f1a _sigtramp + 26
3   librustrt-4e7c5e5c.dylib        0x00000001039664f8 je_extent_tree_ad_remove + 216
4   librustrt-4e7c5e5c.dylib        0x0000000103967224 je_huge_dalloc + 68
5   librustrt-4e7c5e5c.dylib        0x0000000103971c16 je_sdallocx + 534
6   libstd-4e7c5e5c.dylib           0x00000001036ee4f4 thunk::F.Invoke$LT$A$C$$u{20}R$GT$::invoke::h14050218259616233623 + 52
7   libstd-4e7c5e5c.dylib           0x00000001036ee5d4 rt::start::closure.32341 + 164
8   librustrt-4e7c5e5c.dylib        0x00000001039aaafc rust_try_inner + 12
9   librustrt-4e7c5e5c.dylib        0x00000001039aaae6 rust_try + 6
10  librustrt-4e7c5e5c.dylib        0x00000001039458c7 unwind::try::hc34b1ff6cb3d2cbdquc + 71
11  librustrt-4e7c5e5c.dylib        0x000000010394566c task::Task::run::h5eecf441ff21ebbepLb + 124
12  libstd-4e7c5e5c.dylib           0x00000001036ee3af rt::start::h98e85f0930041adcIZx + 511
13  libstd-4e7c5e5c.dylib           0x00000001036ee19d rt::lang_start::hdf0f81f47bc259ddZYx + 109
14  libdyld.dylib                   0x00007fff93b1a5c9 start + 1

LLDB gives the following stack trace:

* thread #1: tid = 0xc417fe, 0x00000001000c3418 libstd-4e7c5e5c.dylib`je_extent_tree_ad_remove(rbtree=0x000000010039cce0, node=0x0000000000000000) + 216 at extent.c:38, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x18)
    frame #0: 0x00000001000c3418 libstd-4e7c5e5c.dylib`je_extent_tree_ad_remove(rbtree=0x000000010039cce0, node=0x0000000000000000) + 216 at extent.c:38
  * frame #1: 0x00000001000c4144 libstd-4e7c5e5c.dylib`je_huge_dalloc(ptr=<unavailable>) + 68 at huge.c:273
    frame #2: 0x00000001000ceb36 libstd-4e7c5e5c.dylib`je_sdallocx [inlined] je_isdalloct(ptr=<unavailable>) + 534 at jemalloc_internal.h:786
    frame #3: 0x00000001000ceb2b libstd-4e7c5e5c.dylib`je_sdallocx [inlined] je_isqalloc(ptr=<unavailable>) at jemalloc_internal.h:813
    frame #4: 0x00000001000ceb2b libstd-4e7c5e5c.dylib`je_sdallocx [inlined] isfree(ptr=<unavailable>) + 215 at jemalloc.c:1257
    frame #5: 0x00000001000cea54 libstd-4e7c5e5c.dylib`je_sdallocx(ptr=<unavailable>, size=<unavailable>, flags=<unavailable>) + 308 at jemalloc.c:1896
    frame #6: 0x00000001001053c9 libstd-4e7c5e5c.dylib`rust_try_inner + 9
    frame #7: 0x00000001001053b6 libstd-4e7c5e5c.dylib`rust_try + 6
    frame #8: 0x000000010009d43d libstd-4e7c5e5c.dylib`rt::lang_start::hd39f943feeb73f34NCz + 653
    frame #9: 0x00007fff93b1a5c9 libdyld.dylib`start + 1

frame #0 points at a macro to generate red-black tree functions, though as you can see node is NULL, so perhaps that's the cause.

@lilyball
Contributor

I'm going to attempt a bisect, though it'll take a while.

@lilyball
Contributor

According to the bisect, the bad commit is 46eb724.

@lilyball
Contributor

Which was introduced in #19769 /cc @nick29581

@lilyball
Contributor

As it turns out, #19769 is what introduced the run-pass/coerce-match.rs test in the first place.

nrc added the I-crash label Dec 22, 2014
@eddyb
Member

eddyb commented Dec 23, 2014

@kballard can you check whether #20083 affects this issue in any way?

@lilyball
Contributor

@eddyb Sadly, run-pass/coerce-match.rs still fails on make check after applying #20083.

@crhino
Contributor

crhino commented Dec 25, 2014

I was running make check CFG_VALGRIND="/usr/bin/valgrind" and hit an error on this unit test as well, which looks to be related. I'm on 3.13.0-43-generic GNU/Linux. Everything passes fine when not running under Valgrind.

This is running with the merge of #20083, commit d10642e.

failures:

---- [run-pass] run-pass/coerce-match.rs stdout ----

error: test run failed!
status: signal: 11
command: /usr/bin/valgrind x86_64-unknown-linux-gnu/test/run-pass/coerce-match.stage2-x86_64-unknown-linux-gnu
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------
==22243== Memcheck, a memory error detector
==22243== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==22243== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==22243== Command: x86_64-unknown-linux-gnu/test/run-pass/coerce-match.stage2-x86_64-unknown-linux-gnu
==22243== 
==22243== Warning: client switching stacks?  SP change: 0xffeffee60 --> 0xffedf63d8
==22243==          to suppress, use: --max-stackframe=2132616 or greater
==22243== Syscall param read(buf) points to unaddressable byte(s)
==22243==    at 0x40194F7: read (syscall-template.S:81)
==22243==    by 0x4005E0C: open_verify.constprop.6 (dl-load.c:2099)
==22243==    by 0x40061CE: open_path (dl-load.c:2216)
==22243==    by 0x4008F19: _dl_map_object (dl-load.c:2447)
==22243==    by 0x400D601: openaux (dl-deps.c:63)
==22243==    by 0x400FFF3: _dl_catch_error (dl-error.c:187)
==22243==    by 0x400DD04: _dl_map_object_deps (dl-deps.c:254)
==22243==    by 0x400315C: dl_main (rtld.c:1742)
==22243==    by 0x4017564: _dl_sysdep_start (dl-sysdep.c:249)
==22243==    by 0x4004CF7: _dl_start (rtld.c:332)
==22243==    by 0x40012D7: ??? (in /lib/x86_64-linux-gnu/ld-2.19.so)
==22243==  Address 0xffedf63f0 is on thread 1's stack
==22243== 
==22243== Warning: client switching stacks?  SP change: 0xffedf63e0 --> 0xffeffee88
==22243==          to suppress, use: --max-stackframe=2132648 or greater
==22243== Invalid free() / delete / delete[] / realloc()
==22243==    at 0x4F779D8: je_valgrind_freelike_block (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4FC1D28: rust_try_inner (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4FC1D15: rust_try (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F5DBC9: rt::lang_start::he0ccf12406aa1099d1z (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x5489EC4: (below main) (libc-start.c:287)
==22243==  Address 0x6c26000 is 0 bytes inside a block of size 32 free'd
==22243==    at 0x4F779D8: je_valgrind_freelike_block (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F5D5B2: rt::args::imp::put::hbd0ed9ed6213e26c8Rz (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F5D024: rt::args::init::hcc36baa9a67ccf9dlPz (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F5DB52: rt::lang_start::he0ccf12406aa1099d1z (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x5489EC4: (below main) (libc-start.c:287)
==22243== 
==22243== Invalid read of size 8
==22243==    at 0x4F92559: je_arena_dalloc_small (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F76FF1: je_sdallocx (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F5D1A2: rt::args::cleanup::hec9e413559adae2cDPz (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F5DC9C: rt::lang_start::he0ccf12406aa1099d1z (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x5489EC4: (below main) (libc-start.c:287)
==22243==  Address 0x6be92b8 is not stack'd, malloc'd or (recently) free'd
==22243== 
==22243== Invalid read of size 4
==22243==    at 0x5A3C414: pthread_mutex_lock (pthread_mutex_lock.c:66)
==22243==    by 0x4F92565: je_arena_dalloc_small (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F76FF1: je_sdallocx (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F5D1A2: rt::args::cleanup::hec9e413559adae2cDPz (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x4F5DC9C: rt::lang_start::he0ccf12406aa1099d1z (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==22243==    by 0x5489EC4: (below main) (libc-start.c:287)
==22243==  Address 0x10 is not stack'd, malloc'd or (recently) free'd
==22243== 
==21446== 
==21446== Process terminating with default action of signal 11 (SIGSEGV)
==21446==    at 0x549EBB9: raise (raise.c:56)
==21446==    by 0x4F4A704: sys::stack_overflow::imp::signal_handler::h7e9a49daa22611feMkw (in /home/chris/workspace/rust/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4e7c5e5c.so)
==21446==    by 0x549EC2F: ??? (in /lib/x86_64-linux-gnu/libc-2.19.so)
==21446==    by 0x5A3C413: pthread_mutex_lock (pthread_mutex_lock.c:63)
==22243== 
==22243== HEAP SUMMARY:
==22243==     in use at exit: 264 bytes in 6 blocks
==22243==   total heap usage: 18 allocs, 13 frees, 1,672 bytes allocated
==22243== 
==22243== LEAK SUMMARY:
==22243==    definitely lost: 0 bytes in 0 blocks
==22243==    indirectly lost: 0 bytes in 0 blocks
==22243==      possibly lost: 0 bytes in 0 blocks
==22243==    still reachable: 264 bytes in 6 blocks
==22243==         suppressed: 0 bytes in 0 blocks
==22243== Rerun with --leak-check=full to see details of leaked memory
==22243== 
==22243== For counts of detected and suppressed errors, rerun with: -v
==22243== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 2 from 2)

------------------------------------------

thread '[run-pass] run-pass/coerce-match.rs' panicked at 'explicit panic', /home/chris/workspace/rust/src/compiletest/runtest.rs:1487



failures:
    [run-pass] run-pass/coerce-match.rs

test result: FAILED. 1756 passed; 1 failed; 27 ignored; 0 measured

@pnkfelix
Member

pnkfelix commented Jan 7, 2015

I also see this on my Mac.

nrc changed the title from "Panic during unit tests" to "Panic during unit tests (coerce_match)" Jan 7, 2015
@nrc
Member

nrc commented Jan 7, 2015

When this is fixed, can we make sure coerce_match.rs moves to run-pass-valgrind?

@jroesch
Member

jroesch commented Jan 8, 2015

I am now also getting a second failure:

test [run-pass] run-pass/zero-size-type-destructors.rs ... ok
test [run-pass] run-pass/yield1.rs ... ok
test [run-pass] run-pass/yield.rs ... ok
test [run-pass] run-pass/vector-sort-panic-safe.rs ... ok

failures:

---- [run-pass] run-pass/coerce-match-calls.rs stdout ----

error: test run failed!
status: signal: 4
command: x86_64-apple-darwin/test/run-pass/coerce-match-calls.stage2-x86_64-apple-darwin
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------

------------------------------------------

thread '[run-pass] run-pass/coerce-match-calls.rs' panicked at 'explicit panic', /Users/jroesch/Git/rust/src/compiletest/runtest.rs:1453


---- [run-pass] run-pass/coerce-match.rs stdout ----

error: test run failed!
status: signal: 4
command: x86_64-apple-darwin/test/run-pass/coerce-match.stage2-x86_64-apple-darwin
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------

------------------------------------------

thread '[run-pass] run-pass/coerce-match.rs' panicked at 'explicit panic', /Users/jroesch/Git/rust/src/compiletest/runtest.rs:1453



failures:
    [run-pass] run-pass/coerce-match-calls.rs
    [run-pass] run-pass/coerce-match.rs

test result: FAILED. 1826 passed; 2 failed; 28 ignored; 0 measured

thread '<main>' panicked at 'Some tests failed', /Users/jroesch/Git/rust/src/compiletest/compiletest.rs:269
make: *** [tmp/check-stage2-T-x86_64-apple-darwin-H-x86_64-apple-darwin-rpass.ok] Error 101

@pnkfelix
Member

pnkfelix commented Jan 9, 2015

@jroesch yeah I added that second test when I was investigating the first one (when I was working on stuff related to box <expr> syntax); the test is intended to make it clear that the issue here is not related to the box <expr> syntax.

@jroesch
Member

jroesch commented Jan 10, 2015

@pnkfelix thanks for the update. They are just a minor annoyance when running the test suite.

@pnkfelix
Member

Here is a reduced test case for input:

use std::boxed::Box;

pub fn main() {
    let _: Box<[i8]> = match true { true => Box::new([1i8, 2, 3]), false => Box::new([1i8]) };
}

I have compared the generated llvm-ir with and without the -O flag to rustc. Here's something funny:

Without -O, we have just a single call to the jemalloc free routine je_sdallocx, followed immediately by a ret void:

  %18 = load i8** %__arg
  store i64 %16, i64* %__arg1
  %19 = load i64* %__arg1
  store i32 %17, i32* %__arg2
  %20 = load i32* %__arg2
  call void @je_sdallocx(i8* %18, i64 %19, i32 %20)
  ret void

With the -O flag to rustc, however, we get output that is quite different, with two distinct calls to je_sdallocx in series (tail calls, no less), and the second seems to be fed an undef value. I am not an expert in LLVM IR, but this struck me as strange:

  %x.sroa.0.0..sroa_cast1.i = bitcast i8* %0 to i24*
  store i24 197121, i24* %x.sroa.0.0..sroa_cast1.i, align 1
  tail call void @je_sdallocx(i8* %0, i64 3, i32 0)
  tail call void @je_sdallocx(i8* undef, i64 1, i32 0)
  ret void
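
A side note for anyone reading the optimized IR above: the `store i24 197121` is consistent with the three bytes `[1, 2, 3]` of the first match arm packed into a single little-endian integer (this assumes the little-endian x86_64 targets discussed in this thread). A quick check in present-day Rust, where `pack_i24_le` is a hypothetical helper name used only for this illustration:

```rust
/// Packs three bytes into the integer that an `i24` store would write on a
/// little-endian target. `pack_i24_le` is a hypothetical helper for this
/// illustration; it is not part of rustc or LLVM.
fn pack_i24_le(bytes: [u8; 3]) -> u32 {
    // The fourth (most significant) byte is padding and stays zero.
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], 0])
}

fn main() {
    // 1 + 2 * 256 + 3 * 65536 = 197121
    println!("{}", pack_i24_le([1, 2, 3]));
}
```

If that reading is right, the first `je_sdallocx` call (size 3) frees the three-element box, and the second call (size 1, `undef` pointer) would correspond to the one-element box from the arm that was never taken.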

@pnkfelix
Member

Here is a result of further investigation: I do not know how valid it is to try to plug in different llvm-passes in an attempt to expose a code-generation bug, but it seemed to produce interesting information (in terms of certain combinations seeming to fall victim to the same bug).

Both of these example commands are being fed the reduced test case from the previous comment (named coerce-match-calls2.rs).

# For reference, this command produces an invalid binary,
# but minor tweaks (e.g. removing the sroa pass at the end)
# produce a binary that runs without an error.
CMD_1="rustc /tmp/coerce-match-calls2.rs \
      -C no-prepopulate-passes \
      -C passes=datalayout  \
      -C passes=notti  \
      -C passes=targetlibinfo \
      -C passes=no-aa \
      -C passes=tbaa \
      -C passes=scoped-noalias \
      -C passes=assumption-tracker \
      -C passes=basicaa \
      -C passes=ipsccp \
      -C passes=globalopt \
      -C passes=deadargelim \
      -C passes=instcombine \
      -C passes=simplifycfg \
      -C passes=basiccg \
      -C passes=prune-eh \
      -C passes=inline-cost \
      -C passes=inline \
      -C passes=functionattrs \
      -C passes=sroa \
"

# This is a reduced version of the above; it still produces a binary
# that will error in the midst of jemalloc, due (pnkfelix thinks) to a
# spuriously injected call to `je_sdallocx` (i.e. `free`)
CMD_2="rustc /tmp/coerce-match-calls2.rs \
      -C no-prepopulate-passes \
      -C passes=datalayout  \
      -C passes=instcombine \
      -C passes=inline \
      -C passes=sroa \
"

@pnkfelix
Member

(One can emulate the same effect on a generated .bc file via the llvm opt utility. My main question at this point is if I can extract a standalone test that is independent of rust, including the libraries like jemalloc that I am linking in...)

@pnkfelix
Member

Ah, an interesting new detail: in the below revision to the running example, passing --cfg workaround yields a program that works, even when passing -O. This means there may be some way to isolate this to something within trans, rather than focusing our attention on LLVM bugs.

Example with workaround:

use std::boxed::Box;

#[cfg(not(workaround))]
pub fn main() {
    let _: Box<[i8]> = match true {
        true => Box::new([1i8, 2, 3]),
        false => Box::new([1i8]),
    };
}

#[cfg(workaround)]
pub fn main() {
    let _: Box<[i8]> = match true {
        true => Box::new([1i8, 2, 3]),
        false => { Box::new([1i8]) }
    };
}

@pnkfelix
Member

I looked into this some more. In particular, I made a #[no_std] test case that uses only the core and libc crates -- it would be straightforward to make it fully standalone, but I did not see a benefit in that. The crucial thing is that this standalone test has its own "built-in" allocator (from a static array of bytes in the code), and it instruments the allocate/free calls with calls to C printf.
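
To make the idea of an instrumented allocator concrete, here is a minimal sketch in present-day Rust. It is not the allocator from the gist: it hands out offsets into an imaginary fixed-size buffer rather than real pointers, logs with `println!` instead of C printf, and every name in it (`LoggingArena`, `FreeError`) is hypothetical. It only shows how logging each call and checking each free makes a spurious free visible.

```rust
use std::collections::HashMap;

/// Why a free was rejected. Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
enum FreeError {
    /// The offset was never handed out, or was already freed.
    NotAllocated(usize),
    /// The offset is live, but the caller passed the wrong size.
    SizeMismatch { expected: usize, got: usize },
}

/// Toy bump allocator over a fixed-size region. It never reuses space;
/// its only job is to log calls and reject frees it cannot account for.
struct LoggingArena {
    capacity: usize,
    next: usize,
    live: HashMap<usize, usize>, // offset -> size of each live allocation
}

impl LoggingArena {
    fn new(capacity: usize) -> Self {
        LoggingArena { capacity, next: 0, live: HashMap::new() }
    }

    /// Returns the offset of a fresh block, or `None` if it does not fit.
    fn allocate(&mut self, size: usize) -> Option<usize> {
        if size == 0 || size > self.capacity - self.next {
            return None;
        }
        let offset = self.next;
        self.next += size;
        self.live.insert(offset, size);
        println!("allocate(size={}) -> offset {}", size, offset);
        Some(offset)
    }

    /// Logs the call, then checks it against the live allocations.
    fn free(&mut self, offset: usize, size: usize) -> Result<(), FreeError> {
        println!("free(offset={}, size={})", offset, size);
        match self.live.get(&offset).copied() {
            None => Err(FreeError::NotAllocated(offset)),
            Some(expected) if expected != size => {
                Err(FreeError::SizeMismatch { expected, got: size })
            }
            Some(_) => {
                self.live.remove(&offset);
                Ok(())
            }
        }
    }
}

fn main() {
    let mut arena = LoggingArena::new(64);
    let a = arena.allocate(3).expect("arena has room");
    println!("{:?}", arena.free(a, 3)); // legitimate free
    println!("{:?}", arena.free(a, 3)); // second free of the same block is rejected
}
```

A real version would sit behind the language's allocation hooks, as the gist does; the bookkeeping above is the part that tells you which call went wrong.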

It's possible I made a mistake along the way, but there are some pretty funky things being illustrated in this test. In particular, without optimizations, we do not see the erroneous call to free (just like in the examples above), but we do see an extra invocation of the printf("Hello World 2"); call that I embedded into the code. Super strange.

https://gist.github.com/pnkfelix/6dc4fb620c8742100598

(In other words, my claim is that this new test illustrates a bug that is exposed when you don't pass -O; it may be a symptom of the same root bug, or it may be something else. Certainly it leads me to wonder if LLVM is not actually to blame.)

(Update: there was indeed a mistake in my port: I forgot to null terminate my strings, and so the first printf call was printing both strings, which presumably just happened to be laid out next to each other in the generated data segment. Nonetheless, the above gist is still useful, in that it continues to illustrate the bug injected by using -O, and it does so as a standalone #![no_std] program that tells you how the allocate/deallocate calls went wrong.)
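
The missing-terminator mistake described in the update is easy to check for in present-day Rust, because the standard library's `CStr::from_bytes_with_nul` rejects byte strings that do not end in exactly one NUL. A small sketch, where `is_c_string` is a hypothetical helper name:

```rust
use std::ffi::CStr;

/// Returns true when `bytes` is a well-formed C string: it contains exactly
/// one NUL byte and that byte is the last one. `is_c_string` is a
/// hypothetical helper for this illustration.
fn is_c_string(bytes: &[u8]) -> bool {
    CStr::from_bytes_with_nul(bytes).is_ok()
}

fn main() {
    // Terminated: safe to hand to a C function such as printf.
    println!("{}", is_c_string(b"Hello World 1\n\0"));
    // Not terminated: a C function would keep reading past the end,
    // possibly into whatever string is laid out next in memory.
    println!("{}", is_c_string(b"Hello World 1\n"));
}
```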

@pnkfelix
Member

Another gist I have developed seems to show a case where optimizations are disabled and yet we still have a spurious free invocation. This seems like a promising line of inquiry.

https://gist.github.com/pnkfelix/567213f949925ce33697

Note that the main difference between this gist and the one from the previous comment is that I introduced an extra block around the first match arm:

pub fn main() {
  print0("Hello World 1\n\0");
  let _: Box<[i8]> = match true {
    true => { box_1() } // <-- this line is the main difference
    _ => box_2(),
  };
  print0("Hello World 2\n\0");
}

@pnkfelix
Member

Yay, I found a different example, a generalization of the original one, where one can observe the erroneous frees occurring even when optimizations are disabled.

https://gist.github.com/pnkfelix/2e2b73092c2fd5b38228

(This is very similar to my previous gists; the main difference is that I have added four separate match arms, each building a differently sized Box<[u8; k]> and then implicitly coercing it to a Box<[u8]>.)

The errors you get with and without -O are different, but they are real. This leads me to suspect that this is a real code gen bug in Rust, not an optimization bug in LLVM. (Although it could be that we are getting some attributes wrong on the structure we feed into LLVM.)

@pnkfelix
Member

And here's a variant of that same example; again, this causes an error both with and without optimizations on Mac OS X:

// #![crate_type="lib"]

pub fn foo(box_1: fn () -> Box<[i8; 1]>,
           box_2: fn () -> Box<[i8; 2]>,
           box_3: fn () -> Box<[i8; 3]>,
           box_4: fn () -> Box<[i8; 4]>) {
    println!("Hello World 1");
    let _: Box<[i8]> = match 3 {
        1 => box_1(),
        2 => box_2(),
        3 => box_3(),
        _ => box_4(),
    };
    println!("Hello World 2");
}

pub fn main() {

    fn box_1() -> Box<[i8; 1]> { Box::new( [1i8] ) }
    fn box_2() -> Box<[i8; 2]> { Box::new( [1i8, 2] ) }
    fn box_3() -> Box<[i8; 3]> { Box::new( [1i8, 2, 3] ) }
    fn box_4() -> Box<[i8; 4]> { Box::new( [1i8, 2, 3, 4] ) }

    foo(box_1, box_2, box_3, box_4);
}

(The above gist has the advantage that it uses #![no_std] and uses an embedded allocator to track where the error arises. But sometimes you want something that works in the playpen.)
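
For anyone who wants a check that reports the problem instead of crashing, one option is to count destructor runs. The sketch below is written for present-day Rust and is not taken from the gists; `Tracked` and `coerce_in_match` are hypothetical names. It builds a differently sized array in each of four match arms, coerces it to a boxed slice, and compares the number of elements created with the number of destructors that ran.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Increments a shared counter when dropped, so every destructor run is
/// visible. Hypothetical type for this sketch.
struct Tracked(Rc<Cell<usize>>);

impl Drop for Tracked {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Builds a box in one of four match arms, coerces it to `Box<[Tracked]>`,
/// drops it, and returns (elements created, destructors run).
fn coerce_in_match(arm: u32) -> (usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let t = || Tracked(drops.clone());
    let boxed: Box<[Tracked]> = match arm {
        1 => Box::new([t()]),
        2 => Box::new([t(), t()]),
        3 => Box::new([t(), t(), t()]),
        _ => Box::new([t(), t(), t(), t()]),
    };
    let created = boxed.len();
    drop(boxed);
    (created, drops.get())
}

fn main() {
    for arm in 1..=4 {
        let (created, dropped) = coerce_in_match(arm);
        println!("arm {}: created {}, dropped {}", arm, created, dropped);
    }
}
```

On a correct compiler the two numbers agree for every arm. This only counts element destructors, so it would not by itself catch a spurious deallocation of an unrelated pointer like the one seen in the jemalloc traces.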

@pnkfelix
Member

(The program from the previous comment also causes a core-dump in a Linux VM; Ubuntu 64-bit 14.10...)

pnkfelix added a commit to pnkfelix/rust that referenced this issue Jan 27, 2015
…s original L-/R-value state.

This fixes a subtle issue where temporaries were being allocated (but
not necessarily initialized) to the (parent) terminating scope of a
match expression; in particular, the code to zero out the temporary
emitted by `datum.store_to` is only attached to the particular
match-arm for that temporary, but when going down other arms of the
match expression, the temporary may falsely appear to have been
initialized, depending on what the stack held at that location, and
thus may have its destructor erroneously run at the end of the
terminating scope.

Test cases to appear in a follow-up commit.

Fix rust-lang#20055
pnkfelix added a commit to pnkfelix/rust that referenced this issue Jan 27, 2015
Note that I have not yet managed to expose any bug in
`trans::expr::into_fat_ptr`; it would be good to try to do so (or show
that the use of `.to_lvalue_datum` there is sound).
bors added a commit that referenced this issue Jan 29, 2015
trans: When coercing to `Box<Trait>` or `Box<[T]>`, leave datum in its original L-/R-value state.

This fixes a subtle issue where temporaries were being allocated (but not necessarily initialized) to the (parent) terminating scope of a match expression; in particular, the code to zero out the temporary emitted by `datum.store_to` is only attached to the particular match-arm for that temporary, but when going down other arms of the match expression, the temporary may falsely appear to have been initialized, depending on what the stack held at that location, and thus may have its destructor erroneously run at the end of the terminating scope.

Fix #20055.

(There may be a latent bug still remaining in `fn into_fat_ptr`, but I am so annoyed by the test/run-pass/coerce_match.rs failures that I want to land this now.)
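
The failure mode in the commit message can be pictured with a toy model. This is only an illustration of the description above, not rustc's trans code: each match arm gets one temporary slot in the enclosing scope, only the taken arm writes and then zeroes its slot, and at scope end any nonzero slot is treated as holding a live box. The function name `spurious_frees` and the placeholder pointer value are hypothetical.

```rust
/// Toy model of the bug described in the commit message. `stale_stack`
/// stands in for whatever the stack happened to hold in each arm's
/// temporary slot. Returns the indices of slots whose destructor would
/// run at the end of the scope even though nothing was stored in them.
/// Panics if `taken_arm` is out of bounds for `stale_stack`.
fn spurious_frees(taken_arm: usize, stale_stack: &[usize]) -> Vec<usize> {
    // The slots are allocated in the parent scope but not initialized.
    let mut slots = stale_stack.to_vec();

    // Only the taken arm runs its code: store the box pointer, move it
    // into the match result, then zero the slot to mark it moved-from.
    slots[taken_arm] = 0xB0B0; // placeholder for a real pointer
    slots[taken_arm] = 0;

    // Scope end: every nonzero slot looks initialized and gets "freed".
    let mut freed = Vec::new();
    for (index, value) in slots.iter().enumerate() {
        if *value != 0 {
            freed.push(index);
        }
    }
    freed
}

fn main() {
    // A clean stack hides the bug; stale data in the other arm's slot exposes it.
    println!("{:?}", spurious_frees(0, &[0, 0]));
    println!("{:?}", spurious_frees(0, &[7, 9]));
}
```

In this model the outcome depends entirely on the stale contents, which would fit the reports above where the crash showed up only with certain optimization settings or platforms.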
@frewsxcv
Member Author

Awesome work @pnkfelix

8 participants