Make AllocId decoding thread-safe #50957

Zoxc · 2018-05-22T02:17:42Z

This builds on top of #50520.

oli-obk · 2018-05-22T08:59:31Z

Repeating my worries from the other PR:

I am very unsure about [this]. We had a scheme like that when miri was merged, and we kept running into various edge cases with the decoding order. Even creating MCVEs for the panics we were getting was hard, because small changes in the code would change the order of evaluation.

Additionally I don't think we should do this at all, even considering it works now, because it will cause all those bugs again if we allow

let mut foo = Rc::new(RefCell::new(None));
let bar = Rc::new(RefCell::new(Some(foo.clone())));
*foo.borrow_mut() = Some(bar);

within constants. #49172 is a first step in that direction.

Any cyclic pointer structure inside constants will not work with the system proposed in this PR.

michaelwoerister · 2018-05-22T10:40:14Z

I want to take a closer look at this.

michaelwoerister · 2018-05-22T13:33:42Z

So, I think the difference between this and what we had before the table approach is that AllocKind::Alloc and AllocKind::AllocAtPos are two distinct cases now. That way we never encounter the case where the decoder would have to "skip ahead" when it decodes an already cached allocation.

I think it would also work for circular allocation graphs if we cache the pos -> AllocId mapping before encoding the allocation contents here.
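As a rough illustration of that suggestion, here is a minimal sketch of the encoding side only. The position is recorded in the cache before the allocation's contents are written, so a pointer back to an allocation that is still being encoded becomes a back-reference instead of recursing forever. The `Alloc` and `Token` types and the `encode_alloc_id` function are hypothetical stand-ins, not rustc's actual encoder, and the sketch says nothing about whether decoding such a stream is straightforward.

```rust
use std::collections::HashMap;

/// Toy allocation: some bytes plus the ids of the allocations it points to.
struct Alloc {
    bytes: Vec<u8>,
    pointers: Vec<u32>,
}

/// Tokens written to the toy output stream (hypothetical format).
#[derive(Debug, PartialEq)]
enum Token {
    Alloc(u32),        // start of an allocation encoded inline
    AllocAtPos(usize), // back-reference to an earlier stream position
    Bytes(Vec<u8>),    // raw contents of the allocation
}

fn encode_alloc_id(
    id: u32,
    allocs: &HashMap<u32, Alloc>,
    cache: &mut HashMap<u32, usize>,
    out: &mut Vec<Token>,
) {
    // Already encoded (or currently being encoded): emit a back-reference.
    if let Some(&pos) = cache.get(&id) {
        out.push(Token::AllocAtPos(pos));
        return;
    }

    // Record the position *before* encoding the contents, so that a cycle
    // leading back to `id` hits the cache above.
    let pos = out.len();
    let previous = cache.insert(id, pos);
    assert!(previous.is_none(), "allocation {} was encoded twice", id);

    out.push(Token::Alloc(id));
    let alloc = &allocs[&id];
    out.push(Token::Bytes(alloc.bytes.clone()));
    for &target in &alloc.pointers {
        encode_alloc_id(target, allocs, cache, out);
    }
}
```

With two allocations that point at each other, encoding the first one terminates and ends with an `AllocAtPos(0)` token referring back to the start of the stream.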

michaelwoerister · 2018-05-22T14:15:27Z

src/librustc/mir/interpret/mod.rs

            trace!("encoding {:?} with {:#?}", alloc_id, alloc);
            AllocKind::Alloc.encode(encoder)?;
            alloc.encode(encoder)?;
+            cache(encoder).insert_same(alloc_id, pos);


insert_same() doesn't seem to be what we want here. If this could be reached in a racy way then pos would not necessarily be the same. It would also mean that two encoders would write to the same stream. Something like assert!(cache(encoder).insert(alloc_id, pos).is_none()) seems more appropriate. Correct me if I'm wrong.
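To make the suggested check concrete, here is a small sketch that uses a plain `HashMap` in place of the encoder's cache. The helper name `record_position` and the `u32`/`usize` types are assumptions for illustration only; the point is that `HashMap::insert` returns the previous value, so asserting on it catches a second encoding of the same id.

```rust
use std::collections::HashMap;

/// Records the stream position at which an allocation was encoded.
/// Panics if the id was already recorded, because that would mean the
/// same allocation was written to the stream twice.
fn record_position(cache: &mut HashMap<u32, usize>, alloc_id: u32, pos: usize) {
    // `HashMap::insert` returns the value previously stored under the key.
    let previous = cache.insert(alloc_id, pos);
    assert!(
        previous.is_none(),
        "allocation {} was already encoded at {:?}",
        alloc_id,
        previous
    );
}
```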

michaelwoerister · 2018-05-22T14:35:53Z

OK, so I've reviewed 9d4d3e9 and it looks good to me. Do you still have objections, @oli-obk?

michaelwoerister · 2018-05-22T14:40:28Z

OK, so I've reviewed 9d4d3e9 and it looks good to me.

It looks good to me if we support encoding circular graphs, as noted above, that is...

oli-obk · 2018-05-22T15:18:40Z

So, I think the difference between this and what we had before the table approach is that AllocKind::Alloc and AllocKind::AllocAtPos are two distinct cases now. That way we never encounter the case where the decoder would have to "skip ahead" when it decodes an already cached allocation.

This is exactly the same situation we had before, except that AllocAtPos is now discriminant + u32 instead of just u32. The old code simply inlined the AllocAtPos variant into the discriminant.

The case I mentioned will still happen. Imagine the following steps:

1. You reach an AllocAtPos, so you do a decoder.with_position.
2. The allocation you are decoding leads to you decoding an AllocId via AllocAtPos.
3. You follow the AllocAtPos, start decoding and reach another AllocId, this time AllocKind::Alloc, and it happens to be the one that step 1 also pointed to.

Since thinking about this tends to fry my brain, I created a google doc illustrating the issue: https://docs.google.com/presentation/d/1AWwnDxuZKZgj1PvWo5mPiwhapmV5h3bjUVn-De-tpKc/edit?usp=sharing

I think it would also work for circular allocation graphs if we cache the pos -> AllocId mapping before encoding the allocation contents here.

While you can pre-cache the AllocId, that doesn't help you here, since you don't know how many bytes you need to skip ahead.

We also cannot encode this skip bytes amount, because at encoding time we don't know how far they are. We could reserve 4 bytes and write back the skip amount later, but that'll get horrible fast.

That said. I think we should just do this, because as @Zoxc correctly pointed out to me some time ago, we will (in the future) refactor AllocId to be

enum AllocId<'tcx> {
    Static(DefId),
    Function(Instance<'tcx>),
    Local(u64),
}

where Local refers to a constant-local id. This means that constants cannot contain pointers into other constants anymore, which is totally fine, since we can just copy the entire constant's memory. This won't do any actual copying, because the Allocations are interned so we still point to the very same physical memory in RAM, but it'll appear to have a different AllocId in the interpreter.

oli-obk · 2018-05-22T15:53:21Z

Oh that said, yes please insert loads of sanity checks as @michaelwoerister already pointed out. I'd rather have sensible assertions triggered than really weird decoding errors later in the pipeline.

This mainly means asserting that the return value of any insert or remove operation is the expected one.

michaelwoerister · 2018-05-22T16:01:49Z

@oli-obk, I'm wondering if case 3 in your presentation wouldn't just work (although it would decode Alloc(99) twice):

Decode(AtPos(99))             <-- reserve/cache 99
  Decode(Alloc(99))           
    Decode(AtPos(42))         <-- reserve/cache 42
      Decode(Alloc(42))
        Decode(Alloc(99))
          Decode(AtPos(42))   <-- cache hit 42
          Done(Alloc(99))     <-- Alloc(99) interned
        Done(Alloc(42))       <-- Alloc(42) interned
    Done(AtPos(42))           <-- cache[42] = Alloc(42)
  Done(Alloc(99))             <-- Alloc(99) interned (again)
Done(AtPos(99))               <-- cache[99] = Alloc(99)

yes please insert loads of sanity checks

Yes, please! :)

oli-obk · 2018-05-22T16:24:18Z

We are also creating real AllocIds, so we'd need to ensure we don't create new ones for the same allocation. And then we need to guarantee that the second interning produces the exact same Allocation.

This might get tricky, especially with multithreading being involved. I'll do another review wrt multithreading

oli-obk · 2018-05-22T16:28:08Z

src/librustc/mir/interpret/mod.rs

    let alloc_type: AllocType<'tcx, &'tcx Allocation> =
        tcx.alloc_map.lock().get(alloc_id).expect("no value for AllocId");
    match alloc_type {
        AllocType::Memory(alloc) => {
+            if let Some(alloc_pos) = cache(encoder).get(&alloc_id).cloned() {


This would need to be an atomic "get or insert" operation in order to prevent two threads that get here at the same time from both trying to encode alloc (I think this is the same as what @michaelwoerister mentioned below)

Has this been addressed?

This operation is effectively atomic since we have unique ownership of the encoder and the cache. This doesn't matter though as encoding isn't intended to be multithreaded.
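For reference, an atomic "get or insert" of the kind asked about could look roughly like the sketch below if the cache were ever shared between threads. It is not what the PR does, since encoding is single-threaded there. The function name `get_or_claim` and the `Mutex<HashMap<u32, usize>>` cache are hypothetical.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::sync::Mutex;

/// Looks up `alloc_id` and claims it in a single critical section.
/// Returns `(pos, true)` if this caller claimed the id and should encode
/// the allocation, or `(existing_pos, false)` if it was already claimed.
fn get_or_claim(
    cache: &Mutex<HashMap<u32, usize>>,
    alloc_id: u32,
    pos: usize,
) -> (usize, bool) {
    // The lock is held across both the lookup and the insertion, so two
    // threads cannot both observe the id as missing.
    let mut map = cache.lock().unwrap();
    let result = match map.entry(alloc_id) {
        Entry::Occupied(entry) => (*entry.get(), false),
        Entry::Vacant(entry) => {
            entry.insert(pos);
            (pos, true)
        }
    };
    result
}
```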

src/librustc/mir/interpret/mod.rs

+                AllocKind::AllocAtPos.encode(encoder)?;
+                return encoder.emit_usize(alloc_pos);
+            }
+            let pos = encoder.position();


Zoxc · 2018-05-23T04:49:34Z

I think it would also work for circular allocation graphs if we cache the pos -> AllocId mapping before encoding the allocation contents here.

Yes. I've moved the insertion so that it should handle circular allocation graphs there.

Zoxc · 2018-05-23T04:55:57Z

There was a possible race condition where one thread would decode an AllocId using the `AllocKind::Alloc` path and insert the new id in the cache. Another thread could then decode the same `AllocId` using the `AllocAtPos` path, and it would see the id in the cache and exit, but the first thread may not have finished loading the allocation yet.

I've changed the way decoding works to deal with this. We now have 2 caches. One global and one for the current session. The global cache contains a flag which indicates if the AllocId was partially loaded (it was assigned an id) or fully loaded (the AllocId has an associated Allocation).
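The two-cache scheme can be sketched as follows. This is a simplified model rather than the PR's code: the "stream" is a map from a position to the positions it points to, ids are plain `u32` counters, and "fully loaded" is only a flag instead of an interned Allocation. All names (`GlobalState`, `decode_at`) are hypothetical.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

/// State shared by all decoding sessions.
#[derive(Default)]
struct GlobalState {
    /// stream position -> (id, fully_loaded)
    by_pos: HashMap<usize, (u32, bool)>,
    next_id: u32,
}

fn decode_at(
    pos: usize,
    stream: &HashMap<usize, Vec<usize>>, // toy stream: pos -> positions it points to
    global: &Mutex<GlobalState>,
    local: &mut HashSet<u32>, // ids this session has started decoding
) -> u32 {
    let (id, fully_loaded) = {
        let mut g = global.lock().unwrap();
        let existing = g.by_pos.get(&pos).copied();
        let entry = match existing {
            Some(entry) => entry,
            None => {
                // Reserve an id that is not fully loaded yet.
                let id = g.next_id;
                g.next_id += 1;
                g.by_pos.insert(pos, (id, false));
                (id, false)
            }
        };
        entry
    };

    // Fully loaded: nothing left to do. Already in the session-local set:
    // a frame further up this session's stack is decoding it and will
    // finish before the result is used.
    if fully_loaded || !local.insert(id) {
        return id;
    }

    for &target in &stream[&pos] {
        decode_at(target, stream, global, local);
    }

    // Only now may other sessions treat the id as complete.
    global.lock().unwrap().by_pos.insert(pos, (id, true));
    id
}
```

In this model a second session that meets an id reserved by another session still decodes the contents itself, rather than trusting a partially loaded entry.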

This PR does not attempt to make encoding thread-safe, as we currently only encode using a single thread.

rust-highfive · 2018-05-23T05:46:45Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

travis_time:start:test_incremental
Check compiletest suite=incremental mode=incremental (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:57:33] 
[00:57:33] running 88 tests
tal-verify-ich" "-Z" "incremental-queries" "--error-format" "json" "-Zui-testing" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/incremental/issue-49595/issue_49595/a" "-Crpath" "-O" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-Z" "query-dep-graph" "--test" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/incremental/issue-49595/issue_49595/auxiliary"
[00:57:53] ------------------------------------------
[00:57:53] 
[00:57:53] ------------------------------------------
[00:57:53] stderr:
[00:57:53] stderr:
[00:57:53] ------------------------------------------
[00:57:53] thread 'main' panicked at 'internal error: entered unreachable code', librustc/mir/interpret/value.rs:197:61
[00:57:53] 
[00:57:53] error: internal compiler error: unexpected panic
[00:57:53] 
[00:57:53] 
[00:57:53] note: the compiler unexpectedly panicked. this is a bug.
[00:57:53] 
[00:57:53] note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
[00:57:53] note: rustc 1.28.0-dev running on x86_64-unknown-linux-gnu
[00:57:53] 
[00:57:53] 
[00:57:53] note: compiler flags: -Z incremental-verify-ich -Z incremental-queries -Z ui-testing -Z unstable-options -Z query-dep-graph -C incremental -C prefer-dynamic -C rpath
[00:57:53] 
[00:57:53] ------------------------------------------
[00:57:53] 
[00:57:53] thread '[incremental] incremental/issue-49595/issue_49595.rs' panicked at 'explicit panic', tools/compiletest/src/runtest.rs:3044:9
---
[00:57:53] 
[00:57:53] thread 'main' panicked at 'Some tests failed', tools/compiletest/src/main.rs:498:22
[00:57:53] 
[00:57:53] 
[00:57:53] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/incremental" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/incremental" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "incremental" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-3.9/bin/FileCheck" "--host-rustcflags" "-Crpath -O -Zunstable-options " "--target-rustcflags" "-Crpath -O -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "3.9.1\n" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--llvm-components" "" "--llvm-cxxflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"
[00:57:53] 
[00:57:53] 
[00:57:53] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[00:57:53] Build completed unsuccessfully in 0:14:59
[00:57:53] Build completed unsuccessfully in 0:14:59
[00:57:53] Makefile:58: recipe for target 'check' failed
[00:57:53] make: *** [check] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:132a10a1
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

michaelwoerister · 2018-05-23T08:51:41Z

I've changed the way decoding works to deal with this. We now have 2 caches. [...]

Makes sense.

This PR does not attempt to make encoding thread-safe, as we currently only encode using a single thread.

Yes, at the moment we don't have an encoder that could work concurrently anyway.

The travis error suggests that it's trying to decode from an invalid position somewhere.

bors · 2018-05-23T14:35:13Z

☔ The latest upstream changes (presumably #50866) made this pull request unmergeable. Please resolve the merge conflicts.

rust-highfive · 2018-05-24T12:31:01Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

[00:43:26] .....................................................i..............................................
[00:43:30] .........................................................................ii.........................
[00:43:36] ....................................................................................................
[00:43:42] ...................................................................................i................
[00:43:44] .iiiiiiiii...................................................
[00:43:44] 
[00:43:44] travis_fold:start:test_ui_nll
travis_time:start:test_ui_nll
Check compiletest suite=ui mode=ui compare_mode=nll (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
---
[00:44:31] .....................................................i..............................................
[00:44:35] .........................................................................ii.........................
[00:44:40] ....................................................................................................
[00:44:46] ...................................................................................i................
[00:44:48] ..iiiiiiiii..................................................
[00:44:48] 
[00:44:48]  finished in 63.758
[00:44:48] travis_fold:end:test_ui_nll

---
travis_time:start:test_incremental
Check compiletest suite=incremental mode=incremental (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:55:27] 
[00:55:27] running 88 tests
[00:55:47] .......................................................F................................
[00:55:47] thread 'main' panicked at 'Some tests failed', tools/compiletest/src/main.rs:498:22
[00:55:47] 
[00:55:47] ---- [incremental] incremental/issue-49595/issue_49595.rs stdout ----
[00:55:47] 
[00:55:47] 
[00:55:47] error in revision `cfail2`: test compilation failed although it shouldn't!
[00:55:47] status: exit code: 101
[00:55:47] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/incremental/issue-49595/issue_49595.rs" "--target=x86_64-unknown-linux-gnu" "--cfg" "cfail2" "-C" "incremental=/checkout/obj/build/x86_64-unknown-linux-gnu/test/incremental/issue-49595/issue_49595/issue_49595.inc" "-Z" "incremental-verify-ich" "-Z" "incremental-queries" "--error-format" "json" "-Zui-testing" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/incremental/issue-49595/issue_49595/a" "-Crpath" "-O" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-Z" "query-dep-graph" "--test" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/incremental/issue-49595/issue_49595/auxiliary"
[00:55:47] ------------------------------------------
[00:55:47] 
[00:55:47] ------------------------------------------
[00:55:47] stderr:
[00:55:47] stderr:
[00:55:47] ------------------------------------------
[00:55:47] thread 'main' panicked at 'internal error: entered unreachable code', librustc/mir/interpret/value.rs:197:61
[00:55:47] 
[00:55:47] error: internal compiler error: unexpected panic
[00:55:47] 
[00:55:47] 
[00:55:47] note: the compiler unexpectedly panicked. this is a bug.
[00:55:47] 
[00:55:47] note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
[00:55:47] note: rustc 1.28.0-dev running on x86_64-unknown-linux-gnu
[00:55:47] 
[00:55:47] 
[00:55:47] note: compiler flags: -Z incremental-verify-ich -Z incremental-queries -Z ui-testing -Z unstable-options -Z query-dep-graph -C incremental -C prefer-dynamic -C rpath
[00:55:47] 
[00:55:47] ------------------------------------------
[00:55:47] 
[00:55:47] thread '[incremental] incremental/issue-49595/issue_49595.rs' panicked at 'explicit panic', tools/compiletest/src/runtest.rs:3053:9
---
[00:55:47] test result: FAILED. 87 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
[00:55:47] 
[00:55:47] 
[00:55:47] 
[00:55:47] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/incremental" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/incremental" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "incremental" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-3.9/bin/FileCheck" "--host-rustcflags" "-Crpath -O -Zunstable-options " "--target-rustcflags" "-Crpath -O -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "3.9.1\n" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--llvm-components" "" "--llvm-cxxflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"
[00:55:47] 
[00:55:47] 
[00:55:47] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[00:55:47] Build completed unsuccessfully in 0:14:31
[00:55:47] Build completed unsuccessfully in 0:14:31
[00:55:47] Makefile:58: recipe for target 'check' failed
[00:55:47] make: *** [check] Error 1
104168 ./obj/build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu
104164 ./obj/build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release
103608 ./obj/build/x86_64-unknown-linux-gnu/stage0/lib/rustlib/x86_64-unknown-linux-gnu/codegen-backends
103228 ./obj/build/bootstrap/debug/incremental/bootstrap-c730863262pt
103228 ./obj/build/bootstrap/debug/incremental/bootstrap-c730863262pt
103224 ./obj/build/bootstrap/debug/incremental/bootstrap-c730863262pt/s-f1c1jm4hc1-16gfdfi-2h3qirbcc1hzj
91892 ./obj/build/x86_64-unknown-linux-gnu/stage1
91868 ./obj/build/x86_64-unknown-linux-gnu/stage1/lib
89804 ./src/llvm/test/CodeGen
89412 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps

michaelwoerister · 2018-05-24T13:15:58Z

Here's a backtrace for the ICE.

thread 'main' panicked at 'internal error: entered unreachable code', librustc/mir/interpret/value.rs:197:61
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::panicking::default_hook::{{closure}}
             at libstd/sys_common/backtrace.rs:71
             at libstd/sys_common/backtrace.rs:59
             at libstd/panicking.rs:211
   2: std::panicking::default_hook
             at libstd/panicking.rs:227
   3: rustc::util::common::panic_hook
             at librustc/util/common.rs:54
   4: std::panicking::rust_panic_with_hook
             at libstd/panicking.rs:467
   5: std::panicking::begin_panic
             at ./src/libstd/panicking.rs:397
   6: serialize::serialize::Decoder::read_enum
             at librustc/mir/interpret/value.rs:197
             at ./src/libserialize/serialize.rs:175
             at librustc/mir/interpret/value.rs:197
             at ./src/libserialize/serialize.rs:168
   7: serialize::serialize::Decoder::read_enum
             at librustc/mir/interpret/value.rs:197
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:179
             at librustc/mir/interpret/value.rs:15
             at ./src/libserialize/serialize.rs:175
             at librustc/mir/interpret/value.rs:10
             at ./src/libserialize/serialize.rs:168
   8: serialize::serialize::Decoder::read_enum
             at librustc/mir/interpret/value.rs:10
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:179
             at librustc/middle/const_val.rs:28
             at ./src/libserialize/serialize.rs:175
             at librustc/middle/const_val.rs:25
             at ./src/libserialize/serialize.rs:168
   9: serialize::serialize::Decoder::read_enum
             at librustc/middle/const_val.rs:25
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:205
             at librustc/ty/sty.rs:1765
             at ./src/libserialize/serialize.rs:199
             at librustc/ty/sty.rs:1761
             at librustc/ty/codec.rs:263
             at librustc/ty/codec.rs:403
             at ./src/libserialize/serialize.rs:850
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:179
             at librustc/mir/mod.rs:1849
             at ./src/libserialize/serialize.rs:175
             at librustc/mir/mod.rs:1846
             at ./src/libserialize/serialize.rs:168
  10: serialize::serialize::Decoder::read_struct
             at librustc/mir/mod.rs:1846
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:205
             at librustc/mir/mod.rs:1840
             at ./src/libserialize/serialize.rs:199
  11: serialize::serialize::Decoder::read_enum
             at librustc/mir/mod.rs:1836
             at ./src/libserialize/serialize.rs:511
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:179
             at librustc/mir/mod.rs:1525
             at ./src/libserialize/serialize.rs:175
             at librustc/mir/mod.rs:1512
             at ./src/libserialize/serialize.rs:168
  12: serialize::serialize::Decoder::read_enum
             at librustc/mir/mod.rs:1512
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:179
             at librustc/mir/mod.rs:1570
             at ./src/libserialize/serialize.rs:175
             at librustc/mir/mod.rs:1567
             at ./src/libserialize/serialize.rs:168
  13: serialize::serialize::Decoder::read_enum
             at librustc/mir/mod.rs:1567
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:179
             at librustc/mir/mod.rs:1221
             at ./src/libserialize/serialize.rs:175
             at librustc/mir/mod.rs:1218
             at ./src/libserialize/serialize.rs:168
  14: serialize::serialize::Decoder::read_struct
             at librustc/mir/mod.rs:1218
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:205
             at librustc/mir/mod.rs:1199
             at ./src/libserialize/serialize.rs:199
  15: serialize::serialize::Decoder::read_seq
             at librustc/mir/mod.rs:1196
             at ./src/libserialize/serialize.rs:563
             at ./src/libserialize/serialize.rs:248
             at ./src/libserialize/serialize.rs:563
             at ./src/libserialize/serialize.rs:245
  16: serialize::serialize::Decoder::read_struct
             at ./src/libserialize/serialize.rs:560
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:205
             at librustc/mir/mod.rs:686
             at ./src/libserialize/serialize.rs:199
  17: serialize::serialize::Decoder::read_seq
             at librustc/mir/mod.rs:683
             at ./src/libserialize/serialize.rs:563
             at ./src/libserialize/serialize.rs:248
             at ./src/libserialize/serialize.rs:563
             at ./src/libserialize/serialize.rs:245
  18: <rustc::mir::Mir<'tcx> as serialize::serialize::Decodable>::decode::{{closure}}
             at ./src/libserialize/serialize.rs:560
             at ./src/librustc_data_structures/indexed_vec.rs:350
             at ./src/libcore/ops/function.rs:223
             at ./src/libserialize/serialize.rs:205
             at librustc/mir/mod.rs:79
  19: <rustc::ty::maps::queries::optimized_mir<'tcx> as rustc::ty::maps::config::QueryDescription<'tcx>>::try_load_from_disk
             at ./src/libserialize/serialize.rs:199
             at librustc/mir/mod.rs:75
             at librustc/ty/maps/on_disk_cache.rs:506
             at librustc/ty/maps/on_disk_cache.rs:396
             at librustc/ty/maps/on_disk_cache.rs:342
             at librustc/ty/maps/config.rs:702
  20: rustc::ty::maps::plumbing::<impl rustc::ty::context::TyCtxt<'a, 'gcx, 'tcx>>::get_query
             at librustc/ty/maps/plumbing.rs:440
             at librustc/ty/maps/plumbing.rs:406
             at librustc/ty/maps/plumbing.rs:603
             at librustc/ty/maps/plumbing.rs:610
  21: rustc::ty::maps::plumbing::<impl rustc::dep_graph::dep_node::DepNode>::load_from_on_disk_cache
             at librustc/ty/maps/plumbing.rs:780
             at librustc/ty/maps/plumbing.rs:773
             at librustc/ty/maps/plumbing.rs:1189
  22: rustc::dep_graph::graph::DepGraph::exec_cache_promotions
             at librustc/dep_graph/graph.rs:815
  23: rustc::ty::context::tls::with_context::{{closure}}
             at ./src/librustc/ty/maps/on_disk_cache.rs:203
             at ./src/librustc/dep_graph/graph.rs:166
             at ./src/librustc/ty/context.rs:1725
             at ./src/librustc/ty/context.rs:1666
             at ./src/librustc/ty/context.rs:1724
             at ./src/librustc/dep_graph/graph.rs:165
             at ./src/librustc/ty/context.rs:1770
  24: rustc::util::common::time
             at ./src/librustc/ty/context.rs:1761
             at ./src/librustc/ty/context.rs:1770
             at ./src/librustc/dep_graph/graph.rs:159
             at ./src/librustc/ty/maps/on_disk_cache.rs:173
             at ./src/librustc/ty/context.rs:1340
             at librustc_incremental/persist/save.rs:256
             at ./src/librustc/util/common.rs:166
             at ./src/librustc/util/common.rs:160
  25: rustc_incremental::persist::save::save_in
             at librustc_incremental/persist/save.rs:255
             at librustc_incremental/persist/save.rs:39
             at librustc_incremental/persist/save.rs:120
  26: rustc::util::common::time
             at librustc_incremental/persist/save.rs:37
             at ./src/librustc/util/common.rs:166
             at ./src/librustc/util/common.rs:160
  27: rustc_incremental::persist::save::save_dep_graph
             at librustc_incremental/persist/save.rs:36
             at ./src/librustc/dep_graph/graph.rs:166
             at ./src/librustc/ty/context.rs:1725
             at ./src/librustc/ty/context.rs:1666
             at ./src/librustc/ty/context.rs:1724
             at ./src/librustc/dep_graph/graph.rs:165
             at ./src/librustc/ty/context.rs:1770
             at ./src/librustc/ty/context.rs:1761
             at ./src/librustc/ty/context.rs:1770
             at ./src/librustc/dep_graph/graph.rs:159
             at librustc_incremental/persist/save.rs:30
  28: rustc_codegen_llvm::base::codegen_crate
             at librustc_codegen_llvm/base.rs:955
             at librustc_codegen_llvm/base.rs:946
  29: <rustc_codegen_llvm::LlvmCodegenBackend as rustc_codegen_utils::codegen_backend::CodegenBackend>::codegen_crate
             at librustc_codegen_llvm/lib.rs:204
  30: rustc_driver::driver::phase_4_codegen
             at librustc_driver/driver.rs:1247
             at ./src/librustc/util/common.rs:166
             at ./src/librustc/util/common.rs:160
             at librustc_driver/driver.rs:1247
  31: rustc_driver::driver::compile_input::{{closure}}
             at librustc_driver/driver.rs:317
  32: rustc::ty::context::tls::enter_context
             at librustc_driver/driver.rs:1231
             at ./src/librustc/ty/context.rs:1748
             at ./src/librustc/ty/context.rs:1725
             at ./src/librustc/ty/context.rs:1666
             at ./src/librustc/ty/context.rs:1724
  33: <std::thread::local::LocalKey<T>>::with
             at ./src/librustc/ty/context.rs:1747
             at ./src/librustc/ty/context.rs:1714
             at ./src/libstd/thread/local.rs:294
             at ./src/libstd/thread/local.rs:248
             at ./src/librustc/ty/context.rs:1706
             at ./src/libstd/thread/local.rs:294
             at ./src/libstd/thread/local.rs:248
  34: rustc::ty::context::TyCtxt::create_and_enter
             at ./src/librustc/ty/context.rs:1698
             at ./src/librustc/ty/context.rs:1736
             at ./src/librustc/ty/context.rs:1178
  35: rustc_driver::driver::compile_input
             at librustc_driver/driver.rs:1141
             at librustc_driver/driver.rs:276
  36: rustc_driver::run_compiler_with_pool
             at librustc_driver/lib.rs:551
  37: syntax::with_globals
             at librustc_driver/lib.rs:472
             at librustc_driver/driver.rs:72
             at librustc_driver/lib.rs:471
             at /home/mw/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/scoped-tls-0.1.1/src/lib.rs:155
             at ./src/libsyntax/lib.rs:96
             at /home/mw/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/scoped-tls-0.1.1/src/lib.rs:155
             at ./src/libsyntax/lib.rs:95
  38: rustc_driver::monitor::{{closure}}
             at librustc_driver/lib.rs:462
             at librustc_driver/lib.rs:1695
             at librustc_driver/lib.rs:180
             at librustc_driver/lib.rs:1609
  39: __rust_maybe_catch_panic
             at libpanic_unwind/lib.rs:105
  40: std::panicking::try
             at ./src/libstd/panicking.rs:289
  41: rustc_driver::run
             at ./src/libstd/panic.rs:374
             at librustc_driver/lib.rs:1541
             at librustc_driver/lib.rs:1608
             at librustc_driver/lib.rs:179
  42: rustc_driver::main
             at librustc_driver/lib.rs:1688
  43: std::rt::lang_start::{{closure}}
             at ./src/libstd/rt.rs:74
  44: std::panicking::try::do_call
             at libstd/rt.rs:59
             at libstd/panicking.rs:310
  45: __rust_maybe_catch_panic
             at libpanic_unwind/lib.rs:105
  46: std::panicking::try
             at libstd/panicking.rs:289
  47: std::rt::lang_start_internal
             at libstd/panic.rs:374
             at libstd/rt.rs:58
  48: main
  49: __libc_start_main
  50: _start

michaelwoerister · 2018-05-24T15:54:33Z

OK, I think this is the issue @oli-obk has been warning about all along. The code does the following:

1. It decodes AllocAtPos at 1234, which caches 1234.
2. Later it directly decodes Alloc at 1234 (where it was embedded the first time) and hits the cache, leaving the decoder positioned at the allocation when it is expected to be positioned after it.
3. Decoding the data after the allocation thus fails.

oli-obk · 2018-05-24T19:18:08Z

Oh right. This has nothing to do with recursive allocs, just with a different order of encoding and decoding (of completely unrelated objects that contain the same AllocId).

oli-obk · 2018-05-24T19:21:11Z

Though without recursion you can cache the end position, too, and everything should work (It did for the old impl, just died on recursion then, because you didn't know the end position yet when you got to the actual allocation)
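A minimal sketch of the "cache the end position too" idea, for the non-recursive case only. The byte format (`TAG_ALLOC`, `TAG_AT_POS`, one-byte lengths and positions) and the `Decoder` type are invented for illustration and are unrelated to rustc's real encoding. On a cache hit for an inline record, the decoder jumps to the cached end position so the data after the allocation can still be read.

```rust
use std::collections::HashMap;

const TAG_ALLOC: u8 = 0; // followed by: len, then `len` payload bytes
const TAG_AT_POS: u8 = 1; // followed by: position of a TAG_ALLOC record

struct Decoder<'a> {
    data: &'a [u8],
    cursor: usize,
    /// record start -> (payload, position just past the record)
    cache: HashMap<usize, (Vec<u8>, usize)>,
}

impl<'a> Decoder<'a> {
    fn new(data: &'a [u8]) -> Self {
        Decoder { data, cursor: 0, cache: HashMap::new() }
    }

    fn read_u8(&mut self) -> u8 {
        let byte = self.data[self.cursor];
        self.cursor += 1;
        byte
    }

    fn decode_alloc(&mut self) -> Vec<u8> {
        let start = self.cursor;
        match self.read_u8() {
            TAG_AT_POS => {
                // Jump to the referenced record, then come back.
                let target = self.read_u8() as usize;
                let resume = self.cursor;
                self.cursor = target;
                let payload = self.decode_alloc();
                self.cursor = resume;
                payload
            }
            TAG_ALLOC => {
                let hit = self.cache.get(&start).cloned();
                if let Some((payload, end)) = hit {
                    // Cache hit on an inline record: skip past it.
                    self.cursor = end;
                    return payload;
                }
                let len = self.read_u8() as usize;
                let payload = self.data[self.cursor..self.cursor + len].to_vec();
                self.cursor += len;
                self.cache.insert(start, (payload.clone(), self.cursor));
                payload
            }
            tag => panic!("unknown tag {}", tag),
        }
    }
}
```

Because the end position is only known once a record has been read completely, this approach has nothing to skip to while the record is still being decoded, which is where recursion breaks it.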

oli-obk · 2018-05-24T19:22:44Z

Oh that won't work for parallel decoding :(

Any ideas @Zoxc ?

michaelwoerister · 2018-05-24T19:35:50Z

I think we should stick to the table based approach. I'm sure it can be made to work with parallel decoding.

oli-obk · 2018-05-25T05:43:27Z

src/librustc/mir/interpret/mod.rs

+                // Create an id which is not fully loaded
+                (tcx.alloc_map.lock().reserve(), false)
+            });
+            if fully_loaded || !local_cache(decoder).insert(alloc_id) {


What happens if one thread starts decoding, the next thread takes over the CPU, gets here for the same AllocId, skips over and tries to access the allocation? It'll ICE about uncached alloc or error with dangling pointer, right?

In the fully_loaded case this isn't a problem since the AllocId has an Allocation assigned. For the !local_cache(decoder).insert(alloc_id) case, we know that some stack frame above us will assign an AllocId before the result will be used. Since local_cache is thread local another thread won't see the value inserted here. It may instead decode the same allocation in parallel.

Ah, neat. Please make this explanation a comment on that if statement
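The reasoning in this thread can be sketched as follows. Everything here is a simplified stand-in: `Global` plays the role of the shared `tcx.alloc_map`, the `local` set plays the role of the thread-local `local_cache`, and `graph` is a hypothetical encoding in which `graph[pos]` lists the positions an allocation points to.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

type AllocId = usize;

/// Hypothetical shared state (stand-in for `tcx.alloc_map`).
#[derive(Default)]
struct Global {
    /// position -> (reserved id, fully_loaded)
    by_pos: HashMap<usize, (AllocId, bool)>,
    /// id -> ids of the allocations it points to
    allocs: HashMap<AllocId, Vec<AllocId>>,
    next_id: AllocId,
}

fn decode(
    pos: usize,
    graph: &[Vec<usize>],
    global: &Mutex<Global>,
    local: &mut HashSet<AllocId>, // per-thread, never shared
) -> AllocId {
    let (id, fully_loaded) = {
        let mut g = global.lock().unwrap();
        match g.by_pos.get(&pos).copied() {
            Some(entry) => entry,
            None => {
                // Reserve an id which is not fully loaded yet.
                let id = g.next_id;
                g.next_id += 1;
                g.by_pos.insert(pos, (id, false));
                (id, false)
            }
        }
    }; // lock released before recursing

    // Either the allocation already exists, or a stack frame above us on
    // this thread is decoding it and will assign it before the id is used.
    if fully_loaded || !local.insert(id) {
        return id;
    }

    let children: Vec<AllocId> = graph[pos]
        .iter()
        .map(|&p| decode(p, graph, global, local))
        .collect();

    let mut g = global.lock().unwrap();
    g.allocs.insert(id, children);
    g.by_pos.insert(pos, (id, true));
    id
}
```

Because `local` is private to one thread, a second thread reaching the same position would not take the early return; in this sketch it would simply decode the same allocation again and store an equal result.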

rust-highfive · 2018-05-25T05:55:16Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

[00:23:11]    Compiling syntax_pos v0.0.0 (file:///checkout/src/libsyntax_pos)
[00:23:15]    Compiling rustc_errors v0.0.0 (file:///checkout/src/librustc_errors)
[00:24:15]    Compiling proc_macro v0.0.0 (file:///checkout/src/libproc_macro)
[00:24:26]    Compiling syntax_ext v0.0.0 (file:///checkout/src/libsyntax_ext)
[00:26:18] thread 'main' panicked at 'internal error: entered unreachable code', librustc/mir/interpret/mod.rs:164:10
[00:26:19] 
[00:26:19] error: internal compiler error: unexpected panic
[00:26:19] 
[00:26:19] 
[00:26:19] note: the compiler unexpectedly panicked. this is a bug.
[00:26:19] 
[00:26:19] note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
[00:26:19] note: rustc 1.28.0-dev running on x86_64-unknown-linux-gnu
[00:26:19] 
[00:26:19] 
[00:26:19] note: compiler flags: -Z force-unstable-if-unmarked -C prefer-dynamic -C opt-level=3 -C prefer-dynamic -C debug-assertions=y -C link-args=-Wl,-rpath,$ORIGIN/../lib --crate-type dylib
[00:26:19] 
[00:26:19] note: some of the compiler flags provided by cargo are hidden
[00:26:19] error: Could not compile `rustc`.
[00:26:19] 
[00:26:19] Caused by:
[00:26:19] Caused by:
[00:26:19]   process didn't exit successfully: `/checkout/obj/build/bootstrap/debug/rustc --crate-name rustc librustc/lib.rs --color always --error-format json --crate-type dylib --emit=dep-info,link -C prefer-dynamic -C opt-level=3 -C metadata=62d74f21a64f8c0c -C extra-filename=-62d74f21a64f8c0c --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps --target x86_64-unknown-linux-gnu -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps -L dependency=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/release/deps --extern tempdir=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libtempdir-450feded456a4278.rlib --extern lazy_static=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/liblazy_static-3846f1b0424591fd.rlib --extern jobserver=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libjobserver-3720d8c52a6bc989.rlib --extern graphviz=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libgraphviz-f21bfea456e2feba.so --extern proc_macro=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libproc_macro-5c863390141836fe.so --extern syntax_pos=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libsyntax_pos-70b92be3dfddcce2.so --extern byteorder=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libbyteorder-270afc7a968c2570.rlib --extern flate2=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libflate2-ff77786b985e61bc.rlib --extern syntax=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libsyntax-9dea40d5c994cba1.so --extern 
rustc_errors=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/librustc_errors-e7bbb7d6e0541d97.so --extern rustc_data_structures=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/librustc_data_structures-3762ade15a64029b.so --extern bitflags=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps/libbitflags-575f47f158b62d9a.rlib --extern log=/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64own-linux-gnu/stage0/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" " jemalloc" "--manifest-path" "/checkout/src/rustc/Cargo.toml" "--message-format" "json"
[00:26:19] expected success, got: exit code: 101
[00:26:19] thread 'main' panicked at 'cargo must succeed', bootstrap/compile.rs:1091:9
[00:26:19] travis_fold:end:stage1-rustc

[00:26:19] travis_time:end:stage1-rustc:start=1527227484793947373,finish=1527227712491800476,duration=227697853103


[00:26:19] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap build
[00:26:19] Build completed unsuccessfully in 0:21:36
[00:26:19] Makefile:28: recipe for target 'all' failed
[00:26:19] make: *** [all] Error 1
70300 ./obj/build/x86_64-unknown-linux-gnu/native/jemalloc
68788 ./src/llvm/lib
65420 ./src/llvm-emscripten/test/CodeGen
61608 ./obj/build/x86_64-unknown-linux-gnu/stage0-rustc/release

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

michaelwoerister · 2018-05-25T08:26:11Z

src/librustc/mir/interpret/mod.rs

+
+            // Write placeholder for size
+            let size_pos = encoder.position();
+            0usize.encode(encoder)?;


This doesn't work because of variable-length integer encoding.

rustc has some similar code elsewhere and works around this by using a 4 byte array that the size is encoded into. Somewhat space-wasteful though.

It might be simpler to just remember the size in the global_cache during decoding?

That doesn't work, because we also need this value when another thread hasn't finished decoding the allocation yet.
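To illustrate why a `0usize` placeholder cannot simply be patched later: with a variable-length integer encoding, zero occupies one byte while the real size may need more. The sketch below uses unsigned LEB128 as the variable-length scheme and a little-endian 4-byte slot as the fixed-width workaround; the helper names and the byte order are assumptions for illustration.

```rust
/// Unsigned LEB128: 7 payload bits per byte, high bit set on all but the
/// last byte.
fn write_leb128(out: &mut Vec<u8>, mut value: usize) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Fixed-width alternative: reserve 4 bytes up front, write the payload,
/// then patch the size into the reserved slot.
fn write_with_size_prefix(out: &mut Vec<u8>, payload: &[u8]) {
    let size_pos = out.len();
    out.extend_from_slice(&[0u8; 4]); // placeholder with a known width
    out.extend_from_slice(payload);
    let size = payload.len() as u32;
    out[size_pos..size_pos + 4].copy_from_slice(&size.to_le_bytes());
}
```

The fixed slot always costs 4 bytes, even for small allocations, which is the space overhead mentioned above.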

oli-obk · 2018-05-25T08:30:19Z

src/librustc/mir/interpret/mod.rs

    match AllocKind::decode(decoder)? {
+        AllocKind::AllocAtPos => {


You could use the trick here that I had originally where you read a usize, and that tag is either 0 for Alloc, 1 for Static, 2 for Fn or anything else is the real_pos. This is also used in Ty encoding I think.
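A small sketch of that tag trick. The `AllocTag` enum and `classify` function are hypothetical names, and the sketch assumes that a real position is never 0, 1 or 2 (otherwise it would be mistaken for a tag).

```rust
#[derive(Debug, PartialEq)]
enum AllocTag {
    Alloc,
    Static,
    Function,
    /// Any other value is interpreted as the position of the allocation.
    AtPos(usize),
}

/// Interprets a single decoded `usize` as either a small tag or a position.
fn classify(raw: usize) -> AllocTag {
    match raw {
        0 => AllocTag::Alloc,
        1 => AllocTag::Static,
        2 => AllocTag::Function,
        real_pos => AllocTag::AtPos(real_pos),
    }
}
```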

oli-obk · 2018-05-25T09:49:38Z

Before @Zoxc does more work here, we should decide whether

I think we should stick to the table based approach. I'm sure it can be made to work with parallel decoding.

is an option. It is certainly the more easily grokked option. What exactly is blocking that solution from allowing parallel decoding?

Isn't the table based solution essentially equivalent to this solution but it's AllocAtPos always, there's no Alloc variant?
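For comparison, a rough sketch of what a table-based scheme could look like: the stream only ever contains an index, and a side table maps each index to the offset of the allocation body. The layout and names are assumptions for illustration and may differ from the actual rustc table.

```rust
struct TableDecoder<'a> {
    data: &'a [u8],
    pos: usize,
    /// index -> byte offset of the allocation body
    table: &'a [usize],
}

impl<'a> TableDecoder<'a> {
    fn read_u8(&mut self) -> u8 {
        let b = self.data[self.pos];
        self.pos += 1;
        b
    }

    /// Reads an index, decodes the body out of line, and restores the
    /// cursor. The cursor therefore always ends right after the index, so
    /// no skipping over an embedded allocation is ever needed.
    fn decode_alloc_ref(&mut self) -> Vec<u8> {
        let index = self.read_u8() as usize;
        let saved = self.pos;
        self.pos = self.table[index];
        let len = self.read_u8() as usize;
        let bytes: Vec<u8> = (0..len).map(|_| self.read_u8()).collect();
        self.pos = saved;
        bytes
    }
}

/// Decodes one reference and returns it together with the byte after it.
fn demo() -> (Vec<u8>, u8) {
    // [index 0][marker 99][body: len 2, bytes 10 20]
    let data = [0u8, 99, 2, 10, 20];
    let table = [2usize];
    let mut d = TableDecoder { data: &data, pos: 0, table: &table };
    let alloc = d.decode_alloc_ref();
    let next = d.read_u8();
    (alloc, next)
}
```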

michaelwoerister · 2018-05-25T11:32:46Z

@oli-obk, do you remember why exactly we switched to the table-based approach? Because you already had the skipping implemented but that may not have been sufficient for all cases.

michaelwoerister · 2018-05-25T11:36:27Z

The table-based approach also might work better if we ever want to make the cache updateable in-place. I know that was one of the reasons why we wanted it.

rust-highfive · 2018-05-25T12:02:40Z

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

bjorn3 · 2018-05-25T12:38:48Z

@TimNN @rust-highfive log is empty

oli-obk · 2018-05-25T13:33:42Z

Because you already had the skipping implemented but that may not have been sufficient for all cases.

The skipping failed for recursive cases; that's when we decided to scrap the approach for the table-based version, since iirc that's what we wanted all along.

The table-based approach also might work better if we ever want to make the cache updateable in-place

Oh yea, that would definitely require an indirection like the one used currently

oli-obk · 2018-05-25T13:34:48Z

@Zoxc

[00:24:29] thread 'main' panicked at 'assertion failed: hi >= filemap.original_start_pos && hi <= filemap.original_end_pos', librustc_metadata/decoder.rs:356:9

michaelwoerister · 2018-05-25T14:24:31Z

FYI, I'm looking into an alternative implementation of this right now.

@Zoxc

WIP: Make const decoding thread-safe. This is an alternative to #50957. It's a proof of concept (e.g. it doesn't adapt metadata decoding, just the incr. comp. cache) but I think it turned out nice. It's rather simple and does not require passing around a bunch of weird closures, like we currently do. If you (@Zoxc & @oli-obk) think this approach is good then I'm happy to finish and clean this up. Note: The current version just spins when it encounters an in-progress decoding. I don't have a strong preference for this approach. Decoding concurrently is equally fine by me (or maybe even better because it doesn't require poisoning). r? @Zoxc
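The "spin on an in-progress decoding" behaviour mentioned in that description could look roughly like the sketch below. The `State` enum and `decode_once` helper are hypothetical and are not taken from #51060.

```rust
use std::sync::Mutex;
use std::thread;

#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Empty,
    InProgress,
    Done(u64),
}

/// The first caller decodes; concurrent callers spin until the result is
/// available. If `decode` panicked, the waiters would spin forever, which
/// is why a real implementation would need some form of poisoning.
fn decode_once(slot: &Mutex<State>, decode: impl Fn() -> u64) -> u64 {
    loop {
        let mut guard = slot.lock().unwrap();
        let state = *guard;
        match state {
            State::Done(value) => return value,
            State::InProgress => {
                drop(guard);
                thread::yield_now(); // another thread is decoding
            }
            State::Empty => {
                *guard = State::InProgress;
                drop(guard); // do not hold the lock while decoding
                let value = decode();
                *slot.lock().unwrap() = State::Done(value);
                return value;
            }
        }
    }
}
```

Decoding concurrently instead, as discussed earlier in the thread, avoids the waiting (and the need for poisoning) at the cost of possibly doing the same work twice.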

bors · 2018-06-01T10:55:55Z

☔ The latest upstream changes (presumably #51060) made this pull request unmergeable. Please resolve the merge conflicts.

oli-obk · 2018-06-01T11:37:57Z

The alternative to this PR (#51060) has been merged.

rust-highfive assigned oli-obk May 22, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 22, 2018

michaelwoerister reviewed May 22, 2018

oli-obk reviewed May 22, 2018

Zoxc force-pushed the alloc-sync branch from 9d4d3e9 to 8131934 Compare May 22, 2018 20:53

Zoxc force-pushed the alloc-sync branch from 109febb to 16da5a4 Compare May 24, 2018 11:33

Zoxc added 3 commits May 25, 2018 07:27

Make AllocId decoding thread-safe

24a40da

Fixes

94eef34

Encode the allocation size so we can skip ahead

398d282

Zoxc force-pushed the alloc-sync branch from bbf53c7 to 398d282 Compare May 25, 2018 05:27

oli-obk reviewed May 25, 2018

michaelwoerister reviewed May 25, 2018

oli-obk reviewed May 25, 2018

Encode sizes using arrays

4adcc57

michaelwoerister mentioned this pull request May 25, 2018

Make const decoding thread-safe. #51060

Merged

oli-obk closed this Jun 1, 2018
