Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the necessary changes to support concurrency in Miri. #70598

Merged
merged 8 commits into from
Apr 20, 2020

Conversation

vakaras
Copy link
Contributor

@vakaras vakaras commented Mar 31, 2020

This pull request makes the necessary changes to the Rust compiler to allow Miri to support concurrency:

  1. Move stack from the interpretation context (InterpCx) to machine, so that the machine can switch the stacks when it changes the thread being executed.
  2. Add the callbacks that allow the machine to generate fresh allocation ids for each thread local allocation and to translate them back to original allocations when needed. This allows the machine to ensure the property that allocation ids are unique, which allows using a simpler representation of the memory.

r? @oli-obk

cc @RalfJung

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 31, 2020
@RalfJung
Copy link
Member

Oh wow, thanks a lot! I expected this to take way longer. ;)

I'd like to review this, too, before it lands. But it may take a week or two until I find the time.
Some notes for now:

  • Could you prepare a Miri PR that just makes Miri work again as before with the Machine changes? I'd like to not block "getting Miri to work again" on Implement basic support for concurrency (Linux/macos only) miri#1284.
  • Could you describe at a high level what the plan is for data races? You didn't mention them, so I guess Miri will not detect data races even when executing concurrent programs. That is a rather big problem, as it means Miri will happily execute such code but miss the UB. At the very least, this needs to be called out in the README; maybe there should even be a warning emitted when the first thread is spawned, or so.
  • I don't understand what you said here about TLS. I see no reason why any translation should be needed. The different threads just each make their own allocation for their thread-local storage, and have different allocation IDs for that, and all is fine. This is also what happens on real machines: different thread's TLS has different physical addresses. This is important as one thread can take a pointer to its TLS and send that to another thread.

@vakaras
Copy link
Contributor Author

vakaras commented Mar 31, 2020

Oh wow, thanks a lot! I expected this to take way longer. ;)

I tried to make a minimal first step. There is still a lot of work to do, but my hope is that all of it will be on the Miri side.

* Could you prepare a Miri PR that just makes Miri work again as before
with the `Machine` changes? I'd like to not block "getting Miri to work
again" on
[rust-lang/miri#1284](https://github.com/rust-lang/miri/pull/1284).

Currently, each PR I create needs to get an approval from our legal department, which may take up to a week. If you think that it is still worth doing this, please let me know.

* Could you describe at a high level what the plan is for data races? You
didn't mention them, so I guess Miri will not detect data races even when
executing concurrent programs. That is a rather big problem, as it means
Miri will happily execute such code but miss the UB. At the very least,
this needs to be called out in the README; maybe there should even be a
warning emitted when the first thread is spawned, or so.

Data races are currently not supported. I think that adding a vector-clock based data race detection to all memory accesses is likely to significantly degrade the performance. I will add a comment to README and a warning as you suggest.

By the way, do stacked borrows rule out data-races except on UnsafeCell?

* I don't understand what you said here about TLS. I see no reason why
any translation should be needed.

I started learning about TLS only when I started working on this PR, so it is very likely that I do not understand something.

The different threads just each make their own allocation for their
thread-local storage, and have different allocation IDs for that, and all is
fine. This is also what happens on real machines: different thread's TLS has
different physical addresses. This is important as one thread can take a
pointer to its TLS and send that to another thread.

Could you please point me to the code that is supposed to do this thread-local allocation? The reason why I changed canonical_alloc_id to generate new allocation ids for thread locals is that get_raw_mut seems to get an allocation id that is a GlobalAlloc::Static and is the same for all threads. If we used this static id as a key in alloc_map (relevant code), all threads would get the same allocation, which is not what we want. One option would be to make the alloc_map aware of the thread local storage, but this is a big change (this was the first design I tried) and it does not have a property that allocation ids are unique, which, for example, breaks stacked borrows.

Is my reasoning clear and do you see what I missed?

@bjorn3
Copy link
Member

bjorn3 commented Mar 31, 2020

By the way, do stacked borrows rule out data-races except on UnsafeCell?

With raw pointers it is possible to create data-races even with stacked borrows.

@RalfJung
Copy link
Member

RalfJung commented Mar 31, 2020

Currently, each PR I create needs to get an approval from our legal department, which may take up to a week. If you think that it is still worth doing this, please let me know.

oO

The thing is, I want to avoid a situation where Miri does not work with rustc master. The Miri-side PR will likely take a while to review as it is much bigger, and new shims just tend to be a lot of effort (plus there is usually some first-time-contributor effect). If we cannot land this one here without breaking Miri or blocking on that big review, we have to wait until both PRs are completely reviewed before landing anything. This might mean more rebasing for you on the rustc side.

I have a preference for being able to land independent changes, but if you are fine with doing the rebases and not landing the rustc PR until both PRs are fully reviewed, I could live with that.

I hope you don't need to wait another week for each adjustment we request in this PR?

By the way, do stacked borrows rule out data-races except on UnsafeCell?

If you also exclude raw pointers (as pointed out by @bjorn3), I think so, but have not formally proven it.

Could you please point me to the code that is supposed to do this thread-local allocation? The reason why I changed canonical_alloc_id to generate new allocation ids for thread locals is that get_raw_mut seems to get an allocation id that is a GlobalAlloc::Static and is the same for all threads. If we used this static id as a key in alloc_map (relevant code), all threads would get the same allocation, which is not what we want. One option would be to make the alloc_map aware of the thread local storage, but this is a big change (this was the first design I tried) and it does not have a property that allocation ids are unique, which, for example, breaks stacked borrows.

You clearly know more about TLS than I do; I didn't even know "thread-local statics" were a thing. When I referred to thread-local storage I was thinking of the APIs we are already shimming in Miri via shims/tls.rs.
So, LLVM has some support for marking a static as "thread-local", and we now have to emulate that somehow in Miri? Well that will be fun.^^ In that case I agree something needs to happen on the machine level. Is there good documentation somewhere about how thread-local statics work?

@bjorn3
Copy link
Member

bjorn3 commented Mar 31, 2020

So, LLVM has some support for marking a static as "thread-local", and we now have to emulate that somehow in Miri?

Yes

#![feature(thread_local)]

#[thread_local]
static A: u8 = 0;

pub fn get_a_ref() -> *const u8 {
    &A
}
target triple = "x86_64-unknown-linux-gnu"

@_ZN10playground1A17hfc698d88543494bfE = internal thread_local constant <{ [1 x i8] }> zeroinitializer, align 1, !dbg !0

; playground::get_a_ref
; Function Attrs: nonlazybind uwtable
define i8* @_ZN10playground9get_a_ref17h42f05c46255e0587E() unnamed_addr #0 !dbg !11 {
start:
  ret i8* getelementptr inbounds (<{ [1 x i8] }>, <{ [1 x i8] }>* @_ZN10playground1A17hfc698d88543494bfE, i32 0, i32 0, i32 0), !dbg !15
}

In cg_clif the logic for tls can be found at https://github.com/bjorn3/rustc_codegen_cranelift/blob/c2e35604474aa25479dcca231fbc393a941234da/src/constant.rs#L64 to ensure that the const pointing to the static is directly codegened as taking the reference of the static rather than loading the from an allocation that contains the pointer. And at https://github.com/bjorn3/rustc_codegen_cranelift/blob/c2e35604474aa25479dcca231fbc393a941234da/src/constant.rs#L251 the static allocation is marked as thread local.

Is there good documentation somewhere about how thread-local statics work?

One of the things I used while implementing TLS for Cranelift was https://akkadia.org/drepper/tls.pdf. It describes the ABI level of TLS for ELF executables.

@bjorn3

This comment has been minimized.

@RalfJung

This comment has been minimized.

@bjorn3

This comment has been minimized.

@vakaras
Copy link
Contributor Author

vakaras commented Mar 31, 2020

I have a preference for being able to land independent changes, but if you are fine with doing the rebases and not landing the rustc PR until both PRs are fully reviewed, I could live with that.

I am fine with doing rebases.

@vakaras vakaras force-pushed the add-threads-cr3 branch 2 times, most recently from 89adcb6 to ce0a8ab Compare March 31, 2020 20:54
@rust-highfive

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Apr 1, 2020

☔ The latest upstream changes (presumably #70672) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 2, 2020

If we implement #70685 before this PR the interactions will become much clearer, but we can also punt that to the future and merge this PR. I'm fine either way

@bors
Copy link
Contributor

bors commented Apr 3, 2020

☔ The latest upstream changes (presumably #70726) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk
Copy link
Contributor

oli-obk commented Apr 16, 2020

Also I'd prefer if you could await my review before landing this. :)

Sorry, I assumed since the last comments were on documentation and these were addressed, that this was ready to go.

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you so much for working on this and doing some drive-by fixes (frame_idx).

r=me with the last nit fixed.

@RalfJung
Copy link
Member

@bors r=oli-obk,RalfJung

@bors
Copy link
Contributor

bors commented Apr 16, 2020

📌 Commit cbd9222 has been approved by oli-obk,RalfJung

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 16, 2020
@bors
Copy link
Contributor

bors commented Apr 19, 2020

⌛ Testing commit cbd9222 with merge 61ec4ad1640bdbdcfc01ab4be1a9f20445d6c823...

@Dylan-DPC-zz
Copy link

@bors retry yield

@bors
Copy link
Contributor

bors commented Apr 19, 2020

⌛ Testing commit cbd9222 with merge c6b55ee...

@bors
Copy link
Contributor

bors commented Apr 20, 2020

☀️ Test successful - checks-azure
Approved by: oli-obk,RalfJung
Pushing c6b55ee to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 20, 2020
@bors bors merged commit c6b55ee into rust-lang:master Apr 20, 2020
bors added a commit to rust-lang/miri that referenced this pull request Apr 20, 2020
Move the stack to the evaluator. (no-op PR for 70598)

The changes to Miri to make it to compile with Rustc PR rust-lang/rust#70598.
bors added a commit to rust-lang/miri that referenced this pull request Apr 20, 2020
Move the stack to the evaluator. (no-op PR for 70598)

The changes to Miri to make it to compile with Rustc PR rust-lang/rust#70598.
bors added a commit to rust-lang/miri that referenced this pull request Apr 20, 2020
Move the stack to the evaluator. (no-op PR for 70598)

The changes to Miri to make it to compile with Rustc PR rust-lang/rust#70598.
@jonhoo
Copy link
Contributor

jonhoo commented Apr 20, 2020

Oh, this looks really exciting from the title of the PR. Does this mean that thread::spawn will now "just work", or are the more missing pieces?

@RalfJung
Copy link
Member

RalfJung commented Apr 20, 2020

Yes it will just work (once rust-lang/miri#1284 lands). :) UB checks for synchronization primitives are incomplete, and there is no data race detection, but the code runs at least.

Just check the testcases in rust-lang/miri#1284. :D

@jonhoo
Copy link
Contributor

jonhoo commented Apr 20, 2020

That's amazing! I can't wait to try this for evmap and for flurry. Or for that matter, for tokio 🎉

bors added a commit to rust-lang/miri that referenced this pull request Apr 30, 2020
Implement basic support for concurrency (Linux/macos only)

Changes  (most new code is in `src/threads.rs` and `src/shims/foreign_items/posix.rs`):

1. Move the stack from `Machine` to a newly created `Thread` struct.
2. Add a `ThreadSet` struct that manages the threads.
3. Change `canonical_alloc_id` to create a unique allocation id for each thread local and thread (the responsible struct is `ThreadLocalStorage`)
4. Change the code to execute the thread local destructors immediately when a thread terminates.
5. Add the most basic round-robin scheduler.

This pull request depends on [these changes to the compiler](rust-lang/rust#70598).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants