Add safe wrapper for atomic_compilerfence intrinsics #41092

jonhoo · 2017-04-05T19:35:35Z

This PR adds a proposed safe wrapper for the atomic_singlethreadfence_* intrinsics introduced by RFC #888. See #41091 for further discussion.

rust-highfive · 2017-04-05T19:35:49Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

sfackler · 2017-04-05T21:19:10Z

Can these be added in a normal function? It seems like the intrinsic itself would have to be placed inline, right? Will the inline pass do the right thing?

jonhoo · 2017-04-05T22:09:59Z

That's a good point. You may be right that #[inline(always)] is necessary on the wrapper. It'd be good to loop in @tari who filed the original intrinsics PR, since I don't have enough in-depth knowledge about how the LLVM integration works to say for sure.

jonhoo · 2017-04-05T22:10:49Z

Also, travis fails because I added this under a new feature, and it says

Unstable feature 'std_compiler_fences' needs to have a section in The Unstable Book

I'll hold on with adding it there until we've figured out the basics.

alexcrichton · 2017-04-05T23:48:20Z

Looks good to me!

I'd be surprised if these were required to be inlined, but I also am unfamiliar with where you'd use them and the LLVM implementation.

tari · 2017-04-06T00:09:53Z

I wouldn't think it needs to be always inlined, since if LLVM doesn't have visibility into this function it would be incorrect to optimize memory accesses around the function since it couldn't prove there are no barriers inside it (which is true of any function). If it does have visibility this will probably get inlined, which is fine.

jonhoo · 2017-04-06T01:41:11Z

@tari Yeah, that matches my expectations too.
@alexcrichton and @sfackler, I just pushed an entry for the unstable book, and in the process renamed the unstable feature to compiler_barriers to match the function name. If Travis doesn't complain, I think I'm happy with this.

parched · 2017-04-06T07:43:11Z

src/doc/unstable-book/src/compiler-barriers.md

+`v1 == 1` (since `thread2`'s store happened before its `load`, and its
+load did not see `thread1`'s store). However, the code as written does
+*not* guarantee this, because the compiler is allowed to re-order the
+store and load within each thread. To enforce this particular behavior,


The hardware can still make this reordering regardless of what the compiler does. A better example might be something like

static FLAG: AtomicUsize = ATOMIC_USIZE_INIT; // thread1: // write some stuff non atomically // need single thread fence here FLAG.store(1, Ordering::Relaxed); // thread1 signal handler if (FLAG.load(Ordering::Relaxed) == 1) { // need single thread fence here // read stuff }

Ah, you're right. In fact, even the example above is wrong for the same reason (you would need to use volatile_*). Maybe I'm finding myself agreeing more with the original RFC's conclusion that this is not something that people should generally need (at least until Rust start supporting signal handlers)...

alexcrichton · 2017-04-07T02:46:10Z

Wow thanks for the thorough documentation @jonhoo! Looks great to me, but did you want to update with @parched's comment?

jonhoo · 2017-04-07T02:54:01Z

@alexcrichton Following his comments, I'm pretty sure all the examples I gave in the entry for the book are actually wrong, because the hardware is still free to re-arrange the loads and stores (you need to use volatile_read and volatile_write to make them correct). I'll try and come up with some better text to push, though with that realization (and as mentioned in the follow-up comment to @parched), I'm not as sure any more that this is sufficiently useful to have its own safe wrapper... Thoughts on that would be welcome.

brson · 2017-04-07T03:33:29Z

Excellent docs @jonhoo. Thanks for writing them.

jonhoo · 2017-04-07T03:37:50Z

Ultimately it might actually be good to move some of the explanation/examples from the unstable book entry into the documentation of compiler_barrier. Not sure if that's something we want to do now though? It would also make the documentation for that one function quite large (which maybe is okay?).

tari · 2017-04-07T11:30:54Z

I'm not as sure any more that this is sufficiently useful to have its own safe wrapper... Thoughts on that would be welcome.

While most (possibly all?) operations you might do that need a single-threaded fence will involve unsafe code, this change also implies a future state where use of these functions is stable whereas current use of them requires a nightly compiler- that alone makes it worthwhile in my view.

Whether the new function should be unsafe or not, I'd say it should be safe because it does not directly perform any potentially-unsafe operations (even though it will almost always accompany very unsafe operations).

A few other situations where you might have preemption on a single thread might be if you're implementing green threads or handling interrupts (say, with an extern "msp430-interrupt" or extern "x86-interrupt" function) and passing state on the same CPU.

jonhoo · 2017-04-07T13:14:12Z

@tari Sorry, I meant whether adding a wrapper in std is necessary/useful. But yes, I suppose you're right. As for interrupts/green threads, I added a sentence to that effect in e6597e1.

steveklabnik · 2017-04-07T14:19:38Z

Ultimately it might actually be good to move some of the explanation/examples from the unstable book entry into the documentation of compiler_barrier. Not sure if that's something we want to do now though? It would also make the documentation for that one function quite large (which maybe is okay?).

Once it's made stable, these docs would have to go somewhere. Large docs on a function is fine with me, personally. You could also move some of it to the module itself. This seems fine for now though.

jonhoo · 2017-04-07T15:26:52Z

@steveklabnik is there a documented process for how we transition from an unstable to a stable feature?

It seems Travis is also happy. Unless there are further objections (@tari? @parched?), I think this is ready to merge.

parched

You are correct in your original first example that people would normally use volatile_* but that relies on the implicit assumption that the locations are in strongly ordered memory which is usually the case for memory mapped peripherals like you described. So your compiler barrier would have worked and been better than volatile because if your code made 2 reads of the address the compiler could've just made it 1.

I was also wondering if you had a reason for choosing the name compiler_barrier rather than single_thread_fence like LLVM calls it. I'm unsure which I prefer.

parched · 2017-04-07T17:20:19Z

src/doc/unstable-book/src/compiler-barriers.md

+semantics, the compiler may be disallowed from moving reads or writes
+from before or after the call to the other side of the call to
+`compiler_barrier`. Note that it does **not** prevent the *hardware*
+from doing such re-orderings -- for that, the `volatile_*` class of


volatile doesn't help in that regard, it's just another form of compiler barrier really.

Ah, yes, you're right. I've modified the text locally, but waiting to push until we settle on the things below.

parched · 2017-04-07T17:28:14Z

src/doc/unstable-book/src/compiler-barriers.md

+`AtomicPtr` that starts as `NULL`), it may also be unsafe for the
+compiler to hoist code using `IMPORTANT_VARIABLE` above the
+`IS_READY.load`. In that case, a `compiler_barrier(Ordering::Acquire)`
+should be placed at the top of the `if` to prevent this optimizations.


I believe you need second compiler barrier regardless of the type. Also IMPORTANT_VARIABLE doesn't need to be atomic only the variable that you are synchronzing on (IS_READY).

Actually, come to think of it, I'm not sure either variable needs to be atomic? It's convenient because Rust doesn't easily have mutable statics otherwise, but this should also work correctly even if they're both "normal" memory locations.

As for the need for the second compiler barrier, I'm not sure I see why? It's fine for the load of IMPORTANT_VARIABLE to occur before the load of IS_READY as long as IMPORTANT_VARIABLE isn't used (which it wouldn't be if IS_READY=0), no?

IS_READY needs to be atomic otherwise it could be partially written by main leaving it in some indeterminate state when the signal handler runs and tries to read it.

Yes it's fine if it isn't used, but what if it is used when IS_READY == 1

It's fine for it to be used if IS_READY == 1 — the write to IMPORTANT_VARIABLE must have succeeded, so it should contain the right value. Even if you read it before you read IS_READY.

Ah yes, I see what you mean. I was thinking in the more general case where the context could switch back again between the two. Something like an RTOS on a single core, but in that case you should just use a memory fence instead because they will basically be free on single core, and when you go to multi core your code won't break.

For a signal handler like this the whole thing is effectively atomic with respect to thread1 which also means the load of IS_READY doesn't need to be atomic either, only the store does.

I'm not sure what you mean about the pointer though. The compiler can't hoist out the pointer dereference unless it already know it's a valid pointer, otherwise just normal single threaded code would break, wouldn't it?

Ah, yes, of course, you're right. I'll remove the point about pointers.

jonhoo · 2017-04-07T19:23:39Z

With regards to the old device memory location example, I'm not sure that's quite right. The compiler barrier doesn't prevent the hardware from loading the data before storing the address (I think?), because they aren't dependent, and thus their strict ordering isn't required. You are right that volatile might not have been sufficient either though.

For compiler_barrier vs single_thread_fence, I deemed the former to be more unambiguous. To me, single_thread_fence suggests that it actually works somewhat like a fence, in that it also prevents hardware re-ordering, whereas compiler_barrier very clearly does not have such an implication.

parched · 2017-04-07T22:48:53Z

Yes I mean if volatile was sufficient (because it was known to strongly ordered memory and hence no hardware reordering) then your example should have worked too and potentially had some advantage over volatile.

Yeah, I think compiler_barrier probably is better too. Well I guess i take 'fence' to mean 'stops observable reordering' which it does for single threads. I was just wondering if single_thread_fence might be more likely to stop people accidentally using them in a multithreaded context.

parched

@jonhoo it all looks good to me now.

jonhoo · 2017-04-08T19:09:40Z

I'm guessing @sfackler is supposed to be the one to merge this?

alexcrichton · 2017-04-08T20:21:13Z

@bors: r+

bors · 2017-04-08T20:21:14Z

📌 Commit f093d59 has been approved by alexcrichton

bors · 2017-04-08T22:37:40Z

⌛ Testing commit f093d59 with merge b2d9b63...

Add safe wrapper for atomic_compilerfence intrinsics This PR adds a proposed safe wrapper for the `atomic_singlethreadfence_*` intrinsics introduced by [RFC #888](rust-lang/rfcs#888). See #41091 for further discussion.

bors · 2017-04-09T01:02:38Z

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing b2d9b63 to master...

bstrie · 2017-04-12T20:22:33Z

Pasting a concern found elsewhere:

The naming of these seems surprising: the multithreaded functions (and both the single and multithreaded intrinsics themselves) are fences, but this is a barrier. It's not incorrect, but the latter is both inconsistent with the existing functions and slightly confusing with another type in std: https://doc.rust-lang.org/std/sync/struct.Barrier.html. Alternatives would be the "obvious" compiler_fence, or C++'s atomic_signal_fence, or singlethread_fence (or fence_singlethread, etc.).

jonhoo · 2017-04-12T23:02:31Z

I'd be happy to rename it to compiler_fence. Submitted a PR for the rename over at #41262. @parched @tari ?

tari · 2017-04-13T03:33:15Z

I'm not attached to any particular name; @bstrie provides the most compelling case I've seen either way, so the change seems good.

This addresses concerns raised following the merge of rust-lang#41092. Specifically: > The naming of these seems surprising: the multithreaded functions (and > both the single and multithreaded intrinsics themselves) are fences, > but this is a barrier. It's not incorrect, but the latter is both > inconsistent with the existing functions and slightly confusing with > another type in std (e.g., `Barrier`). `compiler_fence` carries the same semantic implication that this is a compiler-only operation, while being more in line with the fence/barrier concepts already in use in `std`.

…lexcrichton Rename compiler_barrier to compiler_fence This addresses concerns raised following the merge of rust-lang#41092. Specifically: > The naming of these seems surprising: the multithreaded functions (and both the single and multithreaded intrinsics themselves) are fences, but this is a barrier. It's not incorrect, but the latter is both inconsistent with the existing functions and slightly confusing with another type in std (e.g., `Barrier`). `compiler_fence` carries the same semantic implication that this is a compiler-only operation, while being more in line with the fence/barrier concepts already in use in `std`.

rust-highfive assigned sfackler Apr 5, 2017

jonhoo mentioned this pull request Apr 5, 2017

Add singlethreaded fence intrinsics. #24833

Merged

Add safe wrapper for atomic_singlethreadfence_*

2598e45

jonhoo force-pushed the std-fence-intrinsics branch from 4deac84 to 2598e45 Compare April 5, 2017 19:47

alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Apr 5, 2017

Add unstable book entry

f6d262a

jonhoo force-pushed the std-fence-intrinsics branch from 807f236 to f6d262a Compare April 6, 2017 07:37

Point to tracking issue, not PR

5c6f7fa

parched suggested changes Apr 6, 2017

View reviewed changes

Correct book examples for hardware re-ordering

5202ac5

jonhoo added 2 commits April 7, 2017 09:11

rustdoc needs space after # to ignore

8fc3ab2

Mention interrupts and green threads

e6597e1

parched suggested changes Apr 7, 2017

View reviewed changes

Address @parched's comments

f093d59

parched approved these changes Apr 8, 2017

View reviewed changes

bors merged commit f093d59 into rust-lang:master Apr 9, 2017

jonhoo mentioned this pull request Apr 12, 2017

Rename compiler_barrier to compiler_fence #41262

Merged

Add safe wrapper for atomic_compilerfence intrinsics #41092

Add safe wrapper for atomic_compilerfence intrinsics #41092

Conversation

jonhoo commented Apr 5, 2017 • edited Loading

rust-highfive commented Apr 5, 2017

sfackler commented Apr 5, 2017

jonhoo commented Apr 5, 2017

jonhoo commented Apr 5, 2017 • edited Loading

alexcrichton commented Apr 5, 2017

tari commented Apr 6, 2017

jonhoo commented Apr 6, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexcrichton commented Apr 7, 2017

jonhoo commented Apr 7, 2017

brson commented Apr 7, 2017

jonhoo commented Apr 7, 2017

tari commented Apr 7, 2017

jonhoo commented Apr 7, 2017

steveklabnik commented Apr 7, 2017

jonhoo commented Apr 7, 2017

parched left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jonhoo commented Apr 7, 2017

parched commented Apr 7, 2017

parched left a comment

Choose a reason for hiding this comment

jonhoo commented Apr 8, 2017

alexcrichton commented Apr 8, 2017

bors commented Apr 8, 2017

bors commented Apr 8, 2017

bors commented Apr 9, 2017

bstrie commented Apr 12, 2017

jonhoo commented Apr 12, 2017 • edited Loading

tari commented Apr 13, 2017

jonhoo commented Apr 5, 2017 •

edited

Loading

jonhoo commented Apr 5, 2017 •

edited

Loading

jonhoo commented Apr 6, 2017 •

edited

Loading

jonhoo commented Apr 12, 2017 •

edited

Loading