sync::Once use release-acquire access modes #52349

RalfJung · 2018-07-13T16:10:06Z

Nothing here makes a case distinction like "this happened before OR after that". All we need is to get happens-before edges whenever we see that the state/signal has been changed. Release-acquire is good enough for that.

rust-highfive · 2018-07-13T16:10:17Z

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

nagisa · 2018-07-13T21:48:07Z

src/libstd/sync/once.rs

@@ -323,7 +323,7 @@ impl Once {
                POISONED |
                INCOMPLETE => {
                    let old = self.state.compare_and_swap(state, RUNNING,
-                                                          Ordering::SeqCst);
+                                                          Ordering::AcqRel);


These could use compare_exchange and a Ordering::Relaxed for the failure ordering, I believe. Do you have a way to verify/check that?

I could try to bribe Hai or @jhjourdan into proving it. ;) But that's likely not going to happen soon. Personally I will not use Relaxed anywhere without being really sure.

gereeter · 2018-07-14T18:53:49Z

cc #27610 and previous attempts #31650 and #44331.

The conclusions there were to keep everything SeqCst since SeqCst loads (all that occur on the fast path) happen to be fast on most architectures and it is more likely to be correct. It may be more palatable now, however, in the face of the weak memory RustBelt work.

I personally strongly disagree with this decision. I think that Acquire/Release are more readable since they can just be ignored if necessary for intuition and act as a guide for a more formal proof of correctness. I find the safety gain of SeqCst meaningless because ensuring correctness requires careful reasoning, which is generally best done using transfers of ownership effected by Acquire/Release pairs, not by considering possible interleavings. Basically, I find @RalfJung's comments here and here spot on.

sfackler · 2018-07-14T19:40:21Z

r? @alexcrichton

RalfJung · 2018-07-15T00:07:11Z

It may be more palatable now, however, in the face of the weak memory RustBelt work.

Notice that I am not directly involved in this work, and they have not looked at sync::Once (yet).

alexcrichton · 2018-07-15T17:11:50Z

FWIW I'm at least personally still a novice with atomic orderings which (as mentioned by @gereeter) in previous threads is why I've proposed for leaving this as SeqCst. When dealing with concurrent code I'm at least personally trained to think about interleavings, and that to me at least does seem like the least common denominator when working with concurrent code. As @gereeter also mentioned, this I think is a reason to use SeqCst as aggressively as possible. You simply can't go wrong if you're already doing the baseling work of considering interleavings.

@RalfJung you mentioned on Reddit that SeqCst is more complicated than acquire/release, but that's along the lines of "proving you need SeqCst", right? For just using them, though, do you agree that SeqCst is the "safest choice"?

One question I'd have about this patch as-is is that it seems to differ from #44331 in a few orderings. Specifically in Drop for Finish this PR uses AcqRel while #44331 uses Release. Do we know which is correct? Was Release incorrect?

I completely agree that SeqCst is overly powerful in cases like this. This is why previously I've asked for benchmarks or information about how this provides a faster primitive. If it's significant winnings then I definitely think it's worth the tradeoff of harder-to-understand code. In particular with the case of Once there's only really one atomic ordering that matters which is the initial load as all the others are on the slow path.

Now if we literally have an academic proof for the orderings being correct I'm all for switching to those orderings! Before that, though, I'm personally hesitant to do so without motivation in terms of benchmarks. I certainly trust @RalfJung to get concurrent orderings more correct than myself though!

nagisa · 2018-07-15T17:16:35Z

Specifically in Drop for Finish this PR uses AcqRel while #44331 uses Release. Do we know which is correct? Was Release incorrect?

AcqRel will pick either Acquire or Release whichever is appropriate for the operation (load or store, respectively). So both changes are equivalent in that particular snippet of code.

RalfJung · 2018-07-15T18:47:26Z

I just noticed Jehoon did that previous PR. He certainly knows more about weak memory than I do.^^

AcqRel will pick either Acquire or Release whichever is appropriate for the operation (load or store, respectively). So both changes are equivalent in that particular snippet of code.

I don't think that is correct, e.g. AcqRel is invalid for loads:

rust/src/libcore/sync/atomic.rs

Line 1727 in 7751885

AcqRel => panic!("there is no such thing as an acquire/release load"),

AcqRel is used for read-modify-write operations to make the read acquire, and the write release. That other PR uses a weaker mode that just makes the write release, but makes the read relaxed.

However, I wish what you said was true because then one could just use AcqRel for all operations and would not have to worry about these details...

nagisa · 2018-07-15T18:51:48Z

Oh, I was looking at the wrong op (the store instead of the swap). So yeah, I’m wrong about the equivalency here.

Yeah… AcqRel on regular stores and loads would be great.

RalfJung · 2018-07-15T18:52:55Z

Yeah… AcqRel on regular stores and loads would be great.

Would that need an RFC or is a PR enough? It'd be insta-stable... so I'd guess it needs an RFC.

nagisa · 2018-07-15T18:56:48Z

Since both C++ and LLVM do not permit it, there might be a good reason for that. I would be somewhat reluctant to land such change with just a PR.

RalfJung · 2018-07-16T10:26:14Z

@alexcrichton

@RalfJung you mentioned on Reddit that SeqCst is more complicated than acquire/release, but that's along the lines of "proving you need SeqCst", right? For just using them, though, do you agree that SeqCst is the "safest choice"?

Certainly, whenever release-acquire is correct, then SeqCst is also correct. So yes, it is the "safest choice".

Specifically in Drop for Finish this PR uses AcqRel while #44331 uses Release. Do we know which is correct? Was Release incorrect?

I trust @jeehoonkang got this right. Release here means that in the following code, we cannot assume to actually see writes that happened before whatever write the swap reads from.

In particular with the case of Once there's only really one atomic ordering that matters which is the initial load as all the others are on the slow path.

Shall I change this PR to only make that an Acquire, and add a comment that we're just not bothering with the rest?

Now if we literally have an academic proof for the orderings being correct I'm all for switching to those orderings! Before that, though, I'm personally hesitant to do so without motivation in terms of benchmarks.

Fair enough. Maybe we can get @jhjourdan or Hai (don't know his GitHub handle) to do a proof. ;)

jeehoonkang · 2018-07-16T11:33:10Z

I think there are two different schools in concurrent programming: what I call the "bullet-proof" school and the "logical proof" school.

In the bullet-proof school, people know that concurrent programming is hard, and in order to make it right, they don't optimize things when not 100% sure. The meaning of "100%" may vary among people: someone thinks a lot of tests; another thinks no fine-grained concurrency; maybe SeqCst-only fine-grained concurrency; and even sometimes 100% means "never." So different projects adopt aggressive, fine-grained concurrency in different degrees, depending on the stability and performance requirement of the projects.

On the other hand, in the logical proof school, people know that concurrent programming is hard, but thanks to the recent advances of reasoning principles, it is no longer a beast. Iris and LambdaRust from @drdreyer's group (where @RalfJung is!) are notable examples of such reasoning principles, on which you can prove why certain fine-grained concurrency patterns are correct. Once an implementation is proved, we can believe the result and fearlessly deploy it.

I'm personally an advocate for the logical proof school, and I believe so is @RalfJung. That's why we're constantly proposing to lower orderings in Rust std: e.g. we believe we "know" the logical proof of Once, and it relies on release/acquire synchronization, not the interleaving nature of SeqCst. So using release/acquire is not only more performant (at least in some hardware) but also is conveying the designer's intent more clearly. Of course you may disagree with us if you don't believe the proof; anyway, we're talking about a very core foundation library of a real-world language, and we cannot be too cautious!

Probably this disagreement comes from the fact that the logical proof school has not finished its job yet: while we at academia have produced quite solid results so far, based on which I'm quite confident that using release/acquire is safe in sync::Once in particular, we still don't have a good concurrency semantics for C/C++; and we still don't have a good reasoning principle (e.g. program logic) that scales up to real-world C/C++ data structures (I know there are some unpublished results, but.. :) I don't want to ask the Rust community to be an early adopter of those results, and that's why I agreed to close #44331. And unfortunately, I think the situation has not changed so far. But thanks to the recent progress, I think we are almost "getting there" 😄

RalfJung · 2018-07-16T12:53:10Z

Well put.

I was not aware that there had been previous proposals to change the synchronization mode when I opened this PR -- so it seems likely to at least be a good idea to add a comment that prevents more people from repeating the same thing in the future. ;)

alexcrichton · 2018-07-16T14:55:41Z

Indeed well put! And to be clear, I'd be totally comfortable with a proof. @RalfJung's work and all have already found multiple bugs in libstd through proofs, so I have a good deal of confidence in them! I'm under the impression though that one doesn't exist for Once yet, though, which is why I'm hesitant to change this here.

@jeehoonkang heh I think I would definitely classify myself in the "bullet-proof" school in that case! I definitely do agree that using release/acquire can convey intent more clearly, but my worry is that I will convey myself incorrectly because I don't fully understand release/acquire. In that sense I know I can't miscommuniate if I always use SeqCst, I just communicate too much :)

@RalfJung I'd be totally down with some comments in the code about "usage of SeqCst is way too strong here, pending more work to change to release/acquire". Additionally if y'all feel that Acquire is indeed 100% correct for the first initial load I would be ok changing that. Note that, tested long ago, there is no difference in codegen between Acquire and SeqCst on either x86, arm, or aarch64. Only on powerpc was it found to generate a difference. This is definitely my biggest hesitation in the sense that in my mind we're trading "only worry about interleavings" for "worry about interleavings and atomic orderings, and it'll be faster on platforms that don't see much benchmarking anyway". (in that the possible gains for the increase in difficulty seem small as measured so far)

hanna-kruppe · 2018-07-16T15:09:50Z

Note that, tested long ago, there is no difference in codegen between Acquire and SeqCst on either x86, arm, or aarch64.

Release stores, however, do have different codegen even on x86. From my understanding, this is unlikely to matter for Once since there's no stores on the hot path, but it's still useful to keep in mind. No architecture of interest provides SC for free, not even those with the relatively strong TSO model.

RalfJung · 2018-07-17T18:22:21Z

I'm under the impression though that one doesn't exist for Once yet, though, which is why I'm hesitant to change this here.

Correct, there is no proof for Once.

I have sense then browser some more placed in libstd and seen that SeqCst is indeed used consistently and not necessary in most cases. Even Arc contains some SeqCst that probably nobody bothered to optimize. That makes me wonder whether Once is the right place for such a comment. OTOH, given that this is the third PR doing the same thing, that seems like an indication. ;)

Note that, tested long ago, there is no difference in codegen between Acquire and SeqCst on either x86, arm, or aarch64. Only on powerpc was it found to generate a difference.

I am really surprised that there is no codegen difference for the ARM architectures -- but indeed, this website lists "ldr; dmb ish" as codegen for both SeqCst and Acquire loads. (There's an alternative for acquire loads, but it seems LLVM does not consider that worth it.)

drdreyer · 2018-07-17T18:35:48Z

Correct, there is no proof for Once.

Actually, Hai and I talked about it today. We will see if it is possible to do a proof for Once without too much effort. The main technical issue has to do with the fact that there are comparisons of pointer equality, which means we have to show that any pointer we could possibly compare for equality has not been deallocated. Should be doable, but may be tricky. I don't think it has anything to do with weak memory.

RalfJung · 2018-07-17T18:44:40Z

we have to show that any pointer we could possibly compare for equality has not been deallocated

So it is not enough to rely on the non-determinism in case the pointer has been deallocated?

… it elsewhere

RalfJung · 2018-07-17T18:53:01Z

Meanwhile, I have reduced this PR to just do Acquire on the hot path and explain why we don't bother for the rest of the code.

drdreyer · 2018-07-17T18:53:39Z

So it is not enough to rely on the non-determinism in case the pointer has been deallocated?

Well, the problem is that if you do a CAS on a location that might be storing a deallocated pointer, then it might succeed even though the things being compared are not equal, and I don't know how you would reason about that. So I think you need to arrange for the location's invariant to store some knowledge that its contents are not a deallocated pointer.

alexcrichton · 2018-07-18T15:24:55Z

@bors: r+

bors · 2018-07-18T15:24:57Z

📌 Commit 3e1254d has been approved by alexcrichton

RalfJung · 2018-07-19T11:01:32Z

@nagisa

Yeah… AcqRel on regular stores and loads would be great.

RFC submitted: rust-lang/rfcs#2503

bors · 2018-07-20T02:52:26Z

⌛ Testing commit 3e1254d with merge bc14d71...

sync::Once use release-acquire access modes Nothing here makes a case distinction like "this happened before OR after that". All we need is to get happens-before edges whenever we see that the state/signal has been changed. Release-acquire is good enough for that.

bors · 2018-07-20T05:01:14Z

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing bc14d71 to master...

rust-highfive assigned sfackler Jul 13, 2018

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 13, 2018

nagisa reviewed Jul 13, 2018

View reviewed changes

rust-highfive assigned alexcrichton and unassigned sfackler Jul 14, 2018

sync::Once: Use Acquire on the hot path, and explain why we don't use…

3e1254d

… it elsewhere

RalfJung force-pushed the once branch from 9a221c7 to 3e1254d Compare July 17, 2018 18:51

bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 18, 2018

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jul 18, 2018

RalfJung mentioned this pull request Jul 19, 2018

Make AcqRel universally usable as ordering mode rust-lang/rfcs#2503

Closed

bors merged commit 3e1254d into rust-lang:master Jul 20, 2018

RalfJung deleted the once branch July 23, 2018 09:16

RalfJung mentioned this pull request Aug 15, 2018

Rewrite the park/unpark mechanism tokio-rs/tokio#528

Merged

Mark-Simulacrum mentioned this pull request Dec 7, 2019

std::sync::Once has way more barriers than necessary #27610

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sync::Once use release-acquire access modes #52349

sync::Once use release-acquire access modes #52349

RalfJung commented Jul 13, 2018

rust-highfive commented Jul 13, 2018

nagisa Jul 13, 2018 •

edited

Loading

RalfJung Jul 13, 2018 •

edited

Loading

gereeter commented Jul 14, 2018 •

edited

Loading

sfackler commented Jul 14, 2018

RalfJung commented Jul 15, 2018

alexcrichton commented Jul 15, 2018

nagisa commented Jul 15, 2018

RalfJung commented Jul 15, 2018 •

edited

Loading

nagisa commented Jul 15, 2018

RalfJung commented Jul 15, 2018 •

edited

Loading

nagisa commented Jul 15, 2018

RalfJung commented Jul 16, 2018

jeehoonkang commented Jul 16, 2018 •

edited

Loading

RalfJung commented Jul 16, 2018 •

edited

Loading

alexcrichton commented Jul 16, 2018

hanna-kruppe commented Jul 16, 2018

RalfJung commented Jul 17, 2018

drdreyer commented Jul 17, 2018

RalfJung commented Jul 17, 2018

RalfJung commented Jul 17, 2018

drdreyer commented Jul 17, 2018

alexcrichton commented Jul 18, 2018

bors commented Jul 18, 2018

RalfJung commented Jul 19, 2018

bors commented Jul 20, 2018

bors commented Jul 20, 2018

sync::Once use release-acquire access modes #52349

sync::Once use release-acquire access modes #52349

Conversation

RalfJung commented Jul 13, 2018

rust-highfive commented Jul 13, 2018

nagisa Jul 13, 2018 • edited Loading

Choose a reason for hiding this comment

RalfJung Jul 13, 2018 • edited Loading

Choose a reason for hiding this comment

gereeter commented Jul 14, 2018 • edited Loading

sfackler commented Jul 14, 2018

RalfJung commented Jul 15, 2018

alexcrichton commented Jul 15, 2018

nagisa commented Jul 15, 2018

RalfJung commented Jul 15, 2018 • edited Loading

nagisa commented Jul 15, 2018

RalfJung commented Jul 15, 2018 • edited Loading

nagisa commented Jul 15, 2018

RalfJung commented Jul 16, 2018

jeehoonkang commented Jul 16, 2018 • edited Loading

RalfJung commented Jul 16, 2018 • edited Loading

alexcrichton commented Jul 16, 2018

hanna-kruppe commented Jul 16, 2018

RalfJung commented Jul 17, 2018

drdreyer commented Jul 17, 2018

RalfJung commented Jul 17, 2018

RalfJung commented Jul 17, 2018

drdreyer commented Jul 17, 2018

alexcrichton commented Jul 18, 2018

bors commented Jul 18, 2018

RalfJung commented Jul 19, 2018

bors commented Jul 20, 2018

bors commented Jul 20, 2018

nagisa Jul 13, 2018 •

edited

Loading

RalfJung Jul 13, 2018 •

edited

Loading

gereeter commented Jul 14, 2018 •

edited

Loading

RalfJung commented Jul 15, 2018 •

edited

Loading

RalfJung commented Jul 15, 2018 •

edited

Loading

jeehoonkang commented Jul 16, 2018 •

edited

Loading

RalfJung commented Jul 16, 2018 •

edited

Loading