
Invalid memory ordering in thread park / unpark #53366

Closed
carllerche opened this issue Aug 14, 2018 · 29 comments

@carllerche
Member

I'm pretty sure that the memory orderings used in the thread park / unpark are incorrect. They happen to work on x86 because the generated ASM is stronger than the spec requires, but they are unlikely to work on other architectures.

There are a few locations where a store is used with SeqCst assuming that this would "acquire", but this is not the case. For example here

I discovered this issue when a PR was provided to copy the Rust code into Tokio.

See the discussion here

@RalfJung
Member

I think that particular store is fine. It doesn't need to "acquire".

What really matters is that we have read a NOTIFY with "acquire" semantics, and that happened in the preceding compare_exchange.

@carllerche
Member Author

it doesn't need to "acquire".

Could you explain why you think this is true? The park / unpark API specifically needs to provide acq / rel semantics to the user of the API. When a call to park returns, it must have acquired all writes that happened before calls to unpark that resulted in the unpark.

Specifically, that line in park must synchronize with this line in unpark:

match self.inner.state.compare_exchange(EMPTY, NOTIFIED, SeqCst, SeqCst) {

That would

@RalfJung
Member

RalfJung commented Aug 15, 2018

So the point is specifically the case where this CAS fails because the value is already NOTIFIED?

            match self.inner.state.compare_exchange(EMPTY, NOTIFIED, SeqCst, SeqCst) {
                Ok(_) => return, // no one was waiting
                Err(NOTIFIED) => return, // already unparked <--
                Err(PARKED) => {} // gotta go wake someone up
                _ => panic!("inconsistent state in unpark"),
            }

That is interesting indeed. However, notice that if that CAS failed, it performed just a read. So adding an extra read in park will not change anything; only a read reading from a write would make a difference.

However, it is possible that here SeqCst actually makes the difference. I will have to read up on how exactly it affects happens-before.

@RalfJung
Member

RalfJung commented Aug 15, 2018

Looking at the formulation in http://plv.mpi-sws.org/scfix/paper.pdf, SC accesses have some extra requirements for their ordering relative to each other but do not seem to give extra synchronization otherwise.

However, I am no longer sure what you think the problem is. The line you pointed at, if it actually performs a write, will write the NOTIFIED which is later read by the thread.inner.state.compare_exchange(EMPTY, PARKED) in park. So there is a synchronization happening between the thread causing the unpark, and the waking-up thread.


However, the line I pointed at above is somewhat interesting -- and your comments in the other discussion sound like that is what you are worried about:

No, because a third thread can come along and unpark while the park fn is between 257 and 262. The bug is specifically that the park thread must acquire the memory from a third thread that calls unpark before the park thread hits line 262.

The only situation I can imagine where there is a difference is something like the following:

  • Thread 0 parks itself.
  • Thread 1 calls unpark. The call completes, state is NOTIFIED.
  • Thread 2 calls unpark -- this is a NOP because the state is already NOTIFIED. The call completes.
  • Thread 0 notices this, wakes up, proceeds to its CAS and synchronizes with thread 1. State is EMPTY.

Now thread 0 never synchronized with thread 2. However, notice that thread 0 is also never waiting for thread 2 -- so, it is equally likely that the execution proceeds as follows:

  • Thread 0 parks itself.
  • Thread 1 calls unpark. The call completes.
  • Thread 2 calls unpark, but gets descheduled immediately.
  • Thread 0 notices this, wakes up, proceeds to its CAS and synchronizes with thread 1.
  • Now thread 2 continues and completes its unpark, which sets the state back to NOTIFIED and then returns.

So, there is some kind of race here when a thread wakes up while a redundant unpark happens -- depending on the outcome of the race, the thread has a token remaining (state NOTIFIED) or not (state EMPTY). Is that the desired behavior?
(There is yet another trace where first thread 0 does the CAS to EMPTY, then thread 2 does the CAS to NOTIFIED and finally thread 0 writes EMPTY again, eating a token. But this doesn't add any more possible behaviors, I think.)

But anyway no amount of additional ordering can fix this, this is a more fundamental point in the algorithm: How to handle redundant unparks? If these are allowed, we inherently have a race because whether an unpark is redundant or not is racy. And as long as a redundantly unparking thread does not perform any writes (or barriers), there is no way it could Release any of its accesses to the formerly parked thread.

@parched
Contributor

parched commented Aug 16, 2018

Correct me if I'm wrong, but if there are 2 threads doing the unparking, then there's no way the parked thread can know which thread unparked it. So, regardless of whether every call to unpark synchronized with the parked thread, there would still need to be another form of synchronization to tell which thread called unpark. That is, the synchronization here is only needed when there is one unparker, and that case currently works fine?

@comex
Contributor

comex commented Aug 17, 2018

There's still a potential race. If Thread 2 reads NOTIFIED, it knows that Thread 0 will unpark "in the future"; however, without sequential consistency, that doesn't mean Thread 2's prior writes will necessarily be visible to Thread 0 when it does so. If Thread 0 then goes to sleep again, it may never see Thread 2's writes unless someone else happens to unpark it.

However, as far as I know, using SeqCst here is insufficient, as it doesn't guarantee any ordering with respect to other accesses that are not SeqCst.

It does work if the other accesses happen to be SeqCst: that is, if Thread 2 does a SeqCst write to some other variable before calling unpark, and Thread 0 does a corresponding SeqCst read after returning from park, then the use of SeqCst for the compare-exchange should guarantee that Thread 0 reads what Thread 2 wrote. This is something that wouldn't work with just AcqRel.
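(Not from the thread, just to make the "corresponding SeqCst accesses" point concrete: it is the classic store-buffering litmus test, sketched here in Rust. The names and the loop count are illustrative.)

```rust
use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};
use std::thread;

// Store-buffering litmus test. With SeqCst on all four accesses there is a
// single total order over them, so at least one thread must observe the
// other's store: (r1, r2) == (0, 0) is forbidden. With only Acquire/Release,
// both threads could read 0.
static X: AtomicUsize = AtomicUsize::new(0);
static Y: AtomicUsize = AtomicUsize::new(0);

fn run_once() -> (usize, usize) {
    X.store(0, SeqCst);
    Y.store(0, SeqCst);
    let t1 = thread::spawn(|| {
        X.store(1, SeqCst);
        Y.load(SeqCst) // r1
    });
    let t2 = thread::spawn(|| {
        Y.store(1, SeqCst);
        X.load(SeqCst) // r2
    });
    (t1.join().unwrap(), t2.join().unwrap())
}

fn main() {
    for _ in 0..1000 {
        let (r1, r2) = run_once();
        assert!(r1 == 1 || r2 == 1, "(0, 0) would violate SeqCst");
    }
}
```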

But the desired semantics for park/unpark are stronger. When Thread 2 calls unpark, it's supposed to guarantee that Thread 0 will 'eventually' return from park and at that point be able to see Thread 2's prior writes, even with a relaxed or acquire load. That guarantee is upheld in the 'successful CAS from EMPTY to NOTIFIED' case, but not the 'unsuccessful CAS because it was NOTIFIED already' case.

@parched
Contributor

parched commented Aug 17, 2018

But the desired semantics for park/unpark are stronger. When Thread 2 calls unpark, it's supposed to guarantee that Thread 0 will 'eventually' return from park and at that point be able to see Thread 2's prior writes.

Ok, I see the desire but I'm having trouble understanding how that could be useful. Do you have a concrete example where it's needed?

@comex
Contributor

comex commented Aug 17, 2018

For example, Thread 2 pushes something onto a lock-free queue, then unparks Thread 0; Thread 0 is running an infinite loop that alternates between calling park() and checking the queue, removing any pending items.
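That pattern might be sketched as follows (a sketch only: an mpsc channel stands in for the lock-free queue, and park_timeout is used defensively so the sketch cannot hang even with a buggy parker):

```rust
use std::sync::mpsc::channel;
use std::thread;
use std::time::Duration;

// Producer pushes work and then unparks the consumer; the consumer
// alternates between draining the queue and parking.
fn run_demo() -> Vec<u32> {
    let (tx, rx) = channel::<u32>();
    let consumer = thread::current();

    let producer = thread::spawn(move || {
        for i in 0..3 {
            tx.send(i).unwrap(); // publish the item first...
            consumer.unpark();   // ...then wake the (possibly parked) consumer
        }
    });

    let mut items = Vec::new();
    while items.len() < 3 {
        // Drain everything currently queued, then park until more arrives.
        while let Ok(item) = rx.try_recv() {
            items.push(item);
        }
        if items.len() < 3 {
            thread::park_timeout(Duration::from_millis(50));
        }
    }
    producer.join().unwrap();
    items
}

fn main() {
    assert_eq!(run_demo(), vec![0, 1, 2]);
}
```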

If the memory model is weak enough, I think it should be needed for almost any use of park/unpark. Even when a mutex is involved, it seems that mutexes don't necessarily guarantee sequential consistency (!) … So, for example, even if Thread 2 locks the mutex to write some data, and Thread 0 locks the mutex before reading it, in theory Thread 0 could acquire the mutex before Thread 2 and not see the data, even if Thread 2, after unlocking the mutex, then observes Thread 0 being in NOTIFIED state and thus supposedly not having locked it yet!

@parched
Contributor

parched commented Aug 17, 2018

But the queue would have to be designed so that the writer synchronizes with the reader; otherwise you would have a race when Thread 0 is unparked by Thread 1 and then starts reading the queue while Thread 2 is writing to it?

@RalfJung
Member

But the desired semantics for park/unpark are stronger. When Thread 2 calls unpark, it's supposed to guarantee that Thread 0 will 'eventually' return from park and at that point be able to see Thread 2's prior writes, even with a relaxed or acquire load. That guarantee is upheld in the 'successful CAS from EMPTY to NOTIFIED' case, but not the 'unsuccessful CAS because it was NOTIFIED already' case.

I agree on the analysis of the problem. Not sure about the desired API. ;)

However, this can only be a problem in cases where a thread calls unpark without knowing that the other thread is even parked. That seems like a mis-use of the parking API to me? Isn't there generally one dedicated thread responsible for waking up the parked threads, e.g. the thread that released the Mutex, or the thread that triggered the barrier, or the thread that completed initializing a sync::Once?

@carllerche
Member Author

Of course you still need the lock free queue to be synchronized. The issue is that the park thread needs to unpark after unpark is called, or it could unpark before it sees the work done on the lock free queue, thus missing the wakeup.

@RalfJung
Member

@carllerche So what is an example of a data structure that speculatively calls unpark on a thread that may or may not be parked right now, and relies on synchronization? I believe that is the only case where the issue we are discussing here actually arises.

@carllerche
Member Author

carllerche commented Aug 17, 2018

I don't understand your question. Are you asking if, given two threads A, B, where A may be blocked in park at some point, does it make sense for B to call unpark without knowing if A is currently blocked?

@RalfJung
Member

Essentially, yes. And also I am asking if you agree with my analysis above that this is the only case where the issue we are discussing here surfaces.

@carllerche
Member Author

@RalfJung that behavior is exactly the role of park / unpark. That is why park will not block if unpark was called before park was called. This behavior is relied on heavily in Tokio. This pattern has been lifted from Java and, again, the behavior is the basis for how all the concurrent data structures are built.

@comex
Contributor

comex commented Aug 18, 2018

It doesn't help that the documentation for park gives a bad metaphor:

In other words, each Thread acts a bit like a spinlock that can be locked and unlocked using park and unpark.

First of all, it's not a spinlock; it properly blocks the current thread in the OS rather than spinning. So it would be better to say "mutex".

But it doesn't really have the semantics of any kind of lock. It's closest to an auto-resetting event, except that it allows for spurious wakeups. It's also similar to a condition variable, and indeed the implementation uses a condition variable. However, like an event but unlike a condition variable, it guarantees that if the target thread is currently awake (not parked) when someone unparks it, the thread will immediately return from its next park.

For the record, just as an explanation of how events can be used like condition variables, as a "please wake up and check this other data" signal:

This avoids the need for a mutex. A condition variable must be combined with an accompanying mutex to be used correctly: if the target thread is already awake, signaling it will have no effect, but it could be awake yet just about to go back to sleep, having already checked the other data – perhaps even having already begun the call to, e.g., pthread_cond_wait. The only way to know when it's actually asleep is by acquiring the mutex, which is atomically released as part of the process of waiting on a condition variable.

With an event, the situation is sort of flipped. Signaling an event always guarantees that the thread will wake up in the future, so it can never cause a deadlock – no need to use a mutex. However, if you signal a thread which is already awake, yet currently prior to checking the other data, then it might see your writes when it checks, then call wait – which, thanks to your signal, will immediately return, causing an unnecessary second run through the loop. But from the target thread's perspective this is similar to a spurious wakeup, and not a big deal. You just have to ensure that whatever you do to "check the other data" is idempotent.

@RalfJung
Member

RalfJung commented Aug 18, 2018

@carllerche I see. It would still be interesting to have a concrete simplified example, I find it hard to imagine in the abstract.

Anyway, next question: Assuming thread A "speculatively" wakes up thread B, i.e. A doesn't actually know if B is parked. How can anything ever exploit the fact that this synchronizes? There is an inherent race here, and it is always possible that B is already completely unparked before A gets to do anything. In that case, no synchronization can possibly happen. I believe this is a possibility in every case where the missing ordering you are talking about could be relevant -- and hence I believe that adding a stronger ordering here cannot actually ever reliably incur stronger synchronization.

IOW, you are saying the problem is something like the following situation:

A begins unparking (triggered by some third thread)
B unparks A
A completes unparking
B returns from `unpark`

The issue, if I understand it correctly, is a missing happens-before between A and B in this case.

However, in every such situation, we always also have the following possible behavior:

A begins unparking
A completes unparking
B unparks A
B returns from `unpark`

In this case, clearly there is no happens-before between A and B. So if a missing happens-before is a problem in the first situation, then the algorithm is buggy even if we fix this problem, because the second situation remains a possibility. Therefore, this is not even actually a problem.

I do not see any way in which adding a happens-before in the first situation could possibly fix anything. If you are still convinced that there is a problem, I think it would be extremely helpful for this discussion if you could sketch a concrete user of park/unpark, and describe how adding this synchronization edge you are asking for helps that user. The discussion in the abstract is clearly not leading us anywhere.


This pattern has been lifted from Java and, again, the behavior is the basis for how all the concurrent data structures are built.

Certainly not "all"; structures like sync::Once know that a thread is parked when they call unpark so they are fine. Sure, some other thread could randomly call unpark. However, the clause saying that spurious wakeups are possible means that data structures anyway need a side-channel to ensure a wake-up is not spurious, and can incur the happens-before edge via that side-channel. (I think I now understand why that clause is there, despite the code explicitly not doing spurious wakeups! That might be worth documenting.)

It would be interesting to see what Java does here in terms of memory orderings. But so far, I am unconvinced that the missing happens-before edge here can ever be the cause of a problem.

@comex
Contributor

comex commented Aug 18, 2018

@RalfJung If Thread B is already 'completely unparked' when Thread A calls unpark, then indeed it may not see Thread A's writes in its current go-around, but its next call to park will return immediately. Assuming Thread B is in an infinite loop of alternately calling park and checking a data structure, it will then see the writes in its next go-around.

@comex
Contributor

comex commented Aug 18, 2018

To clarify – the failure case is not generally that Thread B reads some sort of inconsistent data, but rather that it doesn't notice Thread A's writes at all, assumes the work isn't done yet, and goes back to sleep waiting for it; but Thread A has already called unpark, so there's nothing to wake it up. In other words, a deadlock.

edit: I guess it doesn’t technically meet the definition of a deadlock; I just mean that it gets stuck :)

@parched
Contributor

parched commented Aug 21, 2018

the failure case is not generally that Thread B reads some sort of inconsistent data, but rather that it doesn't notice Thread A's writes at all, assumes the work isn't done yet, and goes back to sleep waiting for it;

Thanks, I see the problem now, so something like

use std::sync::atomic::{AtomicBool, Ordering};
use std::thread::{current, spawn, park};

static FLAG: AtomicBool = AtomicBool::new(false);

fn main() {
    let thread_0 = current();
    let _thread_1 = spawn(move || {
        FLAG.store(true, Ordering::Relaxed);
        thread_0.unpark();
    });
    
    let thread_0 = current();
    let _thread_2 = spawn(move || {
        thread_0.unpark();
    });
    
    loop {
        park();
        if FLAG.load(Ordering::Relaxed) {
            return;
        }
    }
}

should be guaranteed to exit, but it currently isn't?

Would this work, or am I forgetting something?
park:

if state.cas(NOTIFIED, EMPTY):
    return
state = PARKED
with mutex:
    while not state.cas(NOTIFIED, EMPTY):
        wait

unpark:

if state.swap(NOTIFIED) != PARKED:
    return
with mutex:
    notify

@parched
Contributor

parched commented Aug 21, 2018

@comex Actually I spotted a bug, I think it should be
park:

if state.cas(NOTIFIED, EMPTY): // optional early out
    return
if not state.cas(EMPTY, PARKED):
    state.swap(EMPTY)
    return
with mutex:
    while not state.cas(NOTIFIED, EMPTY):
        wait
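For concreteness, the revised pseudocode might translate to Rust roughly as follows (a sketch under stated assumptions: the field names, the blanket SeqCst orderings, and the Mutex/Condvar plumbing are illustrative, not the actual std code):

```rust
use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};
use std::sync::{Condvar, Mutex};

const EMPTY: usize = 0;
const PARKED: usize = 1;
const NOTIFIED: usize = 2;

// Illustrative Parker in the shape of the pseudocode above.
struct Parker {
    state: AtomicUsize,
    lock: Mutex<()>,
    cvar: Condvar,
}

impl Parker {
    fn new() -> Self {
        Parker {
            state: AtomicUsize::new(EMPTY),
            lock: Mutex::new(()),
            cvar: Condvar::new(),
        }
    }

    fn park(&self) {
        // Optional early out: consume a pending token.
        if self.state.compare_exchange(NOTIFIED, EMPTY, SeqCst, SeqCst).is_ok() {
            return;
        }
        if self.state.compare_exchange(EMPTY, PARKED, SeqCst, SeqCst).is_err() {
            // Lost a race with unpark; consume the token with an RMW so we
            // synchronize with whichever thread stored NOTIFIED.
            self.state.swap(EMPTY, SeqCst);
            return;
        }
        let mut guard = self.lock.lock().unwrap();
        while self.state.compare_exchange(NOTIFIED, EMPTY, SeqCst, SeqCst).is_err() {
            // Loop guards against spurious condvar wakeups.
            guard = self.cvar.wait(guard).unwrap();
        }
    }

    fn unpark(&self) {
        // swap is an RMW even when the value is already NOTIFIED, so every
        // unpark (redundant or not) releases its prior writes.
        if self.state.swap(NOTIFIED, SeqCst) != PARKED {
            return;
        }
        // Taking the lock ensures the parked thread is either not yet
        // waiting (it will see NOTIFIED before it waits) or is in `wait`.
        let _guard = self.lock.lock().unwrap();
        self.cvar.notify_one();
    }
}

fn main() {
    use std::sync::Arc;
    use std::time::Duration;

    let p = Arc::new(Parker::new());
    p.unpark(); // token pending...
    p.park();   // ...so this returns immediately

    let p2 = Arc::clone(&p);
    let t = std::thread::spawn(move || {
        std::thread::sleep(Duration::from_millis(10));
        p2.unpark();
    });
    p.park(); // blocks until the unpark above
    t.join().unwrap();
}
```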

@comex
Contributor

comex commented Aug 21, 2018

@parched

Thanks, I see the problem now, so something like [..] should be guaranteed to exit, but it currently isn't?

Yep, that looks right.

Would this work, or am I forgetting something?

I'm actually not sure!

In general, the redundant unpark case will look something like:

(1) Thread 2 unparks Thread 1, swaps from PARKED to NOTIFIED and proceeds to wake up Thread 1
(2) Thread 3 unparks Thread 1, swaps from NOTIFIED to NOTIFIED
(3) Thread 1, having woken up, swaps from NOTIFIED to EMPTY.

(If the swap in (2) happens earlier, it's the same scenario but swapping "Thread 2" and "Thread 3". If it happens later, then the state stays NOTIFIED and Thread 1 will wake up next time it parks.)

Any individual atomic variable is guaranteed to have a total order of modifications. And clearly, the write in (2) must happen before the write in (3). Thread 1 is guaranteed to see Thread 3's writes if the write in (2) happens before the read in (3). So the only problem is if the write in (2) occurs in between the read in (3) and the write in (3). Since (3) is doing an atomic swap, one might expect it to be impossible that anything could come between the read and the write. However, I am not completely sure whether that's the case when the write doesn't actually modify the value (because the variable already had that value). I need to do some spec spelunking.

Separately, your original version has a bug where state = PARKED could clobber it previously being set to NOTIFIED; I haven't checked whether the version from your second post properly fixes that.

@comex
Contributor

comex commented Aug 21, 2018

On second thought (and after reviewing cppreference's summary of modification order a bit more), I'm pretty sure it is sound with respect to that issue.

On the other hand, it seems a bit suboptimal that your revised version has park do two CASes in a row. It might be better to start with the existing implementation and make unpark do an extra swap only in the "already unparked" branch. Or maybe there's a way to improve your version.

I suppose this extra swap could be a swap-to-same-value like in your example. If you want to be paranoid about the issue I was worried about, there could be multiple different values that correspond to NOTIFIED, and it could swap from one to the other. However, this is probably unnecessary.

@parched
Contributor

parched commented Aug 21, 2018

On the other hand, it seems a bit suboptimal that your revised version has park do two CASes in a row. It might be better to start with the existing implementation and make unpark do an extra swap only in the "already unparked" branch.

Looking at the revised version of park now, I realise it's basically just the original version with the swap added, like @stjepang did in his PR; the only difference is locking the mutex later. So, assuming the usual case where unpark is not called at the same time as park, there is 1 CAS when notified and 2 when empty (swap never called). If you removed the early out you would have 2 CASes when notified (the explicit CAS and the swap) and 1 when empty, but I guess it's better not to optimise for that case because it's going to sleep anyway.

@parched
Contributor

parched commented Aug 21, 2018

the only difference being locking the mutex later

And doing an extra CAS before wait which is only needed because the mutex is locked later. So @stjepang's version of park is probably best.

@willmo
Contributor

willmo commented Aug 22, 2018

It looks like the troublesome store was added just recently in #51290 to avoid spurious wakeups, but those are supposed to be allowed anyway.

@parched
Contributor

parched commented Sep 13, 2018

I've created a PR to fix this here and I would appreciate a code review if anyone has the chance, thanks.

@RalfJung
Member

should be guaranteed to exit, but it currently isn't?

Thanks a lot for providing this example, that helps a lot. I think I understand the concern now. I will meditate on this a bit more to understand the effect of #51290 and #54174 and then comment in @parched's PR.

@RalfJung
Member

It looks like the troublesome store was added just recently in #51290 to avoid spurious wakeups, but those are supposed to be allowed anyway.

I think the example above was broken already before that PR. Namely, if the second spawned thread does unpark first (writing NOTIFIED), then the first thread does unpark and returns immediately because it already was notified, and then the main thread wakes up -- there is no release write that ever happened in the thread that wrote the flag, so there is no guarantee anyone will ever see that flag.

bors added a commit that referenced this issue Sep 19, 2018
Fix `thread` `park`/`unpark` synchronization

Previously the code below would not be guaranteed to exit when the
second unpark took the `return, // already unparked` path because there
was no write to synchronize with a read in `park`.

EDIT: doesn't actually require third thread
```
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread::{current, spawn, park};

static FLAG: AtomicBool = AtomicBool::new(false);

fn main() {
    let thread_0 = current();
    spawn(move || {
        thread_0.unpark();
        FLAG.store(true, Ordering::Relaxed);
        thread_0.unpark();
    });

    while !FLAG.load(Ordering::Relaxed) {
        park();
    }
}
```

I have some other ideas on how to improve the performance of `park` and `unpark` using fences, avoiding any atomic RMW when the state is already `NOTIFIED`, and also how to avoid calling `notify_one` without the mutex locked. But I need to write some micro benchmarks first, so I'll submit those changes at a later date if they prove to be faster.

Fixes #53366 I hope.