Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement <stop_token> and jthread. #1196
Implement <stop_token> and jthread. #1196
Changes from 19 commits
a7e2c93
c8b2db3
f0554d2
a1886a0
8b26e2a
01fc42c
c28e072
f2bf64b
9cf088f
8854f4f
338249d
7aa09ca
7334bae
1a4b5cb
66a14c2
e416bbd
c38d004
b6e6e46
02d9922
d4778ed
0cbc067
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this discarded
void
cast? Did a compiler complain about_Unlock_outer
being unused?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm being paranoid about https://docs.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warning-level-4-c4189?view=vs-2019
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to take out a lock on the internal mutex before calling
notify_all()
here to avoid a missed wake-up.eg. consider in the
wait()
implementation below that the call to.request_stop()
occurs after the check of_Stoken.stop_requested()
and before the call towait(_Lck)
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
notify_all
already takes the internal mutex.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need to take the internal mutex before creating _Cb below...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would explain why this was added to cv_any and not plain cv.
Although I think there might be a spec defect here: If the registration for cancel is supposed to be part of the atomic operation that includes unlocking
_Lck
and going to sleep, the spec doesn't say that right now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is broken in the spec. I asked SG1 to comment:
This program can deadlock under the current spec; there is nothing forbidding the following execution:
T1: launches T2
T2: takes
m
T2: [thread.condvarany.intwait]/2 "Registers for the duration of this call
*this
to get notified on a stop request onstoken
during this call", stop is not requested so execution continuesT2:
while (!stoken.stop_requested()) {
T2:
if (pred())
(pred
returnsfalse
)T1:
j.~jthread
T1:
request_stop()
, callscv.notify_all()
T2:
wait(lck)
T1:
join()
Deadlock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on what causes the deadlock in this example?
I'm struggling to see where the deadlock occurs.
AFAICT, if
request_stop()
, and thuscv.notify_all()
, has returned then T1 is no longer holding any resources that T2 is waiting on.You don't want to do that. The callback might execute inline during construction of _Cb, which would then call notify_all(), which then also takes out a lock on the internal mutex. This would be UB to have the calling thread attempt to lock the mutex twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the execution I enumerated, T2 is waiting on the CV, and T1 is waiting on T2, and nothing will ever notify the CV. Note that when T1 tries to
notify_all
, T2 isn't waiting on the CV yet.Correct, T1 is not holding any such resources. T2 isn't waiting on resources. But T2 is waiting to be notified.
I see, we need something more complex :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reference implementation indeed does something a bit tricker to ensure it doesn't miss wakeup notifications.
See https://github.com/josuttis/jthread/blob/master/source/condition_variable_any2.hpp#L216-L238
It first locks the internal mutex and then checks
stoken.stop_requested()
.This prevents the missed wakeup if
request_stop()
is called between the evaluation ofpred()
andwait(lck)
.The registered callback will block until
wait()
releases the internal mutex before calling.notify_all()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm aggressively trying to not look at the reference implementation because licensing scary.
Right, that's the exact fix. But nothing in the spec requires that right now.