-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
handle spurious wake ups in BarrierSemaphore #82
Conversation
@@ -127,21 +134,29 @@ class BarrierSemaphore { | |||
void unsafe_reset(uint32 semCount) { | |||
int prev = counter_; | |||
counter_ = semCount; | |||
if (prev < 0) { semCond_.notify_all(); } | |||
if (prev < 0) { | |||
notify_ = wait_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the mutex be unlocked before notification?
@rkaminsk Thanks for the PR. Looks like the
This could wake up Thread 1, which would now return from the Anyhow, now that I have looked at this again, I think the whole class is unnecessary and misguided. So, instead of fixing I drafted something here: https://github.com/potassco/clasp/tree/issue-80 Could you maybe check whether this works for you? |
I only have a general idea how the parallel solving works. I don't know what the points are when synchronization requests are happening. My assumptions was that spurious wake ups were not handled correctly. This sounds rather like a logical error. It did not happen in my tests.
It runs fine in my virtual machine with your patch. |
I'm sorry if my answer sounded like criticism. It was not meant like this. Your investigation is spot on and greatly appreciated as it would probably have taken me another couple of weeks to dig deep enough. I'm also sure that your suggested fix would work in this context. It's just that your investigation made me realize that the existing I'll close this PR now but thanks again for solving the actual issue 👍 |
No worries. No criticism or upset feelings. I just did not expect a logical error from you (meant as high praise). But then parallel code is super hard. 😉 Thanks for taking the time to fix this! |
This PR updates the
BarrierSemaphore
class to correctly handle spurious wake ups of the wait function of the condition variable. One could combine the two counters into one but this probably does not matter performance wise because there is just one object per clasp instance.TODO: I have seen that there are further usages of conditional variables in the code which might exhibit the same problem. I'ld could have a look to but would like to know what you think about this patch first.