Unify `pika::spinlock` and `pika::concurrency::detail::spinlock` into one implementation #672

Conversation
bors try

Performance test report: pika performance comparison.
However: this is worth some more investigation before we go ahead with this...
try: Build failed.
This is also new: https://cdash.cscs.ch/test/80173581.

I think the performance regression actually comes from #670, not this PR. I'm going to revert that one instead for the moment.
Force-pushed from d5515cd to b74968e.
I rebased this on main after #670 was reverted and the performance test is back to normal. I would go ahead with this. However, there may still be some spinlocks that cover sections that are too big (there are some timeouts in the tests now).
bors try

try: Build failed.

bors try

try: Build failed.
I think this is now in better shape. I still need to rerun benchmarks in DLA-Future, but the tests seem happier. In the last commit I disabled another place where yielding could happen (in …).
Force-pushed from 737786b to 0e21aa2.
Rebased after #679 was merged.

bors try
try: Build failed.
bors merge
672: Unify `pika::spinlock` and `pika::concurrency::detail::spinlock` into one implementation r=msimberg a=msimberg

Fixes #517, i.e. uses a non-yielding spinlock everywhere a spinlock was used previously. The implementations were almost identical, but used slightly different yielding strategies. The yielding one (`pika::spinlock`) used `yield_while`, allowing yielding of the user-level thread. The non-yielding one (`pika::concurrency::detail::spinlock`) slept the OS thread after one iteration. This now uses the `pika::spinlock` strategy everywhere, except that yielding is disallowed when spinning. The name `pika::spinlock` is removed and only the `detail` one remains.

Note that we used to have _three_ spinlock implementations. Now we still have two. The remaining one is a very basic one with near-zero dependencies (no lock registration, no ITT support) that is still used in things like the configuration maps and resource partitioner.

This doesn't change performance in either direction in DLA-Future, but it's a prerequisite for having a semi-blocking barrier (eth-cscs/DLA-Future#833).

We will need to make corresponding changes in pika-algorithms since it uses `pika::spinlock` in a few places.

Co-authored-by: Mikael Simberg <mikael.simberg@iki.fi>

Build failed.
bors merge

Build failed.
689: Fix deadlocks in `condition_variable` r=msimberg a=msimberg

This fixes deadlocks that started appearing after #672, which disabled yielding for `spinlock`. (One of) the deadlock(s) was the following scenario:

| thread 1 | thread 2 |
|-|-|
| wait_until | |
| take lock | |
| add self to cv queue | |
| release lock | |
| timed suspend | |
| | notify_all |
| | take lock |
| timed resume | attempt to set thread 1 to pending |
| attempt to take lock | fail because thread 1 is active |
| spin trying to take lock | spin waiting for thread 1 to not be active |
| deadlock | deadlock |

This PR changes `notify_all` to not hold the lock while resuming threads that need to be woken up. I see no reason to keep the lock for that time, since there is anyway a delay between setting a thread to `pending` and the thread actually being run by a worker thread, with the latter _not_ happening under a lock already right now. This PR just relaxes that constraint further. It also significantly reduces the time the lock is held in `notify_all`. I'm quite sure this change is safe, but we'll need to continue looking out for failures in CI in case I've missed something.

I've also reverted the change in `set_thread_state` to never yield from #672. Since the lock in `notify_all` is no longer held while resuming threads, it's again safe to yield in `set_thread_state`.

I think spurious wakeups were probably possible before this change, but if they weren't, they're now definitely possible with pika's `condition_variable`.

Co-authored-by: Mikael Simberg <mikael.simberg@iki.fi>