-
Notifications
You must be signed in to change notification settings - Fork 440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SpinLock Improvements #415
SpinLock Improvements #415
Conversation
…and be single-CPU friendly(ish)
Codecov Report
@@ Coverage Diff @@
## master #415 +/- ##
==========================================
- Coverage 94.70% 94.59% -0.11%
==========================================
Files 179 179
Lines 7666 7677 +11
==========================================
+ Hits 7260 7262 +2
- Misses 406 415 +9
|
} | ||
std::this_thread::yield(); | ||
} | ||
// Sleep and then start the whole process again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As there is an unconditional loop anyway, wondering why not do that in the last Spin-Yield instead of retrying Spin-Fast? Is there some benefit to do Spin-Fast after Spin-Yield timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that (theoretically) spin fast is on the order of 10s of ns, while yield is 100s of NS and the full sleep is 1ms, i.e. each is an order of magnitude slower. The goal is to "catch" the lock when you're scheduled and no need to give up to another thread if you only need to wait a few ns, not a big deal.
This theory could be wrong and we should validate with a benchmark the behavior we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks for coming up with these changes in improving coherent cache misses, cpu contention and better performance on single core. We may need benchmarking to come up with right values for the loop constants, and/or backoff logic ( exponential or incremental ) in subsequent iteration.
Yes, I'd like to get a set of benchmarks going. Have you seen the recent update to the specification around benchmarks? |
Not really, the only spec document I know around benchmark is the one written by @ThomsonTan |
Level-up the spinlock implementation.
std::atomic
instead ofstd::atomic_flag
for the corestd::current_thread::yield()
in a loop (single-core cpu friendly)try_lock
method (as previously requested) that does NOT do any spinning.A few caveats in the current implementation:
std::
functions as I could to keep this simple to evaluate