
Add documentation for the combining lock #683

Merged: 4 commits into microsoft:main, Oct 5, 2024

Conversation

mjp41 (Member) commented Oct 2, 2024

This adds some documentation to make the combining lock easier to understand. This is working towards documenting the changes for the 0.7 release.

SchrodingerZhu (Contributor) commented Oct 2, 2024

I happen to have some related documented code, in case it helps:

https://github.com/llvm/llvm-project/pull/101916/files


P.S. "MCS queue" is not the correct name for the MCS lock; I will need to correct that.

mjp41 (Member, Author) commented Oct 2, 2024

> I happen to have some related documented code, in case it helps:
>
> https://github.com/llvm/llvm-project/pull/101916/files
>
> P.S. "MCS queue" is not the correct name for the MCS lock; I will need to correct that.

Thanks for the link. If you have any suggestions for making this clearer, please do add them here. I'm thinking of refactoring the code a little more.

mjp41 (Member, Author) commented Oct 2, 2024

@SchrodingerZhu @vishalgupta97 I'd be really keen to get your feedback on this PR. Does the markdown file make sense?

SchrodingerZhu (Contributor):

Yes. I like the detailed document with performance data.

One difference in my LLVM-libc patch is that if the lock is grabbed on the first attempt, the templated lambda is called directly, before spilling anything explicitly to the stack. I think you have also refactored it in the same way in the latest commit.
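
A minimal sketch of that fast path may help. This is an assumption-laden illustration: the names `CombiningLock` and `with` are invented here, not the actual snmalloc or LLVM-libc identifiers, and the slow path is stubbed out as a plain spin so the example stays self-contained.

```cpp
#include <atomic>
#include <utility>

struct CombiningLock {
  std::atomic<bool> locked{false};

  template <typename F>
  void with(F&& f) {
    // Fast path: lock acquired on the first attempt, so the templated
    // lambda is invoked directly and no queue node is spilled to the stack.
    if (!locked.exchange(true, std::memory_order_acquire)) {
      std::forward<F>(f)();
      locked.store(false, std::memory_order_release);
      return;
    }
    // Slow path (stubbed): a real combining lock would enqueue a node
    // here and either wait or have the lock holder run `f` on our behalf.
    while (locked.exchange(true, std::memory_order_acquire)) { /* spin */ }
    std::forward<F>(f)();
    locked.store(false, std::memory_order_release);
  }
};

// Usage: lock.with([&] { shared_counter++; });
```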

> The core idea of the combining lock is that it holds a queue of operations that need to be executed while holding the lock,
> and then the thread holding the lock can execute multiple threads' operations in a single lock acquisition.
> This can reduce the number of cache misses, as a single thread performs multiple updates,
> and thus overall the system will have fewer cache misses.


Cache misses for shared data will go down, but there will still be cache misses for bringing in each queue node (which contains information about the operation to execute, e.g. a function pointer). Overall, the system might not have fewer cache misses.
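
A minimal sketch may make the trade-off concrete; the names here (`CombiningNode`, `run_queue`) are illustrative, not the snmalloc implementation. Each waiter publishes a node carrying a function pointer, and the lock holder walks the queue executing everyone's operations. Each node typically lives on another thread's stack, so loading it is itself a likely cache miss, which is the cost pointed out above.

```cpp
#include <atomic>

struct CombiningNode {
  std::atomic<int> status{0};                 // 0 = waiting, 1 = done
  std::atomic<CombiningNode*> next{nullptr};  // Next waiter in the queue.
  void (*op)(void*);                          // Operation to run under the lock.
  void* env;                                  // State captured for the operation.
};

// Executed by the lock holder. Detecting the true end of the queue
// (via the lock's tail pointer) is elided from this sketch.
inline void run_queue(CombiningNode* head) {
  for (CombiningNode* curr = head; curr != nullptr;) {
    curr->op(curr->env);  // Execute this waiter's operation.
    CombiningNode* next = curr->next.load(std::memory_order_acquire);
    // Wake the waiter only after `next` has been read: once status is
    // set, the waiter may return and its stack node may be reused.
    curr->status.store(1, std::memory_order_release);
    curr = next;
  }
}
```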


> * Back-off strategies
> * Integration with Futex/WaitOnAddress


For the queue A->B->C: before executing B, a prefetch for C's queue node can be issued, so that when the time comes to execute C, some of the latency can be hidden. A sketch of this follows.
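
A hypothetical sketch of this suggestion, reusing the `CombiningNode` type from the sketch above; `__builtin_prefetch` is the GCC/Clang intrinsic, and other compilers would need an equivalent hint.

```cpp
inline void run_queue_prefetching(CombiningNode* head) {
  for (CombiningNode* curr = head; curr != nullptr;) {
    CombiningNode* next = curr->next.load(std::memory_order_acquire);
    if (next != nullptr)
      __builtin_prefetch(next);  // Start fetching C's node early.
    curr->op(curr->env);         // Execute B while C's node is in flight.
    curr->status.store(1, std::memory_order_release);
    curr = next;
  }
}
```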


```cpp
// Notify the thread that we completed its work.
// Note that this needs to be done last, as we can't read
// curr->next after setting curr->status
```


Once the load of curr->next is done, Line 159 and Line 164 can be reordered.
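
A small illustrative helper (again reusing the `CombiningNode` sketch, not the actual code around the quoted lines) capturing why the order is fixed:

```cpp
inline CombiningNode* notify_and_advance(CombiningNode* curr) {
  // 1. Read next BEFORE notifying: after the status store, the waiter
  //    may return and its stack node may be reused or overwritten.
  CombiningNode* next = curr->next.load(std::memory_order_acquire);
  // 2. Release store: publishes the queued operation's effects to the
  //    waiter and marks its work as complete.
  curr->status.store(1, std::memory_order_release);
  // 3. `curr` must not be dereferenced again after this point.
  return next;
}
```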

> This can be bad for performance; however, with the MCS queue lock the threads spin on individual cache lines, so some of the worst effects of [spinning are mitigated](https://pdos.csail.mit.edu/papers/linux:lock.pdf).
> However, for more general-purpose use, the combining lock should have a back-off strategy that increases the time between checks the longer a thread waits.
>
> The second improvement would be to integrate the combining lock with the OS-level support for waiting.
SchrodingerZhu (Contributor) commented Oct 3, 2024

Rust's synchronization routines generally spin a fixed number of times (100, a somewhat arbitrarily chosen number) before going to sleep, similar to the following:

```cpp
int remaining_spins = SPIN_COUNT;
// Do spin polling for a certain amount of time.
while (remaining_spins > 0) {
  if (header.status.load(cpp::MemoryOrder::RELAXED) !=
      RawLambdaLockHeader::WAITING)
    break;
  sleep_briefly();
  remaining_spins--;
}
// If we used up all the spins, we may need to go to sleep.
if (remaining_spins == 0) {
  FutexWordType expected = RawLambdaLockHeader::WAITING;
  if (header.status.compare_exchange_strong(
          expected, RawLambdaLockHeader::SLEEPING,
          cpp::MemoryOrder::ACQ_REL, cpp::MemoryOrder::RELAXED))
    header.status.wait(RawLambdaLockHeader::SLEEPING);
}
```

However, that was for polling a shared word. I'm not sure whether exponential back-off or other schemes would be more applicable to an MCS lock.
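
For comparison, a hedged sketch of what exponential back-off could look like for such a wait loop; `wait_with_backoff` and its constants are invented for illustration, and whether this beats fixed-count spinning for MCS-style nodes (each already spinning on a private cache line) is exactly the open question above.

```cpp
#include <atomic>
#include <thread>

inline void wait_with_backoff(const std::atomic<int>& status) {
  int pause = 1;
  constexpr int kMaxPause = 1 << 10;
  while (status.load(std::memory_order_acquire) == 0 /* WAITING */) {
    for (int i = 0; i < pause; i++)
      std::this_thread::yield();  // Or a cheaper pause/spin-hint instruction.
    if (pause < kMaxPause)
      pause *= 2;  // Exponential back-off between polls.
  }
}
```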

mjp41 merged commit c770769 into microsoft:main on Oct 5, 2024
52 checks passed
mjp41 deleted the combininglockdoc branch on November 19, 2024