Should our locks (and semaphores, queues, etc.) be fair? #54
Comments
Noticed while experimenting with TLS support: an interesting use case for fair mutexes is when multiple tasks are sharing a single connection – code like

```python
data = some_sansio_state_object.send(...)
async with connection_lock:
    await connection.sendall(data)
```

is correct (sends the data in the correct order) iff the lock is fair, i.e. it hands itself off to waiting tasks in FIFO order. Of course, in this case we could just as well have written it as

```python
async with connection_lock:
    data = some_sansio_state_object.send(...)
    await connection.sendall(data)
```

but if you have a protocol where you can unexpectedly find yourself with data to send (e.g. doing a receive on a protocol that has pings, like websockets or http/2, or TLS 1.2 and earlier with their renegotiation stuff), and don't want to hold the send mutex unnecessarily (e.g. b/c you're trying to handle the receive end of the connection!), then this seems like a plausible (albeit rather tricky!) approach:

```python
some_sansio_state_object.do_something()
if some_sansio_state_object.has_data():
    # pull it out of the shared state before yielding
    data = some_sansio_state_object.get_data()
    async with connection_lock:
        await connection.sendall(data)
```

Note that the above requires strict FIFO fairness, which is incompatible with doing clever things like letting higher-priority tasks skip ahead, as suggested in this comment (but some sort of priority-inheritance thing could work, I guess). Though I guess for this particular use case and WFQ-style priorities, there's an argument for lack of priority inheritance, i.e. just letting the slow task block the others that are sharing the same connection, as a kind of connection-wise group scheduling.
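A minimal sketch of the ordering argument (the `FakeSansIO` object and helper names here are hypothetical stand-ins, not from the thread): each task fixes its position in the byte stream synchronously when it calls `send()`, so transmission is only correct if the lock then wakes waiters in that same order.

```python
import trio

class FakeSansIO:
    """Hypothetical stand-in for a sans-io protocol object shared by several tasks."""
    def __init__(self):
        self.counter = 0

    def send(self, payload):
        # Serialization happens synchronously, so the intended wire order is
        # fixed here, *before* anyone touches the connection lock.
        self.counter += 1
        return f"{self.counter}:{payload}\n".encode()

async def sender(name, proto, lock, wire):
    data = proto.send(name)      # position in the byte stream fixed here
    async with lock:             # correct only if waiters are woken in FIFO order
        wire.append(data)        # stand-in for `await connection.sendall(data)`
        await trio.sleep(0)      # checkpoint while "sending"

async def main():
    proto, lock, wire = FakeSansIO(), trio.Lock(), []
    async with trio.open_nursery() as nursery:
        for name in ["a", "b", "c"]:
            nursery.start_soon(sender, name, proto, lock, wire)
    print(wire)  # with a fair lock, frames come out in serialization order

trio.run(main)
```

With strict FIFO handoff the printed frames are 1, 2, 3; with a barging lock, a task that serialized earlier could be overtaken while it waits for the lock, and its bytes would hit the wire out of order.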
The SSL thing makes me nervous: if people start relying on […] (And in particular, if we do implement WFQ then we'd probably want the implementations of […])
@njsmith I think the locks being fair (FIFO order) is a great thing, especially when using Trio with libs like httpx, since it keeps a task's pending request from being starved and timing out.
I was unaware Locks are fair and can now delete a lot of code. I really like the […] being the default. I'd suggest creating […]
Currently, `trio._sync` and `trio._core._parking_lot` go to some effort to make our synchronization primitives fair, in the sense that e.g. when a lock is released it always goes to the next task that requested it, in FIFO order. This is attractive because it avoids starvation issues where one task just can't get a lock. The most obvious example would be when you have two tasks running a loop like the one sketched below, where they keep releasing the lock and then immediately reacquiring it. Currently, the two tasks will alternate round-robin. If we allow barging, then in the naive implementation the task that just released the lock will deterministically succeed in re-acquiring it, so the lock will never change hands. Of course this isn't the most representative example, but it illustrates the general idea.
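A sketch of the kind of loop being described (hypothetical code, standing in for the snippet referenced above): two tasks that do nothing but take and drop the same lock.

```python
import trio

async def worker(name, lock, order):
    # Each iteration releases the lock at the end of the `async with` block
    # and then immediately tries to reacquire it at the top of the next one.
    for _ in range(5):
        async with lock:
            order.append(name)

async def main():
    lock = trio.Lock()
    order = []
    async with trio.open_nursery() as nursery:
        nursery.start_soon(worker, "a", lock, order)
        nursery.start_soon(worker, "b", lock, order)
    # With a fair lock the acquisitions alternate: a b a b a b ...
    # With a naive barging lock, the task that just released would win the
    # reacquire every time: a a a a a b b b b b (the lock never changes hands).
    print(" ".join(order))

trio.run(main)
```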
Nonetheless, the conventional wisdom is that fairness is actually bad (!!): […]
It's not clear to me whether the conventional arguments apply here, though. AFAICT the biggest reason why fairness is bad is that it prevents the very common situation where locks are acquired and then released within a single scheduling quantum, without contention. In this case barging is great; on a uniprocessor system where a lock can always be acquired/released without pre-empting in between, barging entirely eliminates contention. Wow! And these kinds of micro-critical-sections that just protect a few quick operations are very common in traditional preemptive code. But... for us, micro-critical-sections often don't need locks at all! So this case, which may dominate the traditional analysis, basically doesn't exist for us. Hmm. Also, our context switching has very different characteristics than theirs, in terms of the relative costs of different operations. I don't feel like I have a very clear sense of how this plays out in practice, though -- pypy tracing might change things a lot, it might matter a lot whether the scheduler gets smart enough to start eliding `yield_briefly`s (and if so, what policy it uses to do so), etc. And another important consideration is that we can potentially have deep integration between our locking primitives and our scheduler, which might let us avoid some of the specific pathologies that often hit user-level locking primitives trying to interact with a kernel scheduler. (In particular, this comment talks about how we might be able to define fairness such that the task that we hand off to is always the next task to be run.)

The answer here may also vary for different primitives. In particular, fairness for locks and fairness for queues seem rather different. I guess for queues fairness is useful if you're multiplexing a set of inputs with backpressure into `put`, or using `get` to multiplex a set of outputs into different channels, because it ends up being fair across those input/output channels?
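For the queue case, a sketch of the multiplexed-producers-with-backpressure scenario. The `put`/`get` above refer to the old `trio.Queue`; the sketch below uses today's `trio.open_memory_channel` as a stand-in, with hypothetical producer/consumer names.

```python
import trio

async def producer(name, send_chan):
    # Several producers multiplex into one bounded channel; with a single
    # buffer slot they spend most of their time blocked in send(), which is
    # exactly where wakeup fairness across producers shows up.
    async with send_chan:  # close this clone when the producer is done
        for i in range(5):
            await send_chan.send((name, i))

async def consumer(receive_chan):
    async with receive_chan:
        async for item in receive_chan:  # ends once every send clone is closed
            print(item)
            await trio.sleep(0.01)  # slow consumer -> producers hit backpressure

async def main():
    send_chan, receive_chan = trio.open_memory_channel(1)
    async with trio.open_nursery() as nursery:
        nursery.start_soon(consumer, receive_chan)
        for name in ["a", "b", "c"]:
            nursery.start_soon(producer, name, send_chan.clone())
        send_chan.close()  # only the producers' clones keep the channel open

trio.run(main)
```

If wakeups are fair, each slot freed by the slow consumer goes to the producer that has waited longest, so the output interleaves across a/b/c rather than letting one producer drain its whole run before the others get a turn.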