Feature to stop accepting new tasks on existing threads #42709

Closed · wants to merge 9 commits

Conversation

janrous-rai

This feature allows operators to stop the scheduling of new tasks on specific threads. Those threads effectively become high priority: they only handle tasks that are explicitly scheduled on them or tasks that are already running there.

This feature can be used to guarantee responsiveness for certain critical low-latency tasks by isolating them from the bulk of the Julia workload, which will run on the other threads. We have observed that this eliminates most of the scheduling issues and slowdowns caused by tasks occupying threads for excessive amounts of time (this is particularly problematic with sticky tasks in 1.6, where a single long-running task scheduled on the same thread as a critical periodic task can cause major delays).

That said, low-latency work is still subject to occasional hiccups as a result of the stop-the-world pauses that happen during garbage collection and compilation.

When running in multi-threaded mode, avoid picking up tasks from the shared heaps when you are thread 1 (tid == 0). When single-threaded, enq_work should place the work in the workqueue of thread tid = 1 rather than randomly trying to avoid the first thread. A thread needs to call the jl_stop_accepting_new_tasks_on_this_thread method in order to stop taking new work from the shared queues.

Our own code should then call this on the specific thread that runs rai-server and schedules all the background tasks.
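For illustration, here is a minimal sketch of how an operator might wire this up. The helper and do_critical_work below are hypothetical, the zero-argument signature of jl_stop_accepting_new_tasks_on_this_thread is an assumption, and the task pinning follows the same pattern Base's @threads uses internally on 1.6:

```julia
# Hypothetical sketch, assuming a Julia 1.6 build that carries this patch.
# Pin a task to thread `tid`, then mark that thread as non-accepting so it
# stops pulling work from the shared queues.
function run_on_dedicated_thread(f, tid::Integer)
    t = Task() do
        # Assumed signature: no arguments, no return value.
        ccall(:jl_stop_accepting_new_tasks_on_this_thread, Cvoid, ())
        f()
    end
    t.sticky = true                                          # never migrate off this thread
    ccall(:jl_set_task_tid, Cvoid, (Any, Cint), t, tid - 1)  # thread ids are 0-based in C
    schedule(t)
    return t
end

# Keep thread 2 free for a latency-critical periodic job.
critical = run_on_dedicated_thread(2) do
    while true
        do_critical_work()   # placeholder for the latency-sensitive work
        sleep(0.01)
    end
end
```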
tkf (Member) commented Oct 19, 2021

It'd be nice to have a more compositional API. If a library function internally uses @spawn, calling it on a "non-accepting" worker thread would just make the worker thread idle.

Also, this API cannot be used in a library since there's no way to know that other libraries are calling this function. So, I think it makes more sense to allocate a dedicated thread pool on startup so that the latency-critical part of the application can use it. It'll remove the cost of this feature when not used.

IMHO, an API like this needs to address the problem of priority inversion, as I mentioned in #41586 (comment).

As an aside, I think it's straightforward to avoid a lock if we don't allow making the primary thread to be non-accepting. Also, it seems to me that the current implementation has a data race. Since the flag is read outside the lock, you'd need to use atomic store/load on the flag, even if it's protected by the lock.
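For illustration, the locking/atomicity point can be sketched in Julia terms (names here are made up; the real flag would be a field in the C runtime, not Julia state): writers can stay serialized by the lock, but because the scheduler's fast path reads the flag without taking the lock, the flag itself must be atomic.

```julia
# Illustrative only; this mirrors the pattern, not the PR's actual C code.
const accepting = Threads.Atomic{Bool}(true)   # atomic, so the lock-free read below is race-free
const flag_lock = ReentrantLock()

function stop_accepting!()
    lock(flag_lock) do
        accepting[] = false    # atomic store, still serialized with other writers by the lock
    end
end

# Scheduler fast path: atomic load with no lock held.
should_take_shared_work() = accepting[]
```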

JeffBezanson added the multithreading (Base.Threads and related functionality) label on Oct 20, 2021
vtjnash (Member) commented Oct 21, 2021

I like the general direction, though I need to review in more detail later. I think you can add this flag to the ptls state struct in julia_threads.h.

janrous-rai (Author)

> It'd be nice to have a more compositional API. If a library function internally uses @spawn, calling it on a "non-accepting" worker thread would just make the worker thread idle.

> Also, this API cannot be used in a library since there's no way to know that other libraries are calling this function. So, I think it makes more sense to allocate a dedicated thread pool on startup so that the latency-critical part of the application can use it. It'll remove the cost of this feature when not used.

Yeah, this is basically intended for setup-once situations where operators of a long-running server will mark certain threads as "high priority" and schedule the latency-sensitive work on them. It should not really be used by libraries because that could cause all sorts of problems.

This is somewhat analogous to having two thread-pools, but less integrated with the built-in scheduler.

We do actually need this kind of behavior; we have been running our server with this patch for quite some time now, and it has helped with a lot of the responsiveness issues we were experiencing before. Ultimately, in a complex codebase where some of the code is user-supplied or generated, proper cooperation between tasks is infeasible (not to mention that there don't seem to be any tools for identifying tasks that are not cooperative, which makes debugging these problems intractable).

We might achieve this behavior with threadpools, but the conversation in the PR that implements that feature seems to be hitting some ideological design disagreements, so I'm not sure what will come out of it.

I think that the architecturally cleanest solution here would be to have an API that would allow us to spawn a task in a dedicated thread that will only exist while the task is running and will be shut down when the task completes. This way we could spin off lightweight threads running critical work and ensure that they won't be blocked by bulk workloads (those not latency sensitive but potentially large in volume and potentially not yielding). This does imply that we might oversubscribe the physical machine, but this is an acceptable tradeoff and we may still deal with that by scaling down the regular thread count for the bulk workloads.

I have not really thought much about what it would take to implement this dedicated-thread-per-task approach, or whether it would integrate cleanly with the rest of the system. In particular, dynamically adding and removing threads seems like it may be quite a challenge.

> IMHO, an API like this needs to address the problem of priority inversion, as I mentioned in #41586 (comment).

I'm not sure about the assumption that all transitive tasks need to stay on the dedicated thread. Our critical tasks are generally pretty simple and they don't do much concurrency, so we are probably assuming that this is best-effort and that if something calls @spawn, the low latency is no longer guaranteed.

As a counter-example, we run an HTTP server on the high-priority thread, ensuring that we can always accept incoming HTTP connections. For long-running user requests, we call @spawn precisely to move the bulk processing off of the high-priority thread and onto the regular threads.
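A rough sketch of that pattern (illustrative names only, not code from this PR): the accept loop stays on the high-priority thread, while each request's heavy processing is spawned back onto the regular threads.

```julia
using Sockets

# Runs on the high-priority (non-accepting) thread so accept() stays responsive.
function serve(port)
    server = listen(port)
    while true
        sock = accept(server)
        Threads.@spawn handle_request(sock)   # bulk processing migrates to regular threads
    end
end

function handle_request(sock)
    try
        # ... long-running, non-latency-critical request processing ...
    finally
        close(sock)
    end
end
```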

> As an aside, I think it's straightforward to avoid a lock if we don't allow making the primary thread to be non-accepting. Also, it seems to me that the current implementation has a data race. Since the flag is read outside the lock, you'd need to use atomic store/load on the flag, even if it's protected by the lock.

KristofferC force-pushed the backports-release-1.6 branch 2 times, most recently from 1a719fb to 94d6407, on November 12, 2021
KristofferC deleted the branch JuliaLang:release-1.6 on November 13, 2021
vtjnash (Member) commented Nov 13, 2021

Why the close?

KristofferC (Member) commented Nov 13, 2021

This PR is against the backports-release-1.6 branch, which was merged and auto-deleted, causing this PR to close automatically.

vtjnash (Member) commented Nov 13, 2021

Ah, it needs to be re-created against master then, since it cannot be reopened.

tkf reopened this on Nov 13, 2021
tkf changed the base branch from backports-release-1.6 to master on November 13, 2021
tkf (Member) commented Nov 13, 2021

I just created a dummy backports-release-1.6, reopened this PR, changed the PR base, and then removed backports-release-1.6.

NHDaly (Member) commented Jan 2, 2022

Thanks for managing that, @tkf!

tkf (Member) commented Jan 4, 2022

(I just realized that my hack made it impossible to review the patch. The diff is easy to see at https://github.com/JuliaLang/julia/compare/1b93d53fc4bb59350ada898038ed4de2994cce33..e5cce258ea722884cb14d018213cef60da48b79c)

> some ideological design disagreements

This is a pretty good way to summarize my complaints :)

But, from the practical point of view, I think this PR might be better than #42302 as it's very (and surprisingly) minimal. Maybe that's why Jameson liked the approach.

> an API that would allow us to spawn a task in a dedicated thread that will only exist while the task is running and will be shut down when the task completes. This way we could spin off lightweight threads running critical work and ensure that they won't be blocked by bulk workloads

Booting up Julia's runtime environment (ptls) for a thread does not seem like a "lightweight" thing to do (at least intuitively). I'm also not entirely sure if it's easy to tear it down in the middle of the process lifetime. I think it would be "easier" to reuse the thread pool as I just sketched in #43649.

> I'm not sure about the assumption that all transitive tasks need to stay on the dedicated thread. Our critical tasks are generally pretty simple and they don't do much concurrency, so we are probably assuming that this is best-effort and that if something calls @spawn, the low latency is no longer guaranteed.

It's easy to set up a Channel{Task} where a consumer in a non-dedicated worker thread schedules the tasks on the non-dedicated worker threads. On the other hand, if we want to receive results from the non-dedicated threads and then do the work based on the result as soon as possible, we'd need to spawn a "waiter" task on the dedicated thread. I prefer this approach also because it lets us stick with the structured concurrency paradigm.

To put it another way, I think we'd better stick with the "black box rule" (i.e., whether or not a function spawns tasks is an implementation detail) to ensure composability, even with tasks on dedicated worker threads. If every function call is a potential cause of priority inversion, I think it'd be very hard to use library functions.
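For concreteness, a small sketch of that Channel{Task} hand-off (all names are illustrative): tasks created on the dedicated thread are pushed into a channel and scheduled by a consumer running on a regular worker thread, so they never execute on the dedicated thread itself.

```julia
# Illustrative sketch of the hand-off described above.
const offloaded = Channel{Task}(Inf)

# Consumer; spawned once on a regular (accepting) worker thread.
function offload_consumer()
    for t in offloaded
        schedule(t)          # the task starts on the regular threads, not the dedicated one
    end
end

# Called from the dedicated thread. Waiting on the returned task gives the
# "waiter" behavior: the dedicated thread reacts as soon as the result is ready.
function offload(f)
    t = Task(f)
    t.sticky = false         # allow any accepting worker thread to run it
    put!(offloaded, t)
    return t
end

# Usage: on a regular thread at startup, Threads.@spawn offload_consumer();
# on the dedicated thread, t = offload(() -> heavy_work()); fetch(t)
# (heavy_work is a placeholder for the bulk computation).
```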

NHDaly changed the base branch from master to release-1.6 on February 3, 2022
vtjnash (Member) commented Feb 7, 2022

This still needs to be rebased against master, and the flag moved into ptls

NHDaly (Member) commented Feb 18, 2022

I don't think we have any intention of merging this PR as-is; I think it was opened mainly for discussion purposes, and as an FYI that we are currently applying this patch to our Julia builds in production.

As long as the problem is addressed in some way (e.g. by the ongoing thread pools work), I don't think we'd need this PR anymore.

I'm going to mark it DO-NOT-MERGE for clarity.

NHDaly added the DO NOT MERGE (Do not merge this PR!) label on Feb 18, 2022
vtjnash (Member) commented Feb 22, 2022

Threadpools is a slight generalization of this idea (allowing multiple threads to be prevented from examining the queue), but I think this is still something we can do in the meantime, and likely will be able to continue to support long-term. Other than addressing some nits, I think we can merge (and backport) this.

tkf (Member) commented Feb 23, 2022

> Threadpools is a slight generalization of this idea (allowing multiple threads to be prevented from examining the queue), but I think this is still something we can do in the meantime, and likely will be able to continue to support long-term.

Hmm... Isn't #42302 close to ready? (I should probably look at the implementation a bit more, though.) I think expanding the API surface of the scheduler makes it harder to extend and to enable a more flexible approach like #44064. If we are planning to merge #42302, I think it's better to add only #42302 rather than combining it with this PR. Or can we at least put this in Experimental, so that it can eventually be removed once #42302 is in?

DilumAluthge removed the DO NOT MERGE (Do not merge this PR!) label on Dec 21, 2022