
Add WorkerThreadPool for running synchronous work in threads #7875

Merged: 6 commits into main on Dec 15, 2022

Conversation

@zanieb (Contributor) commented Dec 13, 2022

Follow-up to #7865

Adds a worker pool class for submitting synchronous functions to run in background workers. This will eventually replace our usage of anyio's worker threads, giving us greater control over concurrent work and improved debugging.

Workers will be extended in the near future to support sending asynchronous calls back from workers to the event loop thread and sending asynchronous calls to workers running event loops.

This implementation is inspired largely by the CPython ThreadPoolExecutor implementation. Handling for several weird edge cases (mostly related to garbage collection) is replicated here. This implementation will diverge further as we capitalize on asynchronous support.

Example

import asyncio
from prefect._internal.concurrency.workers import WorkerThreadPool

def identity(x):
    return x

async def main():
    pool = WorkerThreadPool()
    future = await pool.submit(identity, 1)
    print(await future.aresult())

asyncio.run(main())

Checklist

  • This pull request references any related issue by including "closes <link to issue>"
    • If no issue exists and your change is not a small fix, please create an issue first.
  • This pull request includes tests or only affects documentation.
  • This pull request includes a label categorizing the change e.g. fix, feature, enhancement

@zanieb added the development label (Tech debt, refactors, CI, tests, and other related work) on Dec 13, 2022
netlify bot commented Dec 13, 2022

Deploy Preview for prefect-orion ready!

🔨 Latest commit d21e4cb
🔍 Latest deploy log https://app.netlify.com/sites/prefect-orion/deploys/6399fc8c5f96ca00094b6351
😎 Deploy Preview https://deploy-preview-7875--prefect-orion.netlify.app

except BaseException as exc:
    self.future.set_exception(exc)
    # Prevent reference cycle in `exc`
    self = None

Check notice (Code scanning / CodeQL): Unused local variable. Variable self is not used.
Comment on lines +130 to +132
pool = WorkerThreadPool(max_workers=1)
futures = [await pool.submit(worker.join) for worker in self._workers]
await asyncio.gather(*[future.aresult() for future in futures])
Contributor Author

This is really meta, but it's way easier than spinning up a single thread manually just for this purpose.

Collaborator

I was confused for a second but then remembered -- yeah, the ThreadPoolExecutor interface (and thus ours) is way easier to deal with.


Returns a future which can be used to retrieve the result of the function.
"""
async with self._lock:
Contributor Author

Technically, because this function never calls await, it does not need a lock here. I'm tempted to remove the lock and make this a synchronous method, but it feels safer to wait and see what we need in the future.

Collaborator

Hmmm. My instinct would be to remove the lock until we need it, but I'll defer to you here because I'm not sure what keeping it protects us against -- you may have a better idea of that.

Contributor Author

I've removed the lock in a follow-up pull request; it turns out I need submit to be synchronous.

I'd actually never considered that an async function that doesn't await anything doesn't need a lock; this was pointed out to me during review of an httpx PR :)

@peytonrunyan (Contributor) Dec 20, 2022

Double-checking my understanding here: this is because we aren't going to hand execution to another coroutine without an await, right? So as long as this object is only accessed by coroutines running within the same thread, and not by other threads, we don't have to worry about race conditions?
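
A small illustration of the point under discussion (hypothetical code, not from this PR): the event loop only switches tasks at await points, so a coroutine body containing no await runs atomically with respect to other coroutines on the same loop and needs no asyncio.Lock.

import asyncio

class Registry:
    def __init__(self):
        self._items: list[int] = []

    async def add(self, item: int) -> int:
        # No await between these statements: no other coroutine can run here
        self._items.append(item)
        return len(self._items)

async def main():
    registry = Registry()
    counts = await asyncio.gather(*(registry.add(i) for i in range(100)))
    assert sorted(counts) == list(range(1, 101))  # no lost updates

asyncio.run(main())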

@@ -0,0 +1,172 @@
import asyncio
Contributor Author

I am considering renaming this module to threads.py to clear the path for process-based workers, but I think I will defer that to the future as well.

Collaborator

The _internal path allows us the freedom to do that without worrying about compatibility, right?

Contributor Author

Yep!

@abrookins (Collaborator) left a comment

LGTM! Left a few comments, YMMV. 👍

assert len(pool._workers) == pool._max_workers


async def test_submit_reuses_idle_thread():
Collaborator

I might preemptively decorate this with the flaky decorator to get retries.

Contributor Author

In theory it should never flake :) We could also just sleep for a full second there instead of doing a busy wait, but this is a bit faster: we're just letting Python context switch so the worker can call the release method.
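
For reference, a rough sketch (an assumption about the test's shape, not the PR's actual code) of the bounded busy wait being described: repeatedly yield the GIL so the worker thread can call release() on the pool's idle semaphore, rather than sleeping a full second.

import time

def wait_for_idle_worker(pool, timeout: float = 1.0) -> bool:
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # A non-blocking acquire succeeds once a worker has called release()
        if pool._idle.acquire(blocking=False):
            pool._idle.release()  # put the permit back; we only peeked
            return True
        time.sleep(0)  # yield so Python can context switch to the worker
    return False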

pool = WorkerThreadPool()
future = await pool.submit(time.sleep, 1)
await pool.shutdown()
assert await future.aresult() is None
Collaborator

I have a mild concern that aresult() returning None in the successful case leaves room to mask an unknown failure case that would erroneously lead to the same value. I don't feel strongly about it though.

Contributor Author

👍 I could add a function that sleeps then returns a value.

assert await future.aresult() is None


async def test_shutdown_exception_during_join():
Collaborator

Where does the join happen? I'm a little confused about what this test checks.

Contributor Author

Ah, pool.shutdown joins all the threads. When called, it sends the shutdown signal to workers, then joins the threads. Here the pattern is a bit like:

--> Test: Shutdown pool
--> Pool: Sends signal to workers
--> Pool: Awaits on worker join which context switches
--> Test: Raises exception
--> Pool: Shuts down cleanly still

I wrote this while dealing with some weird issues with pool shutdown when an exception was raised. It's unclear to me how to clarify it / whether it's worth keeping.
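
A rough sketch (assumed shape, not the PR's actual test) of the sequence in the diagram: the test task raises during the context switch while shutdown is joining workers, and shutdown is still expected to complete cleanly.

import asyncio
import time

from prefect._internal.concurrency.workers import WorkerThreadPool

async def raise_after_context_switch():
    await asyncio.sleep(0)  # let pool.shutdown() signal workers and start joining
    raise RuntimeError("raised while the pool is joining workers")

async def main():
    pool = WorkerThreadPool()
    await pool.submit(time.sleep, 0.1)
    results = await asyncio.gather(
        pool.shutdown(), raise_after_context_switch(), return_exceptions=True
    )
    assert results[0] is None  # the pool still shut down cleanly
    assert isinstance(results[1], RuntimeError)

asyncio.run(main())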


work_item = self._queue.get()
if work_item is None:
    # Shutdown command received; forward to other workers and exit
    self._queue.put_nowait(None)
Collaborator

I'm wondering about a slightly different approach here, though I haven't thought it through. What if WorkerThreadPool handed a shutdown_event (Event) to WorkerThread on init? The worker checks every iteration of its run loop to see if shutdown_event is set, and if it is, run() returns. I'm slightly biased toward this approach -- assuming it actually makes sense -- to avoid attaching a semantic value to None. Up to you though!

Contributor Author

Hm, the issue is that self._queue.get() is blocking, so the worker will not do anything until it receives something in the queue. We could check an event at the end of each work item, but we would still need to push something into the queue to wake up all the workers. Both AnyIO's worker threads and the CPython thread pool executor use this model; I trust what they're up to for now :)

Collaborator

Ah, you're right -- I went looking to see how ThreadPoolExecutor handled this and missed its use of None to signal. If it's good enough for them I suppose it's good enough for us! 😂

Contributor

A bit late, but I finally feel up to speed with all of this. Double-checking my understanding from https://github.com/python/cpython/blob/1332fdabbab75bc9e4bced064dc4daab2d7acb47/Lib/asyncio/queues.py#L149

queue.get() will only ever return None after an exception, so we can use it as a signal here?

Contributor

Also double-checking: by blocking here, this keeps us from burning through CPU in this while True loop, yeah?

Contributor Author

Yes, it stops us from burning CPU.

It returns None when we put None in the queue :D; exceptions are raised rather than returned. We place None in the queue to signal shutdown and to wake up the worker, since it is otherwise blocked.
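
A condensed sketch of the pattern being discussed (shape assumed, mirroring the quoted snippet and ThreadPoolExecutor): get() blocks without burning CPU, and a single None both wakes a blocked worker and tells it to exit, forwarding the sentinel so sibling workers wake too.

import queue
import threading

def worker_loop(work_queue: queue.Queue) -> None:
    while True:
        work_item = work_queue.get()  # blocks without spinning the CPU
        if work_item is None:
            # Shutdown sentinel: forward it so sibling workers wake, then exit
            work_queue.put_nowait(None)
            return
        work_item()  # run the submitted unit of work

# Usage sketch:
q: queue.Queue = queue.Queue()
workers = [threading.Thread(target=worker_loop, args=(q,)) for _ in range(2)]
for w in workers:
    w.start()
q.put(lambda: print("hello from a worker"))
q.put(None)  # one sentinel shuts down all workers via forwarding
for w in workers:
    w.join()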



@@ -46,4 +56,23 @@ def wrapper() -> None:
        raise

    __loop.call_soon_threadsafe(wrapper)
-   return future.result()
+   return future
Collaborator

Can't say I understand what's going on here, but I'm ok with that.

@zanieb (Contributor Author) Dec 15, 2022

These utilities are kind of dumb, but basically AbstractEventLoop.call_soon_threadsafe returns a Handle, which is the most useless object around town: you can cancel it and that's it. In cases where we actually want to know what our function returned or wait for its result, we need to return something else. To accomplish this, we wrap the function that we submit to call_soon_threadsafe and use a threading Future to capture the return value.

Contributor Author

I've added some additional documentation for these functions in the next PR.
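
A condensed sketch of the utility described above (shape assumed from the diff; call_soon_in_loop is a hypothetical name): schedule a function on an event loop from another thread and hand back a concurrent.futures.Future instead of the Handle.

import asyncio
import concurrent.futures

def call_soon_in_loop(loop: asyncio.AbstractEventLoop, fn, *args) -> concurrent.futures.Future:
    future: concurrent.futures.Future = concurrent.futures.Future()

    def wrapper() -> None:
        # Runs on the event loop thread; capture the return value or the
        # exception in the thread-safe future so callers can wait on it
        try:
            future.set_result(fn(*args))
        except BaseException as exc:
            future.set_exception(exc)
            raise

    loop.call_soon_threadsafe(wrapper)
    return future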

except BaseException as exc:
    self.future.set_exception(exc)
    # Prevent reference cycle in `exc`
    self = None
Contributor

I'm confused about how this works

Contributor Author

Thanks for calling this out! This is a common pattern in CPython, but I also have no idea how it works. Let me try to find some resources.

Contributor Author

python/cpython#80111 may be helpful?

Contributor Author

I think the exception captures the frame's locals, so the future references the exception, which references the work item, which references the future, and we have a cycle. If we set self to None, the exception no longer references the work item.
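
A condensed sketch of the cycle (work item shape assumed): the exception's traceback keeps this frame's locals alive, including self, so we would get future -> exc -> frame -> self -> future.

import concurrent.futures

class WorkItem:
    def __init__(self, future: concurrent.futures.Future, fn, args):
        self.future, self.fn, self.args = future, fn, args

    def run(self) -> None:
        try:
            self.future.set_result(self.fn(*self.args))
        except BaseException as exc:
            self.future.set_exception(exc)
            # exc.__traceback__ references this frame; dropping `self` from
            # the frame's locals breaks the cycle without the GC's help
            self = None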

self._queue.put_nowait(work_item)

# Ensure there are workers available to run the work
self._adjust_worker_count()
Contributor

Would it ever be possible for work to be submitted and to error before the worker count is adjusted?

Contributor Author

I don't think the threads will context switch until after submission completes, because control isn't yielded, but even if they did, I don't think there would be significant effects.

@zanieb (Contributor Author) commented Dec 15, 2022

Thanks for the reviews!

@zanieb merged commit 36973f9 into main on Dec 15, 2022
@zanieb deleted the engine/worker branch on December 15, 2022 15:51
@serinamarie (Contributor)

If we pass a timeout to aresult() for how long we will wait for a task to complete, would it make sense to have a test around a TimeoutError, or would that just never happen?

@abrookins (Collaborator)

Good question @serinamarie!

@zanieb (Contributor Author) commented Dec 15, 2022

In aresult we pass timeout=0 to the underlying implementation because we know the future is done (from waiting for the event); if it's not done for some reason, our assumption is wrong and we should crash. If we didn't set the timeout to 0 and waited instead of crashing, we'd sneakily block the event loop.

We could do something like reach into the internals of the Future and set the done_event without setting a result, and we'd get the TimeoutError, but generally I try to avoid reaching into implementation details while writing tests.
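
A condensed sketch (assumed shape, not the PR's exact code) of the behavior described: aresult() awaits a done event and then calls the blocking result() with timeout=0, so a future that somehow isn't done raises TimeoutError loudly instead of sneakily blocking the loop.

import asyncio
import concurrent.futures

class AsyncFuture(concurrent.futures.Future):
    def __init__(self, loop: asyncio.AbstractEventLoop):
        super().__init__()
        self._done_event = asyncio.Event()
        # Done callbacks can fire on a worker thread; hop back to the loop
        self.add_done_callback(
            lambda _: loop.call_soon_threadsafe(self._done_event.set)
        )

    async def aresult(self):
        await self._done_event.wait()
        return self.result(timeout=0)  # must already be done; crash otherwise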

not self._idle.acquire(blocking=False)
and len(self._workers) < self._max_workers
):
self._add_worker()
@peytonrunyan (Contributor) Dec 20, 2022

Double-checking my understanding here. We've got the semaphore set to 0. So we go to submit work, acquire() gives us back False, we check that we have room for additional workers, we spin up a new worker, it does its work, then calls release(), incrementing our semaphore, which lets us know we have idle workers in the pool, yeah?
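
A condensed sketch expanding the quoted snippet (class shape assumed) of the scheme confirmed here: the idle semaphore starts at 0, a failed non-blocking acquire means every worker is busy so a new one is spawned up to the cap, and workers release() when they finish an item.

import threading

class PoolSketch:
    def __init__(self, max_workers: int = 4):
        self._idle = threading.Semaphore(0)  # count of idle workers
        self._workers: list = []
        self._max_workers = max_workers

    def _adjust_worker_count(self) -> None:
        # acquire(blocking=False) returns False when the count is 0,
        # i.e. no worker is idle right now
        if (
            not self._idle.acquire(blocking=False)
            and len(self._workers) < self._max_workers
        ):
            self._add_worker()

    def _add_worker(self) -> None:
        ...  # spawn a thread that runs work items and calls self._idle.release()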

@zanieb mentioned this pull request on Feb 3, 2023