
less runner: actually allow concurrent predictions and refactor runner #1499

Closed
wants to merge 23 commits into async from syl/kill-runner

Conversation

@technillogue (Contributor) commented Jan 26, 2024

  • add concurrency to config (a hypothetical cog.yaml sketch follows this list)
  • this basically works!
  • more descriptive names for predict functions
  • maybe pass through prediction id and try to make cancelation do both?
  • don't cancel from signal handler if a loop is running. expose worker busy state to runner
  • move handle_event_stream to PredictionEventHandler
  • make setup and canceling work
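
For orientation, the concurrency setting might look roughly like the following in cog.yaml. This is a hypothetical sketch for illustration; the key names are assumptions, not taken from this PR:

```yaml
# hypothetical cog.yaml snippet; key names are illustrative
concurrency:
  max: 4  # upper bound on predictions one worker runs at once
```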

@technillogue changed the base branch from main to async January 26, 2024 21:55
@technillogue marked this pull request as draft January 26, 2024 21:55
@technillogue force-pushed the syl/kill-runner branch 2 times, most recently from 8c494a8 to 0b1ad74 January 29, 2024 18:50
@technillogue changed the title from "less runner" to "less runner: actually allow concurrent predictions and refactor runner" January 29, 2024
@technillogue changed the base branch from async to syl/mux January 30, 2024 06:02
@yorickvP added the async label February 8, 2024
Base automatically changed from syl/mux to async February 12, 2024 21:09
* have runner return asyncio.Task instead of AsyncFuture
* make tests async and fix them
* delete remaining runner thread code :)
* review changes to tests and server

(reverts commit 828eee9)
this is the first step towards supporting continuous batching and concurrent predictions. eventually, you will be able to configure it so that your predict function is called concurrently
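
For illustration, the kind of predictor this enables is one whose predict is a coroutine. A minimal sketch, where the body is a stand-in rather than code from this PR:

```python
import asyncio

from cog import BasePredictor


class Predictor(BasePredictor):
    async def predict(self, prompt: str) -> str:
        # a coroutine predict lets the worker's event loop interleave many
        # in-flight predictions instead of running them one at a time
        await asyncio.sleep(1)  # stands in for real async model work
        return prompt.upper()
```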

* bare minimum to support async predict
* add async tests

Signed-off-by: technillogue <technillogue@gmail.com>
technillogue and others added 5 commits February 13, 2024 02:38
* conditionally create the event loop if the predictor is async (a rough version of that check is sketched after this list), and add a path for hypothetical async setup
* don't use async for predict loop if predict is not async
* add test cases for shared loop across setup and predict + asyncio.run in setup
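
Roughly, the "is the predictor async" decision comes down to an inspect check like the following sketch; the helper name and shape are assumptions, not the code in this PR:

```python
import inspect


def needs_event_loop(predictor) -> bool:
    # async def predict is a coroutine function; async generator functions
    # cover streaming predictors that yield output incrementally
    predict = getattr(predictor, "predict", predictor)
    return inspect.iscoroutinefunction(predict) or inspect.isasyncgenfunction(predict)
```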

(reverts commit b533c6b)
* async Worker._wait and its consequences
* AsyncPipe (sketched after this list) so that we can serve the idempotent endpoint and cancellation requests instead of _wait blocking the event loop
* test_prediction_cancel can be flaky on some machines
* separate _process_list to be less surprising than isasyncgen
* sleep wasn't needed
* suggestions from review
* suggestions from review
* even more suggestions from review
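
A minimal sketch of the AsyncPipe idea, assuming it wraps a multiprocessing Connection; the class in this PR may be shaped differently:

```python
import asyncio
from multiprocessing.connection import Connection


class AsyncPipe:
    """Sketch: receive from a child-process pipe without blocking the loop."""

    def __init__(self, conn: Connection) -> None:
        self.conn = conn

    async def recv(self):
        # offload the blocking Connection.recv to the default thread pool so
        # the event loop stays free for the idempotent endpoint and cancelation
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.conn.recv)
```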

---------

Signed-off-by: technillogue <technillogue@gmail.com>
Co-authored-by: Nick Stenning <nick@whiteink.com>
* race utility for racing awaitables (sketched at the end of this commit message)
* start mux, tag events with id, read pipe in a task, get events from mux (also sketched below)
* use async pipe for async child loop
* _shutting_down vs _terminating
* race with shutdown event
* keep reading events during shutdown, but call terminate after the last Done
* emit heartbeats from mux.read
* don't use _wait. instead, setup reads event from the mux too
* worker semaphore and prediction ctx
* where _wait used to raise a fatal error, have _read_events set an error on Mux, and then Mux.read can raise the error in the right context. otherwise, the exception is stuck in a task and doesn't propagate correctly
* fix event loop errors for Python < 3.9
* keep track of predictions in flight explicitly and use that to route logs
* don't wait for executor shutdown
* progress: check for cancelation in task done_handler
* let mux check if child is alive and set mux shutdown after leaving read event loop
* close pipe when exiting
* predict requires IDLE or PROCESSING
* try adding a BUSY state distinct from PROCESSING when we no longer have capacity
* move resetting events to setup() instead of _read_events()

previously this was in _read_events because it's a coroutine that will have the correct event loop. however, _read_events actually gets created in a task, which can run *after* the first mux.read call by setup. since setup is now the first async entrypoint in worker and in tests, we can safely move it there

* state_from_predictions_in_flight instead of checking the value of semaphore
* make prediction_ctx "private"
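
As illustrative sketches of three pieces named above (signatures and bodies are assumptions, not this PR's code): a race utility that returns the first finished awaitable, a mux that routes id-tagged events to per-prediction readers, and deriving worker state from the predictions in flight:

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, TypeVar

T = TypeVar("T")


async def race(*aws: Awaitable[T]) -> T:
    # finish as soon as any awaitable does (e.g. an event vs. a shutdown
    # signal) and cancel the losers so no tasks leak
    tasks = [asyncio.ensure_future(aw) for aw in aws]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return done.pop().result()


class Mux:
    """Sketch: fan id-tagged child events out to per-prediction queues."""

    def __init__(self) -> None:
        self.queues: "defaultdict[str, asyncio.Queue[Any]]" = defaultdict(asyncio.Queue)

    def write(self, id: str, event: Any) -> None:
        # called by the _read_events task for each event read from the child
        self.queues[id].put_nowait(event)

    async def read(self, id: str):
        # each prediction handler awaits only its own events; in the real
        # code this is also where a fatal error from _read_events would be
        # re-raised so it propagates in the caller's context
        while True:
            yield await self.queues[id].get()


def state_from_predictions_in_flight(in_flight: set, capacity: int) -> str:
    # IDLE with nothing running, PROCESSING while under capacity, and a
    # distinct BUSY state once no capacity remains for new predictions
    if not in_flight:
        return "IDLE"
    return "PROCESSING" if len(in_flight) < capacity else "BUSY"
```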

don't cancel from signal handler if a loop is running. expose worker busy state to runner

Signed-off-by: technillogue <technillogue@gmail.com>
…point return the same result and fix tests somewhat

Signed-off-by: technillogue <technillogue@gmail.com>