async release channel #1512

Draft
wants to merge 52 commits into
base: main

Commits on Jan 22, 2024

  1. run CI for this branch the same way as for main

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Jan 22, 2024
    0df9b82

Commits on Feb 21, 2024

  1. async runner (#1352)

    * have runner return asyncio.Task instead of AsyncFuture
    * make tests async and fix them
    * delete remaining runner thread code :)
    * review changes to tests and server
    
    (reverts commit 828eee9)
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Feb 21, 2024
    2eb3b48
  2. support async predict functions (#1350)

    this is the first step towards supporting continuous batching and concurrent predictions. eventually, you will be able to configure it so that your predict function is called concurrently
    
    * bare minimum to support async predict
    * add async tests
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Feb 21, 2024
    d5e6f73
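
To illustrate what an async predict function makes possible, here is a minimal sketch using only `asyncio`. The `Predictor` class and `run_concurrently` helper are hypothetical stand-ins, not cog's actual base class or runner; the point is only that several calls to one `async def predict` can overlap on a single event loop.

```python
import asyncio


class Predictor:
    """Hypothetical stand-in for a predictor class; not the real cog base class."""

    async def predict(self, prompt: str) -> str:
        # Simulate awaiting I/O or a batched model call.
        await asyncio.sleep(0.01)
        return prompt.upper()


async def run_concurrently(prompts: list[str]) -> list[str]:
    predictor = Predictor()
    # gather runs all coroutines on the same event loop and
    # returns results in the order the inputs were given.
    return await asyncio.gather(*(predictor.predict(p) for p in prompts))


results = asyncio.run(run_concurrently(["a", "b", "c"]))
```

The three calls wait on their simulated work at the same time rather than one after another, which is the behavior the commit message describes as the eventual goal.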
  3. create event loop before predictor setup (#1366)

    * conditionally create the event loop if predictor is async, and add a path for hypothetical async setup
    * don't use async for predict loop if predict is not async
    * add test cases for shared loop across setup and predict + asyncio.run in setup
    
    (reverts commit b533c6b)
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Feb 21, 2024
    8f3fe08
  4. lints

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Feb 21, 2024
    ff754f3
  5. minimal async worker (#1410)

    * async Worker._wait and its consequences
    * AsyncPipe so that we can process idempotent endpoint and cancellation rather than _wait blocking the event loop
    * test_prediction_cancel can be flaky on some machines
    * separate _process_list to be less surprising than isasyncgen
    * sleep wasn't needed
    * suggestions from review
    * suggestions from review
    * even more suggestions from review
    
    ---------
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    Co-authored-by: Nick Stenning <nick@whiteink.com>
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue and nickstenning committed Feb 21, 2024
    3112380
  6. format

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Feb 21, 2024
    3792202
  7. Mux prediction events (#1405)

    * race utility for racing awaitables
    * start mux, tag events with id, read pipe in a task, get events from mux
    * use async pipe for async child loop
    * _shutting_down vs _terminating
    * race with shutdown event
    * keep reading events during shutdown, but call terminate after the last Done
    * emit heartbeats from mux.read
    * don't use _wait. instead, setup reads event from the mux too
    * worker semaphore and prediction ctx
    * where _wait used to raise a fatal error, have _read_events set an error on Mux, and then Mux.read can raise the error in the right context. otherwise, the exception is stuck in a task and doesn't propagate correctly
    * fix event loop errors for <3.9
    * keep track of predictions in flight explicitly and use that to route logs
    * don't wait for executor shutdown
    * progress: check for cancelation in task done_handler
    * let mux check if child is alive and set mux shutdown after leaving read event loop
    * close pipe when exiting
    * predict requires IDLE or PROCESSING
    * try adding a BUSY state distinct from PROCESSING when we no longer have capacity
    * move resetting events to setup() instead of _read_events()
    
    previously this was in _read_events because it's a coroutine that will have the correct event loop. however, _read_events actually gets created in a task, which can run *after* the first mux.read call by setup. since setup is now the first async entrypoint in worker and in tests, we can safely move it there
    
    * state_from_predictions_in_flight instead of checking the value of semaphore
    * make prediction_ctx "private"
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Feb 21, 2024
    ce18964
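
The "race utility for racing awaitables" and "race with shutdown event" items could look roughly like the sketch below. This is an assumption about the general shape, built on `asyncio.wait`, and is not the code from the commit; `race`, `demo`, and `next_event` are names chosen for illustration.

```python
import asyncio
from typing import Any, Awaitable


async def race(*aws: Awaitable[Any]) -> Any:
    """Return the result of whichever awaitable finishes first; cancel the rest."""
    tasks = [asyncio.ensure_future(aw) for aw in aws]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Let the cancelled tasks finish unwinding before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()


async def demo() -> str:
    shutdown = asyncio.Event()

    async def next_event() -> str:
        await asyncio.sleep(0.01)
        return "event"

    # The shutdown event is never set here, so the event read wins.
    return await race(next_event(), shutdown.wait())


winner = asyncio.run(demo())
```

Racing a read against a shutdown event lets a reader loop wake up promptly on shutdown instead of blocking until the next event arrives.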
  8. Scary temporary commit for a hemorrhaging-edge release

    * add concurrency to config
    * this basically works!
    * more descriptive names for predict functions
    * maybe pass through prediction id and try to make cancelation do both?
    * don't cancel from signal handler if a loop is running. expose worker busy state to runner
    * move handle_event_stream to PredictionEventHandler
    * make setup and canceling work
    * drop some checks around cancelation
    * try out eager_predict_state_change
    * keep track of multiple runner prediction tasks to make idempotent endpoint return the same result and fix tests somewhat
    * fix idempotent tests
    * fix remaining errors?
    * worker predict_generator shouldn't be eager
    * wip: make the stuff that handles events and sends webhooks etc async
    * drop Runner._result
    * drop comments
    * inline client code
    * get started
    * inline webhooks
    * move clients into runner, switch to httpx, move create_event_handler into runner
    * add some comments
    * more notes
    * rip out webhooks and most of files and put them in a new ClientManager that handles most of everything. inline upload_files for that
    * move create_event_handler into PredictionEventHandler.__init__
    * fix one test
    * break out Path.validate into value_to_path and inline get_filename and File.validate
    * split out URLPath into BackwardsCompatibleDataURLTempFilePath and URLThatCanBeConvertedToPath with the download part of URLFile inlined
    * let's make DataURLTempFilePath also use convert and move value_to_path back to Path.validate
    * use httpx for downloading input urls and follow redirects
    * take get_filename back out for tests
    * don't upload in http and delete cog/files.py
    * drop should_cancel
    * prediction->request
    * split up predict/inner/prediction_ctx into enter_predict/exit_predict/prediction_ctx/inner_async_predict/predict/good_predict as one way to do it. however, exposing all of those for runner predict enter/coro exit still sucks, but this is still an improvement
    * bigish change: inline predict_and_handle_errors
    * inline make_error_handler into setup
    * move runner.setup into runner.Runner.setup
    * add concurrency to config in go
    * try explicitly using prediction_ctx __enter__ and __exit__
    * make runner setup more correct and marginally better
    * fix a few tests
    * notes
    * wip ClientManager.convert
    * relax setup argument requirement to str
    * glom worker into runner
    * add logging message
    * fix prediction retry and improve logging
    * split out handle_event
    * use CURL_CA_BUNDLE for file upload
    * clean up comments
    * dubious upload fix
    * small fixes
    * attempt to add context logging?
    * tweak names
    * fix error for predictionOutputType(multi=False)
    * improve comments
    * fix lints
    * skip worker and webhook tests since those were erroring on removed imports. fix or xfail runner tests
    * upload in http instead of PredictionEventHandler. this makes tests pass and fixes some problems with validation, but also prevents streaming files and causes new problems. also xfail all the responses tests that need to be replaced with respx
    * format
    * fix some new-style type signatures and drop 3.8 support
    * drop 3.7 in code
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Feb 21, 2024
    335f67b
  9. Revert "Scary temporary commit for a hemorrhaging-edge release"

    This reverts commit 335f67b.
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Feb 21, 2024
    1af5eda

Commits on Mar 29, 2024

  1. replace requests with httpx and factor out clients (#1574)

    * input downloads, output uploads, and webhooks are now handled by ClientManager, which persists for the lifetime of runner, allowing us to reuse connections, which may significantly help with large uploads.
    * although I was originally going to drop output_file_prefix, it's not actually hard to maintain. the behavior is changed now and objects are uploaded as soon as they're outputted rather than after the prediction is completed.
    * there's an ugly hack with uploading an empty body to get the redirect instead of making api time out from trying to upload a 140GB file. that can be fixed by implementing an MPU endpoint and/or a "fetch upload url" endpoint.
    * the behavior of the non-idempotent endpoint is changed; the id is now randomly generated if it's not provided in the body. this isn't strictly required for this change alone, but is hard to carve out.
    * the behavior of Path is changed significantly. see https://www.notion.so/replicate/Cog-Setup-Path-Problem-2fc41d40bcaf47579ccd8b2f4c71ee24
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    Co-authored-by: Mattt <mattt@replicate.com>
    technillogue and mattt authored Mar 29, 2024
    7ee96ba

Commits on Mar 30, 2024

  1. format

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed Mar 30, 2024
    9538673

Commits on May 7, 2024

  1. implement mp.Connection with async streams (#1640)

    * wip
    
    * some tweaks
    
    * ignore some type errors
    
    * test connection roundtrip
    
    * add notes from python source code
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored May 7, 2024
    bb01c85

Commits on May 8, 2024

  1. AsyncConcatenateIterator

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed May 8, 2024
    12b0abe
  2. optimize webhook serialization and logging (#1651)

    * optimize webhook serialization and logging
    * optimize logging by binding structlog proxies
    * fix tests
    
    ---------
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed May 8, 2024
    08f3780
  3. tweak names and style

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed May 8, 2024
    c62cf67

Commits on May 16, 2024

  1. omnibus actual concurrency and major refactor (#1530)

    * add concurrency to config
    
    * more descriptive names for predict functions
    
    * don't cancel from signal handler if a loop is running. expose worker busy state to runner
    
    * move handle_event_stream to PredictionEventHandler
    
    * make setup and canceling work
    
    * keep track of multiple runner prediction tasks to make idempotent endpoint return the same result and fix tests somewhat
    
    * drop Runner._result, comments
    
    * move create_event_handler into PredictionEventHandler.__init__
    
    * break out Path.validate into value_to_path and inline get_filename and File.validate
    
    * split out URLPath into BackwardsCompatibleDataURLTempFilePath and URLThatCanBeConvertedToPath with the download part of URLFile inlined
    
    * let's make DataURLTempFilePath also use convert and move value_to_path back to Path.validate
    
    * drop should_cancel
    
    * prediction->request
    
    * split up predict/inner/prediction_ctx into enter_predict/exit_predict/prediction_ctx/inner_async_predict/predict/good_predict as one way to do it. however, exposing all of those for runner predict enter/coro exit still sucks, but this is still an improvement
    
    * bigish change: inline predict_and_handle_errors
    
    * inline make_error_handler into setup
    
    * move runner.setup into runner.Runner.setup
    
    * add concurrency to config in go
    
    * try explicitly using prediction_ctx __enter__ and __exit__
    
    * relax setup argument requirement to str
    
    * glom worker into runner
    
    * add logging message
    
    * fix prediction retry and improve logging
    
    * split out handle_event
    
    * use CURL_CA_BUNDLE for file upload
    
    * dubious upload fix
    
    * skip worker and webhook tests since those were erroring on removed imports. fix or xfail runner tests
    
    * validate prediction response to raise errors, but return the unvalidated output to avoid converting urls to File/Path
    
    * expose concurrency in healthcheck
    
    * mediocre logging that works like print
    
    * COG_DISABLE_CANCEL to ignore cancelations
    
    * COG_CONCURRENCY_OVERRIDE
    
    * add ready probe as an http route
    
    * encode webhooks only after knowing they will be sent, and bail out of upload type checks early for strs
    
    * don't validate outputs
    
    * add AsyncConcatenateIterator
    
    * should_exit is not actually used by http
    
    * format
    
    * codecov
    
    * describe the remaining problems with this PR and add comments about cancelation and validation
    
    * add a test
    
    ---------
    Signed-off-by: technillogue <technillogue@gmail.com>
    Co-authored-by: Mattt <mattt@replicate.com>
    technillogue authored May 16, 2024
    0ebfc54

Commits on May 17, 2024

  1. fix test (#1669)

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored May 17, 2024
    fb498be
  2. predict_time_share metric (#1643)

    predict_time_share tracks the portion of the worker's processing time that was dedicated to each individual prediction. the "cost" of each second is split across the predictions running during that second.
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    Co-authored-by: Zeke Sikelianos <zeke@sikelianos.com>
    technillogue and zeke authored May 17, 2024
    f44b67f
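
As a worked example of the time-share idea described above, the sketch below splits each slice of wall-clock time evenly across the predictions that were running during it. The function name and the interval-based input format are assumptions made for illustration; the actual metric in cog may be computed incrementally and differently.

```python
def predict_time_shares(intervals: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Split each slice of wall-clock time evenly across the predictions active in it."""
    # Every start or end time is a boundary where the active set may change.
    boundaries = sorted({t for start, end in intervals.values() for t in (start, end)})
    shares = {pid: 0.0 for pid in intervals}
    for left, right in zip(boundaries, boundaries[1:]):
        active = [
            pid
            for pid, (start, end) in intervals.items()
            if start <= left and right <= end
        ]
        for pid in active:
            shares[pid] += (right - left) / len(active)
    return shares


# "a" runs from t=0 to t=4 and "b" from t=2 to t=6, overlapping for 2 seconds.
shares = predict_time_shares({"a": (0.0, 4.0), "b": (2.0, 6.0)})
```

Here each prediction gets 2 seconds on its own plus half of the 2 overlapping seconds, so both shares come to 3.0 and together they add up to the 6 seconds the worker was busy.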
  3. function to emit metrics (#1649)

    * function to emit metrics
    * add metrics docs
    
    ---------
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored May 17, 2024
    b35d1f7
  4. allow setting both max and target concurrency in cog.yaml (#1672)

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored May 17, 2024
    e32f5cc

Commits on May 22, 2024

  1. predict_time_share needs to be set before sending the completed webhook (#1683)
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored May 22, 2024
    1728da7
  2. allow disabling time share metric with COG_DISABLE_TIME_SHARE_METRIC

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue committed May 22, 2024
    07eabfd
  3. drop default_target (#1685)

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored May 22, 2024
    0731446

Commits on May 23, 2024

  1. Set VSCode default formatter to charliermarsh.ruff (#1547)

    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    mattt committed May 23, 2024
    4e48c28
  2. Update VSCode workspace settings (#1545)

    * Update code actions on save
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    * Update setting key to ruff.lint.args
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    * Remove python.linting settings
    
    All settings starting with "python.linting." are deprecated and can be removed from settings.
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    * Use vscode.json-language-features to format JSON
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    * Remove prettier and black from recommended extensions
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    ---------
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    mattt committed May 23, 2024
    fe0d2db

Commits on May 25, 2024

  1. fix config schema

    technillogue committed May 25, 2024
    e07dc07

Commits on May 31, 2024

  1. Backport Secret type to async branch (#1706)

    * Define Secret type (#1546)
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    * Fix linter errors (#1691)
    
    * ruff --fix
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    * Update ruff pyproject settings
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    * Update ruff lint command in Makefile
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    ---------
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    * Run ruff --fix python
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    
    ---------
    
    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    mattt authored May 31, 2024
    8dc4890
  2. stick a %s on line 190 clients.py (#1707)

    Signed-off-by: Mattt Zmuda <mattt@replicate.com>
    mattt authored May 31, 2024
    c5ad13b

Commits on Jun 4, 2024

  1. local upload server can be called cluster.local in addition to .internal (#1714)
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored Jun 4, 2024
    6b6ac5f

Commits on Jun 12, 2024

  1. log traceback properly (#1734)

    * log traceback correctly 
    * use repr(exception) instead of str(exception) if str(exception) is blank
    
    ---------
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored Jun 12, 2024
    42c6617
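
The `repr(exception)` fallback matters because many exceptions raised without arguments stringify to an empty message. A minimal sketch of the idea follows; `describe_exception` is a hypothetical helper name, not a function from cog.

```python
def describe_exception(exc: BaseException) -> str:
    """Return str(exc), falling back to repr(exc) when the message is blank."""
    text = str(exc)
    return text if text.strip() else repr(exc)


blank = describe_exception(ValueError())
with_message = describe_exception(ValueError("bad input"))
```

With the fallback, an argument-less `ValueError()` is reported as `ValueError()` rather than as an empty string, so the error type is still visible in logs.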

Commits on Jun 18, 2024

  1. add batch size metric (#1750)

    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored Jun 18, 2024
    adf8894

Commits on Jul 2, 2024

  1. Revert "should_exit is not actually used by http". It was being used by uvicorn and I hadn't noticed
    
    This reverts commit 3ca8aec.
    technillogue committed Jul 2, 2024
    b93f233
  2. fix ruff check

    technillogue committed Jul 2, 2024
    81187f4

Commits on Jul 3, 2024

  1. Poison model healthcheck on shutdown

    We have a problem in production where a broken model is not correctly
    shutting down when requested, which means that director comes back up,
    sees a healthy model (status READY/BUSY) and starts sending it new
    predictions, even though it's supposed to be shutting down.
    
    For now, try and improve the situation by poisoning the model
    healthcheck on shutdown. This doesn't solve the underlying problem but
    it should stop us losing more predictions to a known-broken pod.
    nickstenning committed Jul 3, 2024
    a4b86cd

Commits on Jul 11, 2024

  1. Fix broken make go-test command

    This was due to conflicts in the dependencies
    
      === Errors
      Error: ../../../go/pkg/mod/github.com/anaskhan96/soup@v1.2.5/soup.go:20:2: missing go.sum entry for module providing package golang.org/x/net/html (imported by github.com/anaskhan96/soup); to add:
      	go get github.com/anaskhan96/soup@v1.2.5
      Error: ../../../go/pkg/mod/github.com/anaskhan96/soup@v1.2.5/soup.go:21:2: missing go.sum entry for module providing package golang.org/x/net/html/charset (imported by github.com/anaskhan96/soup); to add:
      	go get github.com/anaskhan96/soup@v1.2.5
    
    Running `go mod tidy` fixes the issue and this commit contains the
    updated go.mod and go.sum files.
    aron committed Jul 11, 2024
    99a26d4

Commits on Jul 12, 2024

  1. 8d834f0
  2. Propagate trace context to webhook and upload requests

    Based on the implementation in #1698 for sync cog.
    
    If the request to /predict contains headers `traceparent` and
    `tracestate` defined by w3c Trace Context[^1] then these headers are
    forwarded on to the webhook and upload calls.
    
    This allows observability systems to link requests passing through cog.
    
    [^1]: https://www.w3.org/TR/trace-context/
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    aron authored and technillogue committed Jul 12, 2024
    5d38ae7
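
The header forwarding described above can be sketched as a small filter over the incoming request headers. The helper name `extract_trace_context` and the plain-dict representation of headers are assumptions for illustration; only the `traceparent` and `tracestate` header names come from the commit message and the W3C specification.

```python
from typing import Mapping

TRACE_HEADERS = ("traceparent", "tracestate")


def extract_trace_context(request_headers: Mapping[str, str]) -> dict[str, str]:
    """Pick out the W3C Trace Context headers, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return {name: lowered[name] for name in TRACE_HEADERS if name in lowered}


incoming = {
    "Content-Type": "application/json",
    "Traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "tracestate": "vendor=value",
}
# Headers for an outgoing webhook or upload request, with trace context attached.
outgoing = {"Content-Type": "application/json", **extract_trace_context(incoming)}
```

Requests that arrive without either header simply produce an empty mapping, so nothing extra is sent downstream.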

Commits on Jul 17, 2024

  1. [async] Include prediction id upload request (#1788)

    * Cast TraceContext into Mapping[str, str] to fix linter
    
    * Include prediction id upload request
    
    Based on #1667
    
    This PR introduces two small changes to the file upload interface.
    
    1. We now allow downstream services to include the destination of the
    asset in a `Location` header, rather than assuming that it's the same as
    the final upload url (either the one passed via `--upload-url` or the
    result of a 307 redirect response).
    
    2. We now include the `X-Prediction-Id` header in the upload request. This
    allows the downstream client to potentially do configuration/routing
    based on the prediction ID. This ID should be considered unsafe and
    needs to be validated by the downstream service.
    
    * Extract ChunkFileReader into top-level class
    
    ---------
    
    Co-authored-by: Mattt Zmuda <mattt@replicate.com>
    aron and mattt authored Jul 17, 2024
    b9df82a
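
The two upload changes described above can be sketched as a pair of small helpers. Both function names are hypothetical, and the sketch assumes headers are available as a plain mapping and that any `Location` value is an absolute URL; it is not the code from this commit.

```python
from typing import Mapping


def upload_headers(prediction_id: str, content_type: str) -> dict[str, str]:
    """Build request headers for an upload, tagging it with the prediction ID."""
    return {"Content-Type": content_type, "X-Prediction-Id": prediction_id}


def final_asset_url(upload_url: str, response_headers: Mapping[str, str]) -> str:
    """Prefer the Location header when the upload service provides one."""
    lowered = {name.lower(): value for name, value in response_headers.items()}
    return lowered.get("location", upload_url)


headers = upload_headers("abc123", "image/png")
from_location = final_asset_url(
    "https://upload.example/put/out.png",
    {"Location": "https://cdn.example/out.png"},
)
fallback = final_asset_url("https://upload.example/put/out.png", {})
```

When the service omits `Location`, the helper falls back to the upload URL, which matches the previous behavior the commit message describes.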

Commits on Jul 22, 2024

  1. patch cancel bug: immediately mark predictions as cancelled (#1798)

    * unconditionally mark predictions as cancelled without waiting for cancellation to succeed
    
    while I struggle with #1786, we see queue spikes because not responding to cancellation promptly causes the pod to get restarted. this is a dirty hack to pretend like cancellation works immediately. as soon as we fix the race condition (and possibly any issues with task.cancel() behaving differently from signal handlers), we can drop this.
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    
    * make Mux.write a sync method
    
    this makes the cancel patch cleaner, but might be a small speedup for high-throughput outputs
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    
    * gate this behind a feature flag
    
    ---------
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored Jul 22, 2024
    5464c43

Commits on Jul 23, 2024

  1. syl/fix setup shutdown bug (#1819)

    * start with just changing Exception to BaseException to catch cancellation
    
    * add much more shutdown logging
    technillogue authored Jul 23, 2024
    9f49b29
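
The switch from `Exception` to `BaseException` matters because `asyncio.CancelledError` has been a subclass of `BaseException` rather than `Exception` since Python 3.8. The sketch below shows the difference with a hypothetical stand-in for a long-running setup coroutine; the function names are invented for illustration.

```python
import asyncio


async def setup_with_handler(handler_type: type) -> str:
    try:
        await asyncio.sleep(10)  # stand-in for slow setup work
        return "finished"
    except handler_type:
        return "handler saw the cancellation"


async def cancel_midway(handler_type: type) -> str:
    task = asyncio.ensure_future(setup_with_handler(handler_type))
    await asyncio.sleep(0)  # let the task start and reach its await
    task.cancel()
    try:
        return await task
    except asyncio.CancelledError:
        return "cancellation escaped the handler"


with_exception = asyncio.run(cancel_midway(Exception))
with_base_exception = asyncio.run(cancel_midway(BaseException))
```

Only the `BaseException` handler observes the cancellation. Swallowing it as this sketch does is usually unwise outside a demonstration, since the task then no longer reports itself as cancelled.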

Commits on Jul 26, 2024

  1. acbb283

Commits on Aug 2, 2024

  1. move runner.terminate into runner.shutdown after waiting for predictions to complete (#1843)
    
    
    * move runner.terminate into shutdown, make it async, and document the behavior of Server.stop, should_exit, force_exit, and app shutdown handler
    * remove BaseException handlers or re-raise
    * fix tests
    
    ---------
    
    Signed-off-by: technillogue <technillogue@gmail.com>
    technillogue authored Aug 2, 2024
    0976a05

Commits on Oct 16, 2024

  1. [async] Support URLFile in the upload_file function (#1987)

    We would like `predict` functions to be able to return a remote URL rather than a local file on disk and have it behave like a file object. When it is passed to the file uploader, the file is streamed from the remote URL to the destination provided.
    
    ```py
    from cog import BasePredictor, File
    from cog.types import URLFile

    class Predictor(BasePredictor):
        def predict(self, **kwargs) -> File:
            return URLFile("https://replicate.delivery/czjl/9MBNrffKcxoqY0iprW66NF8MZaNeH322a27yE0sjFGtKMXLnA/hello.webp")
    ```
    
    This PR modifies the URLFile class to use `urllib.request.urlopen` instead of `requests` as the HTTP client for reading the file data.
    
    The `urllib.response.addinfourl` interface conforms to the one used by `urllib3.response.HTTPResponse`, so I don't think we have any gaps here. Longer term we probably want to figure out a more reliable shim, but as this branch is mostly used for language models we're probably okay.
    aron authored Oct 16, 2024
    b29c93c
  2. Fix type annotations in Input

    aron committed Oct 16, 2024
    3edefe4
  3. 05a13bf

Commits on Oct 17, 2024

  1. Support custom filename to be provided to URLFile (#1997)

    This works around an issue where the basename of the URL may not actually
    contain a file extension and the uploader logic cannot infer the mime
    type for the file.
    aron authored Oct 17, 2024
    3805e2e
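
To show why an explicit filename helps, the sketch below guesses a content type from the URL's basename and lets a caller-supplied filename take precedence. `guess_content_type` is a hypothetical helper and the URLs are placeholders; the real uploader logic may differ.

```python
import mimetypes
import posixpath
from typing import Optional
from urllib.parse import urlparse


def guess_content_type(url: str, filename: Optional[str] = None) -> str:
    """Infer a mime type from the filename if given, else from the URL basename."""
    name = filename or posixpath.basename(urlparse(url).path)
    guessed, _ = mimetypes.guess_type(name)
    return guessed or "application/octet-stream"


# The basename has no extension, so nothing can be inferred from the URL alone.
without_name = guess_content_type("https://example.com/files/9MBNrffKcxoq")
with_name = guess_content_type(
    "https://example.com/files/9MBNrffKcxoq", filename="hello.png"
)
```

The first call falls back to the generic `application/octet-stream`, while the second resolves to `image/png` because the supplied filename carries an extension.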

Commits on Oct 18, 2024

  1. Add support for image/webp to mimetypes package (#2002)

    This has only recently been introduced in Python 3.13.0 and is currently
    inconsistently implemented across different platforms. Confusingly webp
    is supported in local development on macOS but not when building the
    docker image of a cog model. This is either because it's not defined in
    the system mime.types file of the Linux image or because a dev
    dependency is manually adding it. I've not done the work to fully
    understand which.
    
    This commit introduces a function called in the init script for the cog
    package that patches the global mimetypes registry to understand the
    .webp extension and image/webp mime type. This will be a no-op on
    systems that already understand the type.
    
    This fixes a bug whereby files with the .webp extension are uploaded to
    the --upload-url with the incorrect application/octet-stream header.
    aron authored Oct 18, 2024
    12d5a4c
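
A patch of this kind can be sketched with the standard library's `mimetypes.add_type`. The function name `ensure_webp_registered` is an assumption for illustration, and the commit's actual function may be written differently.

```python
import mimetypes


def ensure_webp_registered() -> None:
    """Register image/webp if the platform's mime registry does not know it."""
    if mimetypes.guess_type("example.webp")[0] is None:
        mimetypes.add_type("image/webp", ".webp")


ensure_webp_registered()
guessed, _ = mimetypes.guess_type("hello.webp")
```

On systems whose registry already maps `.webp`, the check short-circuits and the call changes nothing, which is the no-op behavior the commit message describes.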
  2. Add .envrc for asdf

    This matches many of our other repositories and will ensure people
    who've bought into that environment will be using the right golang.
    nickstenning committed Oct 18, 2024
    a60d151
  3. b710b33
  4. Add dotenv_if_exists to .envrc

    This allows developers to set local environment variables in a .env file
    if they wish, without needing to check that in or gitignore it.
    nickstenning committed Oct 18, 2024
    2dcdf64
  5. Propagate the name attribute of URLFile across serializers (#2000)

    The commit 3805e2e introduced the `filename` keyword argument to the `URLFile` constructor. However, we do not correctly propagate that value when the instance is pickled, so the `URLFile` that is passed to the upload handler is missing that attribute.
    
    This PR updates the code to stash the `name` when pickling and extract it again when unpickling. The `__getattr__` function then supports returning the underlying `name` value rather than proxying to the underlying request object.
    
    I also ran into a small bug whereby the `__del__` method was triggering a network request, because some private attributes accessed during teardown would trigger the `__wrapper__` code. I've overridden the super class to disable this, though I'm unclear whether this is just the test suite doing this cleanup.
    aron authored Oct 18, 2024
    9cb5f35