Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tweak lifespan spec to require propagating exceptions thrown from lifespan tasks after 'lifespan.startup.complete' #276

Open
graingert opened this issue Jun 30, 2021 · 7 comments

Comments

@graingert
Copy link
Contributor

graingert commented Jun 30, 2021

currently in quart-trio a failure in the app.nursery is swallowed and the web server continues unabated

https://gitlab.com/pgjones/quart-trio/-/issues/21

once an ASGI app acknowledges a lifespan request (with a lifespan.startup.complete event) the server should treat failures in that task as fatal.

Note: Now that Starlette supports trio, this is currently making writing a Starlette TaskGroup middleware a challenge

@graingert
Copy link
Contributor Author

graingert commented Jun 30, 2021

some other ideas:

A new event

Perhaps there should be an additional event - 'lifespan.startup.acknowledge' which can be sent at any time that signals acceptance of the lifespan scope and that any failures are fatal

A sentinel BaseException

or there could be a:

class LifespanFailure(BaseException):
   pass

and lifespan tasks could:

try:
    with anyio.create_task_group() as app.tg:
        ...
except Exception as e:
    raise asgiref.LifespanFailure from e

New asgi servers could extract the .__cause__ attribute of the LifespanFailure and raise that instead.
this would be backwards compatible because current asgi servers propagate BaseExceptions already

@andrewgodwin
Copy link
Member

I do agree something needs to happen here - it's important to have explicit failure - but not a huge fan of the acknowledge event model, that just seems a little much, along with the weird exception handling, so propagating exceptions after complete seems reasonable?

@graingert
Copy link
Contributor Author

graingert commented Jun 30, 2021

I'd like to be able to propagate exceptions before startup complete too, but that's a bigger spec change. Eg if my task is cancelled I won't be able to send "lifespan.startup.failed"

@andrewgodwin
Copy link
Member

So the specific behaviour you would want to change is that the ASGI server swallows any exceptions during startup:

If an exception is raised when calling the application callable with a lifespan.startup message or a scope with type lifespan, the server must continue but not send any lifespan events.

To have a specific exception that does stop the server? I think we can very reasonably support that in a backwards-compatible way, then.

@graingert
Copy link
Contributor Author

graingert commented Jul 19, 2021

starlette 0.16.0 is now slightly violating the asgi spec here: https://github.com/encode/starlette/blob/e45c5793611bfec606cd9c5d56887ddc67f77c38/starlette/routing.py#L628 and will send "lifespan.shutdown.complete" or "lifespan.shutdown.failed" regardless of how await receive() completes.

Eg if await receive() is completed with any value starlette will reply with "lifespan.shutdown.complete".

importantly: if await receive() throws a CancelledError - if that cancellation was initiated by a lifespan_context TaskGroup or CancelScope starlette will reply with "lifespan.shutdown.complete" even through no shutdown has actually been initiated.
additionally if that cancellation was initiated by an exception being thrown into a lifespan_context TaskGroup, starlette propagates it with "lifespan.shutdown.failed"

@graingert
Copy link
Contributor Author

here's an issue where this spec violation bites:

https://gitlab.com/pgjones/hypercorn/-/merge_requests/56/diffs

@graingert graingert changed the title tweak lifespan spec to require propagting exceptions thrown from lifespan tasks after 'lifespan.startup.complete' tweak lifespan spec to require propagating exceptions thrown from lifespan tasks after 'lifespan.startup.complete' Jul 27, 2021
@peterschutt
Copy link

I've been looking into this issue recently for Litestar and it would be great to continue the conversation.

Like others, we have a lifespan context manager that is entered on "lifespan.startup", and exited on "lifespan.shutdown". User's can do anything within this context, including spinning up asynchronous tasks. We even document it for that purpose (ref):

This can be useful when dealing with long running tasks

and starlette seems to promote it for similar (ref):

Consider using anyio.create_task_group() for managing asynchronous tasks.

If an exception is raised within that context but after we have sent "lifespan.startup.complete", should we raise it, emit "lifespan.shutdown.failed", or catch it, log it and keep the lifespan task running? We currently send "lifespan.shutdown.failed" and then re-raise the exception, however servers continue to serve after this (tested this against uvicorn, granian, hypercorn).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants