Enable graceful shutdown when running multiple workers and sending a SIGTERM #853

euri10 · 2020-11-18T08:07:12Z

Fixes #852

This is quite a large change to review, so I'm happy to explain anything that isn't easy to follow.

As @florimondmanca noticed, it takes inspiration from hypercorn: shutdown across multiple processes is handled through an external multiprocessing.Event that controls an otherwise infinite loop. The loop breaks when that event is set.
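
A minimal sketch of that pattern, assuming a parent process that owns the event and workers that only watch it. This is not the code in this PR: the names serve_until_shutdown, worker and supervise are hypothetical, and the workers here simply idle instead of serving requests.

```python
import multiprocessing
import signal


def serve_until_shutdown(shutdown_event, poll_interval=0.1):
    """Loop until the shared event is set; return the number of idle ticks."""
    ticks = 0
    # Event.wait() returns False on timeout and True once the event is set.
    while not shutdown_event.wait(poll_interval):
        ticks += 1  # a real worker would be serving requests here
    return ticks


def worker(shutdown_event):
    # Assumption of this sketch: only the parent reacts to signals,
    # and workers rely solely on the shared event to know when to stop.
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    serve_until_shutdown(shutdown_event)


def supervise(num_workers=2):
    """Hypothetical parent: start workers, then wait for them to exit."""
    shutdown_event = multiprocessing.Event()

    def handle_signal(signum, frame):
        # Setting the event breaks the loop in every worker.
        shutdown_event.set()

    signal.signal(signal.SIGINT, handle_signal)
    signal.signal(signal.SIGTERM, handle_signal)

    processes = [
        multiprocessing.Process(target=worker, args=(shutdown_event,))
        for _ in range(num_workers)
    ]
    for process in processes:
        process.start()
    # join() returns once each worker has left its loop and exited.
    for process in processes:
        process.join()
```

Calling supervise() under an `if __name__ == "__main__":` guard and then sending SIGTERM to the parent should let every worker leave its loop and exit. The parent deliberately never waits on the event itself, since setting a multiprocessing.Event from a signal handler while the same thread is blocked in its wait() can deadlock.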

Where it deviates from the hypercorn implementation is the reloader case: I took the view that --reload is a particular case of --workers in which a restart method is invoked when a file is modified.

If necessary I can comment on the diff with the main takeaways, let me know. It took me quite a while to get it right, and hopefully it's working; I hope there are no leftovers from previous attempts.

One pretty big caveat, though it may be workable: it removes support for Python 3.6. I'm using server.serve_forever(), which appears in 3.7, and I couldn't find a way to handle its absence in 3.6 (there is another case where it was quite easy to adapt for 3.6). Update: not using server.serve_forever in fact.

Log of a gentle SIGTERM:

/home/lotso/PycharmProjects/uvicorn/venv/bin/python -m uvicorn apps.app:app --workers 2 --log-level=debug
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started parent process [16790]
DEBUG:    run args:() kwargs:{'config': <uvicorn.config.Config object at 0x7fcff87073a0>, 'shutdown_event': <multiprocessing.synchronize.Event object at 0x7fcff78028e0>}
DEBUG:    setting multiprocess trigger using : <multiprocessing.synchronize.Event object at 0x7fcff78028e0>
DEBUG:    run args:() kwargs:{'config': <uvicorn.config.Config object at 0x7fcff87073a0>, 'shutdown_event': <multiprocessing.synchronize.Event object at 0x7fcff78028e0>}
DEBUG:    setting multiprocess trigger using : <multiprocessing.synchronize.Event object at 0x7fcff78028e0>
INFO:     Started server process [16791]
INFO:     Waiting for application startup.
INFO:     Started server process [16792]
INFO:     Waiting for application startup.
INFO:     ASGI 'lifespan' protocol appears unsupported.
INFO:     Application startup complete.
INFO:     ASGI 'lifespan' protocol appears unsupported.
INFO:     Application startup complete.
INFO:     going to await shutdown
INFO:     going to await shutdown
DEBUG:    MultiServer received: 15
DEBUG:    multiprocessing event set
INFO:     will raise shutdown
DEBUG:    multiprocessing event set
INFO:     will raise shutdown
DEBUG:    raised shutdown exc: 
INFO:     Shutting down
DEBUG:    raised shutdown exc: 
INFO:     Shutting down
INFO:     Finished server process [16791]
INFO:     Finished server process [16792]
INFO:     Stopping parent process [16790]

Process finished with exit code 0

euri10 · 2020-12-30T20:10:19Z

I rebased this against master with new tests and adapted the run_server context manager to cope with how signals are handled in this version.
I tested gunicorn, uvicorn reload on both flavors, and multiple uvicorn workers: SIGTERM and SIGINT gracefully shut down the server and all its workers, with no more hanging.

I'm quite happy with it, except for the 3.6 drop, but maybe we can keep it for after the handler change, since I dropped 3.6 because I'm using server.serve_forever()

In any case, after or before, the logic at hand won't change much, and it's a pretty neat addition.
Having a clean shutdown is mostly useful for orchestrators: killing pods will likely be much faster with this.

florimondmanca · 2020-12-30T22:12:06Z

@euri10 This is looking interesting, but I must say that's quite a lot of code and changes to go through. It seems you're kind of doing the same trick (using a multiprocessing event) three times. Would there be any chance you could start with only one bit, say hot reload (or whatever is easiest to add first), so we can see more easily what the various pieces are? Just asking, if it's all interlinked then okay, I can try and take the time to sit down and go through this 😄 but if there are ways to scale things down in increments, that would be interesting too...

florimondmanca · 2020-12-31T11:43:50Z

@euri10 Taking a closer look at this PR, I view the introduction of self.tasks and other server-coupled state as a rather bad smell when considering support for other async libraries.

I have a feeling a prerequisite for this would be to start moving asyncio-specific pieces out of Server, and switch Server slightly so that it only exposes an await serve() coroutine that does the startup/shutdown behavior, rather than exposing .startup() and .shutdown() methods. This allows managing serve-local state much more easily (with context managers). I'm basing all this on what I found while working on #863, and I think a lot of what's there could be used for inspiration.
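
To illustrate the shape being suggested, here is a rough sketch under stated assumptions, not uvicorn's actual Server: the class below, its events list, the _lifespan helper and the should_exit parameter are all hypothetical, and asyncio is used only to keep the example short.

```python
import asyncio
import contextlib


class Server:
    """Hypothetical server exposing one serve() coroutine."""

    def __init__(self):
        # Records what happened, purely for illustration.
        self.events = []

    @contextlib.asynccontextmanager
    async def _lifespan(self):
        # Startup and shutdown live next to each other, so state created
        # at startup stays local to this context manager.
        self.events.append("startup")
        try:
            yield
        finally:
            self.events.append("shutdown")

    async def serve(self, should_exit):
        async with self._lifespan():
            self.events.append("serving")
            # Serve until the caller asks us to stop.
            await should_exit.wait()


async def main():
    server = Server()
    should_exit = asyncio.Event()
    task = asyncio.create_task(server.serve(should_exit))
    await asyncio.sleep(0)  # give serve() a chance to start
    should_exit.set()
    await task
    return server.events
```

With this shape the caller never invokes startup or shutdown directly; cancelling or finishing serve() is enough to run the shutdown path, because it sits in the context manager's finally block.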

euri10 · 2021-01-01T08:44:13Z

Ok, will wait then. Not sure how to 🍕 slice it right now, but eventually this will come.

gnat · 2021-01-11T01:39:06Z

I've also given the changes a review, it looks good to me. Also it's a net addition of only 30 lines.

> I tested gunicorn, uvicorn reload on both flavors, multiple uvicorn workers, sigterm and sigint gracefully shutdown the server and all its workers, no more hanging.

This is a major win for uvicorn infrastructure and solves a serious pain point.

> I'm quite happy with it, except for the 3.6 drop, but maybe we can keep it for after the handler change, since I dropped 3.6 because I'm using server.serve_forever()

This should NOT be a concern because:

Python async is so new that depriving ourselves of newer versions can have serious ramifications on the progress of our entire ecosystem.
Those concerned about graceful shutdown are the serious users of Uvicorn.
Those honestly stuck on 3.6 in production won't be upgrading their other dependencies anyway. And it is highly doubtful they are running uvicorn, given the problem this patch solves in the first place.

Thank you @euri10 for the great work here and @florimondmanca for the feedback!!

We should do what we can to move this feature forward.

deltarod · 2021-08-11T20:16:53Z

Is there an ETA on this merge?

danrossi · 2021-12-22T03:40:13Z

I am also suffering from this same problem. I cannot exit properly with SIGTERM if workers are used on Windows. Is there a fix?

temoto · 2022-02-21T11:46:52Z

@euri10 What was wrong? Is a better patch coming?

euri10 changed the title ~~All graceful shutdown when running multiple workers and sending a SIGTERM~~ Enable graceful shutdown when running multiple workers and sending a SIGTERM Nov 18, 2020

euri10 mentioned this pull request Nov 23, 2020

Trio support. #169

Closed

florimondmanca reviewed Nov 23, 2020

tests/middleware/test_trace_logging.py (outdated)

euri10 marked this pull request as ready for review December 17, 2020 14:23

euri10 requested a review from florimondmanca December 17, 2020 14:23

euri10 mentioned this pull request Dec 20, 2020

Add --reload flag #98

Closed

euri10 mentioned this pull request Dec 28, 2020

Add check to multiprocess supervisor #641

Closed

Rebase

93d7596

euri10 force-pushed the gentle_sigterm branch from bb4234f to 93d7596 Compare December 30, 2020 17:13

euri10 added 7 commits December 30, 2020 18:15

Reduce useless changes

0a6b64a

Merge branch 'master' into gentle_sigterm

7ab81f8

No more CustomServer since we use a context manager.

81caf55

Add a signal event to the run_server context manager

Reduce diff post merge

c8045b4

Cleant init

565a5d0

Tend to add lot of stuff trying things, just cleaning

f107934

Leftovers again

84e0a3a

florimondmanca mentioned this pull request Dec 31, 2020

Isolate server started message #930

Merged

euri10 mentioned this pull request Jan 10, 2021

Sending SIGTERM to parent process when running with --workers hangs indefinitely #852

Closed

2 tasks

Merge branch 'master' into gentle_sigterm

8b1651b

euri10 mentioned this pull request Jan 12, 2021

Gunicorn+UvicornWorker,access log not working after logrotate access log file. #896

Closed

euri10 added 4 commits January 12, 2021 17:21

Cleaner messages

31ebb6b

Move tasks in main loop

7fa1771

No serve_forever >> 3.6 available

e8adab9

Reduce diff

39b7ad4

euri10 mentioned this pull request Jan 13, 2021

Uvicorn with reload hangs when using a ProcessPoolExecutor #936

Closed

2 tasks

zhfkt mentioned this pull request Mar 11, 2021

How to gracefully stop FastAPI app ? fastapi/fastapi#2928

Closed

raphaelauv mentioned this pull request May 8, 2021

How to properly close a background child process in a shutdown event? fastapi/fastapi#2025

Closed

9 tasks

erl987 mentioned this pull request Oct 9, 2021

Uvicorn not gracefully shutting down in Docker container #1215

Closed

2 tasks

euri10 closed this Feb 21, 2022
