core/uwsgi: graceful stop worker when max_requests/reload_on_*
A worker stops when it reaches max_requests or a reload_on_* limit.

https://github.com/unbit/uwsgi/blob/39f3ade88c88693f643e70ecf6c36f9b375f00a2/core/utils.c#L1216-L1251

`goodbye_cruel_world()` is not graceful: it causes `atexit` handlers not to be called.
If an atexit handler is what stops daemon threads, the worker won't stop until the master kills it.
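
As an analogy only, the difference between a graceful and an abrupt exit can be reproduced in plain Python. The sketch below is not uWSGI code: `os._exit()` is used here as a stand-in for any exit path that bypasses interpreter shutdown, and `run_child` is a hypothetical helper name.

```python
import subprocess
import sys
import textwrap

# Child program: registers an atexit handler, then exits in the requested way.
CHILD = textwrap.dedent("""
    import atexit, os, sys

    atexit.register(lambda: print("on_exit called", flush=True))

    if sys.argv[1] == "abrupt":
        os._exit(0)   # ends the process immediately, skipping atexit handlers
    sys.exit(0)       # normal interpreter shutdown runs atexit handlers
""")


def run_child(mode):
    """Run the child in the given mode and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("graceful:", repr(run_child("graceful")))
    print("abrupt:  ", repr(run_child("abrupt")))
```

Only the graceful run prints the handler's message; the abrupt run exits with the same status code but leaves the handler uncalled.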

Using a reproducer similar to tests/threads_atexit.py:

*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 93920)
spawned uWSGI worker 1 (pid: 93921, cores: 80)
...The work of process 93921 is done (max requests reached (641 >= 20)). Seeya!
worker 1 killed successfully (pid: 93921)
Respawned uWSGI worker 1 (new pid: 94019)
...The work of process 94019 is done (max requests reached (721 >= 20)). Seeya!
worker 1 killed successfully (pid: 94019)
Respawned uWSGI worker 1 (new pid: 94099)
...The work of process 94099 is done (max requests reached (721 >= 20)). Seeya!
worker 1 killed successfully (pid: 94099)
Respawned uWSGI worker 1 (new pid: 94179)
...The work of process 94179 is done (max requests reached (721 >= 20)). Seeya!
worker 1 killed successfully (pid: 94179)
Respawned uWSGI worker 1 (new pid: 94260)
...The work of process 94260 is done (max requests reached (721 >= 20)). Seeya!
worker 1 killed successfully (pid: 94260)
Respawned uWSGI worker 1 (new pid: 94340)

In the run above (before this change), atexit is not called.

*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 94781)
spawned uWSGI worker 1 (pid: 94782, cores: 80)
...The work of process 94782 is done (max requests reached (402 >= 20)). Seeya!
on_exit: uwsgi.worker_id()=1
worker 1 killed successfully (pid: 94782)
Respawned uWSGI worker 1 (new pid: 94880)
...The work of process 94880 is done (max requests reached (721 >= 20)). Seeya!
on_exit: uwsgi.worker_id()=1
worker 1 killed successfully (pid: 94880)
Respawned uWSGI worker 1 (new pid: 94960)
...The work of process 94960 is done (max requests reached (721 >= 20)). Seeya!
on_exit: uwsgi.worker_id()=1
worker 1 killed successfully (pid: 94960)
Respawned uWSGI worker 1 (new pid: 95040)
...The work of process 95040 is done (max requests reached (721 >= 20)). Seeya!
on_exit: uwsgi.worker_id()=1
worker 1 killed successfully (pid: 95040)
Respawned uWSGI worker 1 (new pid: 95120)
...The work of process 95120 is done (max requests reached (721 >= 20)). Seeya!
on_exit: uwsgi.worker_id()=1
worker 1 killed successfully (pid: 95120)
Respawned uWSGI worker 1 (new pid: 95200)

In the run above (with this change), atexit is called: `on_exit` is logged each time the worker exits.

Related issue:

open-telemetry/opentelemetry-python#3640
methane authored and xrmx committed Apr 7, 2024
1 parent 8a47ea9 commit 06a2259
Showing 4 changed files with 91 additions and 8 deletions.
3 changes: 3 additions & 0 deletions core/loop.c
@@ -60,6 +60,9 @@ void *uwsgi_get_loop(char *name) {

 void simple_loop() {
     uwsgi_loop_cores_run(simple_loop_run);
+    // Other threads may still run. Make sure they will stop.
+    uwsgi.workers[uwsgi.mywid].manage_next_request = 0;
+
     if (uwsgi.workers[uwsgi.mywid].shutdown_sockets)
         uwsgi_shutdown_all_sockets();
 }
23 changes: 15 additions & 8 deletions core/uwsgi.c
@@ -1224,6 +1224,10 @@ void wait_for_threads() {
             if (ret) {
                 uwsgi_log("pthread_join() = %d\n", ret);
             }
+            else {
+                // uwsgi_worker_is_busy() should not consider this thread as busy.
+                uwsgi.workers[uwsgi.mywid].cores[i].in_request = 0;
+            }
         }
     }

@@ -1287,15 +1291,13 @@ void end_me(int signum) {
     exit(UWSGI_END_CODE);
 }

-void simple_goodbye_cruel_world() {
-
-    if (uwsgi.threads > 1 && !uwsgi_instance_is_dying) {
-        wait_for_threads();
-    }
-
+static void simple_goodbye_cruel_world() {
+    int prev = uwsgi.workers[uwsgi.mywid].manage_next_request;
     uwsgi.workers[uwsgi.mywid].manage_next_request = 0;
-    uwsgi_log("...The work of process %d is done. Seeya!\n", getpid());
-    exit(0);
+    if (prev) {
+        // Avoid showing same message from all threads.
+        uwsgi_log("...The work of process %d is done. Seeya!\n", getpid());
+    }
 }

 void goodbye_cruel_world() {
@@ -3619,6 +3621,11 @@ void uwsgi_ignition() {
         }
     }

+    // main thread waits other threads.
+    if (uwsgi.threads > 1) {
+        wait_for_threads();
+    }
+
     // end of the process...
     end_me(0);
 }
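
Taken together, the core/loop.c and core/uwsgi.c hunks describe a shutdown pattern: the thread that hits the limit clears `manage_next_request` and logs once, the other threads notice the cleared flag and leave their loops, and the main thread waits for them before the process ends normally. The Python sketch below only illustrates that pattern. The `Worker` class and its fields are hypothetical, and it uses a lock where the C code uses a plain read followed by a write, so it should not be read as uWSGI's implementation.

```python
import threading


class Worker:
    """Hypothetical stand-in for one worker process running several threads."""

    def __init__(self, threads, max_requests):
        self.manage_next_request = True
        self.max_requests = max_requests
        self.requests = 0
        self.messages = []
        self._lock = threading.Lock()
        self._threads = [
            threading.Thread(target=self._loop) for _ in range(threads)
        ]

    def _goodbye(self):
        # Clear the flag; only the call that actually cleared it logs,
        # so the message is not repeated by every thread.
        with self._lock:
            prev = self.manage_next_request
            self.manage_next_request = False
            if prev:
                self.messages.append("work is done")

    def _loop(self):
        # Each iteration stands for one handled request.
        while self.manage_next_request:
            with self._lock:
                self.requests += 1
                reached = self.requests >= self.max_requests
            if reached:
                self._goodbye()

    def run(self):
        for t in self._threads:
            t.start()
        # The main thread waits for the other threads, then returns normally,
        # which lets the interpreter run its exit handlers.
        for t in self._threads:
            t.join()


if __name__ == "__main__":
    worker = Worker(threads=8, max_requests=20)
    worker.run()
    print(worker.requests, worker.messages)
```

Because several threads may pass the flag check before it is cleared, the request count can end slightly above the limit, while the message is still recorded exactly once.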
44 changes: 44 additions & 0 deletions tests/threads_atexit.py
@@ -0,0 +1,44 @@
# https://github.com/unbit/uwsgi/pull/2615
# atexit should be called when reached max-requests.
#
# Start this app:
#
# $ ./uwsgi --http-socket :8000 --master -L --wsgi-file=tests/threads_atexit.py \
# --workers 1 --threads 32 --max-requests 40 --min-worker-lifetime 6 --lazy-apps
#
# Access to this app with hey[1]:
#
# # Do http access for 5 minutes with 32 concurrency
# $ ./hey -c 32 -z 5m 'http://127.0.0.1:8000/'
#
# Search how many stamp files:
#
# $ ls uwsgi_worker*.txt | wc -l
# 39 # should be 0
#
# [1] https://github.com/rakyll/hey

import atexit
import os
import sys
import time


pid = os.getpid()
stamp_file = f"./uwsgi_worker{pid}.txt"


with open(stamp_file, "w") as f:
    print(time.time(), file=f)


@atexit.register
def on_finish_worker():
    print(f"removing {stamp_file}", file=sys.stderr)
    os.remove(stamp_file)


def application(env, start_response):
    time.sleep(1)
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b"Hello World"]
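
The `ls uwsgi_worker*.txt | wc -l` check from the header comment of tests/threads_atexit.py can also be done from Python. This is a small sketch, not part of the commit; `leftover_stamp_files` is a hypothetical helper name, and the file pattern is taken from `stamp_file` in the test.

```python
import glob
import os


def leftover_stamp_files(directory="."):
    """Return the stamp files that no atexit handler has removed."""
    return sorted(glob.glob(os.path.join(directory, "uwsgi_worker*.txt")))


if __name__ == "__main__":
    files = leftover_stamp_files()
    print(f"{len(files)} stamp file(s) left")
    for path in files:
        print(" ", path)
```

A stamp file exists from the moment a worker loads the app until its atexit handler runs, so files belonging to workers that are still running are counted too.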
29 changes: 29 additions & 0 deletions tests/threads_heavy.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# https://github.com/unbit/uwsgi/pull/2615
# CPU heavy application in multi threaded uWSGI doesn't shutdown gracefully.
#
# $ ./uwsgi \
# --wsgi-file=threads_heavy.py --master --http-socket=:8000 \
# --workers=4 --threads=8 --max-requests=20 --min-worker-lifetime=6 -L \
# --worker-reload-mercy=20 2>&1 | tee uwsgi.log
#
# $ hey -c 16 -z 3m 'http://127.0.0.1:8000/'
#
# $ grep MERCY uwsgi.log
# Tue Mar 19 14:01:59 2024 - worker 1 (pid: 62113) is taking too much time to die...NO MERCY !!!
# Tue Mar 19 14:02:23 2024 - worker 2 (pid: 62218) is taking too much time to die...NO MERCY !!!
# ...
#
# This was caused by pthread_cancel() is called from non-main thread.

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


def application(env, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    n = 24
    r = fibonacci(n)
    s = f"F({n}) = {r}"
    return [s.encode()]
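
The `grep MERCY uwsgi.log` step from the header comment of tests/threads_heavy.py can be scripted to report which workers the master had to kill. This is a sketch, not part of the commit: `find_no_mercy` is a hypothetical helper, and the pattern is written against the two sample log lines quoted in that comment, so other uWSGI versions may word the message differently.

```python
import re

# Matches lines such as:
#   ... - worker 1 (pid: 62113) is taking too much time to die...NO MERCY !!!
MERCY_RE = re.compile(
    r"worker (\d+) \(pid: (\d+)\) is taking too much time to die\.\.\.NO MERCY"
)


def find_no_mercy(log_text):
    """Return (worker_id, pid) pairs for workers killed after the mercy timeout."""
    return [(int(wid), int(pid)) for wid, pid in MERCY_RE.findall(log_text)]


if __name__ == "__main__":
    with open("uwsgi.log") as f:
        for wid, pid in find_no_mercy(f.read()):
            print(f"worker {wid} (pid {pid}) was killed without mercy")
```

An empty result would suggest that every worker shut down within `--worker-reload-mercy`.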
