Review Gunicorn worker type #1062

SpacemanPaul · 2024-08-12T22:11:43Z

Description

PR #1061 removed the --threads argument to gunicorn as it was incompatible with the chosen worker type (gevent) and was not being read.

Should we switch to the gthread worker type and reinstate the --threads option?
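For illustration, the proposed switch could look roughly like this in a Gunicorn config file. This is a minimal sketch, not the project's actual configuration: `worker_class`, `workers` and `threads` are real Gunicorn settings, but the values chosen here are assumptions that would need benchmarking.

```python
# gunicorn.conf.py (hypothetical file; the values below are placeholders)

# Threaded worker: each worker process serves requests from a thread pool.
worker_class = "gthread"

# Number of worker processes (assumed value, tune per pod CPU allocation).
workers = 2

# Threads per worker process. This is the setting that --threads controls
# on the command line; it is only meaningful for the gthread worker.
threads = 4
```

The command-line equivalent would be `--worker-class gthread --workers 2 --threads 4`.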

AFAIAA the only thread-safety issue in OWS is the use of non-re-entrant matplotlib code in automated legend generation, which could easily be wrapped in a thread lock. Given that legend requests make up only a tiny fraction of incoming requests in normal use, a thread lock here shouldn't be a problem.

omad · 2024-08-12T23:37:58Z

100% it was dumb to have --threads with the gevent worker type! I thought I'd removed it months ago, but it must have been in Explorer, not here.

There are issues beyond thread safety: we also need to actually get better concurrency. At the least, anything that's doing blocking IO needs to release the GIL.

We can postulate as much as we want, but the only way to know is empirical testing.

Adding a thread lock to the legend generation sounds wise!

benjimin · 2024-08-13T00:33:20Z

We absolutely ought to be profiling OWS in production and/or using OWS benchmarks to decide.

The gevent choice is meant to monkey-patch our code to facilitate asynchronous Python sockets (as async coroutines supposedly work very well for Python networking), but I'm guessing (having not profiled it) that a lot of OWS handler latency is spent on network exchanges done inside compiled GDAL or postgres libraries (which gevent.monkey might fail to wrap and instead block on?).

If GDAL (and the postgres driver) are now threadsafe, then native threads (with a lot more than the previous 2 threads per worker, probably limited by raster memory requirements?) might help ensure the pod doesn't waste resources leaving its CPU waiting on IO (when saturated with requests). If not, we could experiment with much higher numbers of simple (e.g. sync) worker processes.

SpacemanPaul · 2024-08-13T01:08:25Z

Postgres index driver is thread-safe*. According to rasterio, GDAL is pretty good about releasing the GIL: https://rasterio.readthedocs.io/en/stable/topics/concurrency.html

In OWS 1.8, the index driver has a whole other layer of multi-thread-proofing that does nothing useful (and possibly never did) - this is removed in OWS 1.9.

benjimin · 2024-08-13T04:49:32Z

Releasing the GIL is one thing, but is it thread safe? (I thought you used to not be able to read two different files in different threads because a lot of GDAL API settings were globals?)

https://gdal.org/user/multithreading.html

"Unless otherwise stated, no GDAL public C functions and C++ methods should be assumed to be thread-safe."

benjimin · 2024-08-13T05:06:19Z

I think we should create a regression test in this repo, with a dummy index in a postgres container and a dummy data collection (maybe hosted by nginx or maybe some mock S3), and then have the test-runner artificially add significant latency to all packets exchanged during the test (so instead of the test executing instantly the database query takes 2s, each raster retrieval takes 2s), and hit the ows container with external async requests so that we can time the concurrent request handling (and fail if it is as slow as handling requests sequentially).
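The timing check at the core of such a test could be sketched as below. This is only an outline under stated assumptions: `time_concurrent` and `handled_concurrently` are hypothetical helpers, `request_fn` stands in for whatever issues one request to the OWS container, and the 0.5 tolerance is an arbitrary placeholder.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_concurrent(request_fn, n_requests):
    """Fire n_requests calls of request_fn at once; return wall-clock seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        # list() forces every call to finish and re-raises any exception.
        list(pool.map(lambda _: request_fn(), range(n_requests)))
    return time.perf_counter() - start


def handled_concurrently(elapsed, n_requests, per_request_latency, tolerance=0.5):
    """True if the batch finished well under the purely sequential time."""
    sequential = n_requests * per_request_latency
    return elapsed < sequential * tolerance
```

In the real test, `request_fn` would perform an HTTP request against the OWS container, and `per_request_latency` would be derived from the injected delay (e.g. roughly 2s for the database query plus 2s per raster retrieval).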

For nginx hosted assets you could just use an echo_sleep module, but to cover all services we could try something similar to tc qdisc add dev eth0 root netem delay 2000ms (maybe lo instead of eth0, would need to fiddle more..). Actually, here's an example:

https://medium.com/@kazushi/simulate-high-latency-network-using-docker-containerand-tc-commands-a3e503ea4307