-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Review Gunicorn worker type #1062
Comments
100% it was dumb to have There's issues beyond thread safety, we also need to actually get better concurrency. At the least, anything that's doing blocking IO needs to release the GIL. We can postulate as much as we want, but the only way to know is empirical testing. Adding a thread lock to the legend generation sounds wise! |
We absolutely ought to be profiling OWS in production and/or using OWS benchmarks to decide. The gevent choice is meant to monkey-patch our code to facilitate asynchronous python sockets (as async coroutines supposedly work very well for python networking), but I'm guessing (having not profiled it) that a lot of OWS handler latency is spend on network exchanges done inside of compiled GDAL or postgres libraries (which If GDAL (and the postgres driver) are now threadsafe, then native threads (with a lot more than the previous 2 threads per worker, probably limited by raster memory requirements?) might help ensure the pod doesn't waste resources leaving its CPU waiting on IO (when saturated with requests). If not, could experiment with much higher numbers of simple (e.g. sync) worker processes.. |
Postgres index driver is thread-safe*. According to rasterio, GDAL is pretty good about releasing the GIL: https://rasterio.readthedocs.io/en/stable/topics/concurrency.html In OWS 1.8, the index driver has a whole other layer of multi-thread-proofing that does nothing useful (and possibly never did) - this is removed in OWS 1.9. |
Releasing the GIL is one thing, but is it thread safe? (I thought you used to not be able to read two different files in different threads because a lot of GDAL API settings were globals?) https://gdal.org/user/multithreading.html
|
I think we should create a regression test in this repo, with a dummy index in a postgres container and a dummy data collection (maybe hosted by nginx or maybe some mock S3), and then have the test-runner artificially add significant latency to all packets exchanged during the test (so instead of the test executing instantly the database query takes 2s, each raster retrieval takes 2s), and hit the ows container with external async requests so that we can time the concurrent request handling (and fail if it is as slow as handling requests sequentially). For nginx hosted assets you could just use an echo_sleep module, but to cover all services we could try something similar to |
Description
PR #1061 removed the
--threads
argument to gunicorn as it was incompatible with the chosen worker type (gevent
) and was not being read.Should we switch to the
gthreads
worker-type and reinstate the--threads
option?AFAIAA the only thread-safety issue in OWS is the use of non-re-entrant matplotlib code in automated legend generation, which could easily be wrapped in a thread lock. Given that legend requests make up only a tiny fraction of incoming requests in normal use, a thread lock here shouldn't be a problem.
The text was updated successfully, but these errors were encountered: