SyncWorker.wait() method returns None #983

awol · 2015-02-17T03:44:00Z

I am running gunicorn 19.2.0 serving Django 1.7.4 behind nginx as a proxy on CentOS 7.

When the timeout passes, the call to select.select() in the wait() method of the SyncWorker class returns a tuple of empty lists. wait() then falls through to the end of the function and returns None to its caller. The caller (run_for_multiple in my case) raises an exception when it tries to iterate over the result of wait().
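To illustrate the failure mode, here is a minimal sketch that reproduces it outside gunicorn. The `wait` function below is a hypothetical stand-in written for this example, not gunicorn's actual code; it only mirrors the shape described above (return the ready sockets, otherwise fall off the end).

```python
import select
import socket


def wait(sockets, timeout):
    # Hypothetical stand-in for the worker's wait(); not gunicorn's code.
    ready = select.select(sockets, [], [], timeout)
    if ready[0]:
        return ready[0]
    # On timeout select() returns ([], [], []) without raising, so control
    # falls off the end of the function and Python returns None implicitly.


# A listening socket that nobody connects to, so select() must time out.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

result = wait([listener], 0.1)

error = None
try:
    for sock in result:  # iterating over None raises TypeError
        pass
except TypeError as exc:
    error = exc
finally:
    listener.close()
```

After this runs, `result` is None and `error` holds the TypeError raised by the loop, which matches the exception seen in the caller.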

As a result the worker dies and is restarted by the master. Since I am in a low-volume world, this happens for each worker every timeout seconds (15.0) and is really filling my logs.

Is this the expected behaviour? Perhaps it is an artefact of select() not raising an exception when the timeout expires, where other calls would.

My workaround is to return an empty list at the end of the wait method. An alternative would be to return self.sockets, as is done in the event of an EINTR exception from the select.select() call. I am not sure which is more idiomatic (if either) for the gunicorn model, as I have not had the opportunity to investigate the impact of the workaround further up the stack. However, the empty list approach is (apparently) working well for me.
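A rough sketch of the empty-list workaround, for reference. The function name `wait_with_fallback` and its free-standing form are assumptions made for this example; the real method lives on the worker class and may differ in detail. The EINTR branch reflects the behaviour described above (returning all sockets), though on Python 3.5+ select() retries on EINTR by itself, so that branch is mostly relevant to older interpreters.

```python
import errno
import select
import socket


def wait_with_fallback(sockets, timeout):
    # Hypothetical free-standing version of the workaround; not gunicorn's code.
    try:
        ready = select.select(sockets, [], [], timeout)
        if ready[0]:
            return ready[0]
    except OSError as e:
        if e.errno == errno.EINTR:
            # Interrupted by a signal: hand back every socket, as described.
            return sockets
        raise
    # Timeout with nothing ready: return an empty list so callers can
    # still iterate over the result instead of hitting None.
    return []


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

result = wait_with_fallback([listener], 0.1)
handled = [sock for sock in result]  # iterates zero times, no exception
listener.close()
```

With no pending connection, `result` is an empty list and the caller's loop simply does nothing for that cycle.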


tilgovi · 2015-02-17T05:38:10Z

Added to R19.3 milestone because this is not a pleasant experience at all and makes it seem like something is very wrong when it is not. Thanks for the detailed report.

benoitc · 2015-02-18T09:02:37Z

If we return all the sockets here, it would mean we could miss a connection on one of the sockets. Imo a better way would be to go back around the loop if no socket is ready to accept, and wait for at least one.

One possible issue with doing this is the thundering herd problem, but that could be handled later. Thoughts?
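One way to read this suggestion is sketched below: keep calling select() until at least one socket is ready, rather than returning on timeout. This is only an illustration of the idea, not the change that was eventually committed; `wait_until_ready`, `keep_running`, and `on_timeout` are hypothetical names introduced for the example.

```python
import select
import socket


def wait_until_ready(sockets, timeout,
                     keep_running=lambda: True,
                     on_timeout=lambda: None):
    # Hypothetical sketch: loop until select() reports a readable socket.
    while keep_running():
        ready, _, _ = select.select(sockets, [], [], timeout)
        if ready:
            return ready
        # Timed out with nothing to accept. A worker would typically do
        # its periodic housekeeping here (assumption), then wait again.
        on_timeout()
    # Asked to stop before any socket became ready.
    return []


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

# With a pending connection, the listener is reported as ready.
client = socket.create_connection(listener.getsockname())
ready = wait_until_ready([listener], 1.0)

# With the stop condition already false, the loop exits immediately.
stopped = wait_until_ready([listener], 0.1, keep_running=lambda: False)

client.close()
listener.close()
```

Because every idle worker would wake on the same listening sockets when a connection arrives, this shape is where the thundering herd concern comes from.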

jfarrimo · 2015-03-06T03:16:25Z

I'm encountering this problem as well. One side-effect is that clients making http requests to gunicorn occasionally get "socket hang up" errors because there are apparently no free workers to service the request. This isn't a giant problem for me, and doesn't happen very frequently, but it does show that this problem has real-world implications and is not totally benign. I'm eagerly awaiting a new gunicorn release that fixes this.

tilgovi added this to the R19.3 milestone Feb 17, 2015

benoitc modified the milestones: R19.2 19.2.1, R19.3 Feb 18, 2015

benoitc closed this as completed in 803a2d7 Mar 6, 2015
