Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging chokes on non-string keys in WSGI environ. #2234

Closed
Oberon00 opened this issue Jan 10, 2020 · 10 comments
Closed

Logging chokes on non-string keys in WSGI environ. #2234

Oberon00 opened this issue Jan 10, 2020 · 10 comments

Comments

@Oberon00
Copy link

Forwarding a bug reported by @disfluxly as open-telemetry/opentelemetry-python#354:

This was with gunicorn==20.0.4. If the WSGI environ dictionary contains a non-string key (in this case a plain object value), the following traceback results:

Traceback (most recent call last):
  File "/Users/disflux/miniconda3/envs/flasktest/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 184, in handle_request
    self.log.access(resp, req, environ, request_time)
  File "/Users/disflux/miniconda3/envs/flasktest/lib/python3.6/site-packages/gunicorn/glogging.py", line 341, in access
    request_time))
  File "/Users/disflux/miniconda3/envs/flasktest/lib/python3.6/site-packages/gunicorn/glogging.py", line 323, in atoms
    atoms.update({"{%s}e" % k.lower(): v for k, v in environ_variables})
  File "/Users/disflux/miniconda3/envs/flasktest/lib/python3.6/site-packages/gunicorn/glogging.py", line 323, in <dictcomp>
    atoms.update({"{%s}e" % k.lower(): v for k, v in environ_variables})
AttributeError: 'object' object has no attribute 'lower'

Please see open-telemetry/opentelemetry-python#354 for a more details (kudos to @disfluxly for the detailed report).

Note that the WSGI spec does not demand anywhere that the environ may only contain string keys and even explicitly says:

the application is allowed to modify the dictionary in any way it desires

(https://www.python.org/dev/peps/pep-3333/#specification-details)

@benoitc
Copy link
Owner

benoitc commented Jan 10, 2020

well the environ is also expected to contain CGI-style environment variables where there key are always a string. I am not sure we should accept an arbitrary content there. It can be costly to check if the key is a string or not. I need to think a little about it, as the case never happened in the last 11 years.

Also unsure why opentelemetry is using objects there, that kind of weird... the way i would have done would have bene adding a "telemetry." string key with a custom object there.

@Oberon00
Copy link
Author

Oberon00 commented Jan 10, 2020

Yes, we have the report still open as a bug in OpenTelemetry and are considering working around it. But I think according to PEP 3333, OpenTelemetry is technically correct and gunicorn technically has a bug. In practice It might very well be the case that all keys being strings has become a de-facto requirement.

@benoitc
Copy link
Owner

benoitc commented Jan 10, 2020

@Oberon00 rereading the spec it says explicitly

Finally, the environ dictionary may also contain server-defined variables. These variables should be named using only lower-case letters, numbers, dots, and underscores, and should be prefixed with a name that is unique to the defining server or gateway. For example, mod_python might define variables with names like mod_python.some_variable.

IMO gunicorn has the correct implementation there. I will let the ticket open to see if others can have a different opinion.

@benoitc benoitc self-assigned this Jan 10, 2020
@Oberon00
Copy link
Author

This is only talking about server variables. I don't think OpenTelemetry should be considered a server and "the application is allowed to modify the dictionary in any way it desires". From the view of gunicorn, any WSGI-callable is an application.

@Oberon00
Copy link
Author

Anyway, if you think implementing this would be too costly, it should not be a real problem for OpenTelemetry to fix this on our side.

@benoitc
Copy link
Owner

benoitc commented Jan 13, 2020

@Oberon00 the server part is quite mixed in the environ setting. I would personnaly default to a string as key.

I'm very interrested myself to have native opentelemetry support in Gunicorn, so if I can help, let me know :)

@toumorokoshi
Copy link

awesome! I've filed a quick issue on opentelemetry-gunicorn integration here: https://github.com/open-telemetry/opentelemetry-python/issues/365. Although thinking through it theoretically it should be seamless.

@tilgovi
Copy link
Collaborator

tilgovi commented Apr 23, 2020

It sounds like we should close this, but if the cost of calling str() on each key is not prohibitive (should be very low for strings) then maybe that's a solution? Checking the type should not be expensive, but then a small performance cost adds up quickly when you're logging a lot.

@toumorokoshi
Copy link

I'll call out that we no longer have this issue on the opentelemetry side (using strings for sentinel values), so aside from correctness, there's no need to modify this behavior for opentelemetry's sake.

@tilgovi
Copy link
Collaborator

tilgovi commented Apr 23, 2020

I'll close this issue and it can be here for posterity. If it comes up again, maybe we can be motivated to address it. Thanks for the followup, @toumorokoshi.

@tilgovi tilgovi closed this as completed Apr 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants