Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issues surrounding X-Forwarded-For header in ProxyHeadersMIddleware #701

Merged
merged 1 commit into from
Nov 12, 2020

Conversation

tristan
Copy link
Contributor

@tristan tristan commented Jun 17, 2020

Attempting to fix #700

  • Returns 400 if the X-Forwarded-For header cannot be ascii decoded as (i believe) this is a client error rather than a server one.
  • Takes the first value from the list of addresses rather than the last (while there's no actual standard for this header, both MDN and wikipedia list the client always being the first, followed by any other proxies used) Removed as i think it should be handled in a separate PR.

I'm by no means an expert in how to write sensible asgi code yet, I just took what I could find from other parts of the code to make the 400 response here. If it's completely wrong I'm perfectly comfortable being told how to do it properly (or simply closing this PR in favour of a more appropriate one done by someone else)

@euri10
Copy link
Member

euri10 commented Jul 22, 2020

not sure about this fix.

from your snippet in the issue we have:


PyDev console: starting.
Python 3.7.6 (default, Mar 10 2020, 21:50:25) 
[GCC 8.3.0] on linux
atk = b'}__test|O:21:"JDatabaseDriverMysqli":3:{s:2:"fc";O:17:"JSimplepieFactory":0:{}s:21:"\\0\\0\\0disconnectHandlers";a:1:{i:0;a:2:{i:0;O:9:"SimplePie":5:{s:8:"sanitize";O:20:"JDatabaseDriverMysql":0:{}s:8:"feed_url";s:56:"die(md5(DIRECTORY_SEPARATOR));JFactory::getConfig();exit";s:19:"cache_name_function";s:6:"assert";s:5:"cache";b:1;s:11:"cache_class";O:20:"JDatabaseDriverMysql":0:{}}i:1;s:4:"init";}}s:13:"\\0\\0\\0connection";b:1;}\xf0\xfd\xfd\xfd'
atk.decode('ascii')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 427: ordinal not in range(128)
atk.decode('latin-1')
'}__test|O:21:"JDatabaseDriverMysqli":3:{s:2:"fc";O:17:"JSimplepieFactory":0:{}s:21:"\\0\\0\\0disconnectHandlers";a:1:{i:0;a:2:{i:0;O:9:"SimplePie":5:{s:8:"sanitize";O:20:"JDatabaseDriverMysql":0:{}s:8:"feed_url";s:56:"die(md5(DIRECTORY_SEPARATOR));JFactory::getConfig();exit";s:19:"cache_name_function";s:6:"assert";s:5:"cache";b:1;s:11:"cache_class";O:20:"JDatabaseDriverMysql":0:{}}i:1;s:4:"init";}}s:13:"\\0\\0\\0connection";b:1;}ðýýý'

After a quick search about headers encoding, it seems gunicorn dealt with that by encoding headers to latin-1 but I have to admit I read the issue quite rapidly, one good way to advance on the topic would be to summarize their thinking process and if we should / could mimic it ?

maybe others would have better ideas ?

benoitc/gunicorn#1778

@tristan
Copy link
Contributor Author

tristan commented Jul 22, 2020

You're right.
looking at the handling of headers in gunicorn, they use latin-1 to decode all their headers.

https://github.com/benoitc/gunicorn/blob/ed901637ff054939902ff2b1e7633a8cef4762f2/gunicorn/http/message.py#L67
which uses this method

def bytes_to_str(b):
    if isinstance(b, str):
        return b
    return str(b, 'latin1')

So it seems that would be the correct handling. As far as i can tell there is no additional handling of decode errors.

Additionally, the uvicorn wsgi middleware is already using latin-1 to decode the headers, so it probably makes sense to follow suit and do it here as well.

I'll update the PR

@adamzap
Copy link

adamzap commented Nov 10, 2020

Is there anything I can do to help move this pull request forward? I'm having the same problem that the author described in #700, and this seems to fix the issue. Thanks!

@euri10 euri10 requested a review from cdeler November 11, 2020 10:33
Copy link
Member

@euri10 euri10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, would like another pair of eyes though to be sure it has no unintended consequences,, the change of codec seems legit to me given what has been already said on it in the conversation, but better safe than sorry

@adamzap
Copy link

adamzap commented Nov 11, 2020

Thanks for the quick response! I totally understand.

Copy link
Member

@cdeler cdeler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
A small Flask enpoint eats this curl request:

curl -v -H 'X-Forwarded-For: }__test|O:21:"JDatabaseDriverMysqli":3:{s:2:"fc";O:17:"JSimplepieFactory":0:{}s:21:"\\0\\0\\0disconnectHandlers";a:1:{i:0;a:2:{i:0;O:9:"SimplePie":5:{s:8:"sanitize";O:20:"JDatabaseDriverMysql":0:{}s:8:"feed_url";s:56:"die(md5(DIRECTORY_SEPARATOR));JFactory::getConfig();exit";s:19:"cache_name_function";s:6:"assert";s:5:"cache";b:1;s:11:"cache_class";O:20:"JDatabaseDriverMysql":0:{}}i:1;s:4:"init";}}s:13:"\\0\\0\\0connection";b:1;}ðýýý' http://127.0.0.1:5000/ -o/dev/null
* Connected to 127.0.0.1 (127.0.0.1) port 5000 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:5000
> User-Agent: curl/7.73.0
> Accept: */*
> X-Forwarded-For: }__test|O:21:"JDatabaseDriverMysqli":3:{s:2:"fc";O:17:"JSimplepieFactory":0:{}s:21:"\\0\\0\\0disconnectHandlers";a:1:{i:0;a:2:{i:0;O:9:"SimplePie":5:{s:8:"sanitize";O:20:"JDatabaseDriverMysql":0:{}s:8:"feed_url";s:56:"die(md5(DIRECTORY_SEPARATOR));JFactory::getConfig();exit";s:19:"cache_name_function";s:6:"assert";s:5:"cache";b:1;s:11:"cache_class";O:20:"JDatabaseDriverMysql":0:{}}i:1;s:4:"init";}}s:13:"\\0\\0\\0connection";b:1;}ðýýý
>
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text/html; charset=utf-8
< Content-Length: 2484
< Server: Werkzeug/1.0.1 Python/3.7.7
< Date: Thu, 12 Nov 2020 08:56:04 GMT
<

I found an extra link in addition to @euri10 's benoitc/gunicorn#1778 . It's a PR in werkzeug (pallets/werkzeug#1346), so that probably we should do the same

but yeah, from my point of view an extra PR with :

Takes the first value from the list of addresses rather than the last (while there's no actual standard for this header, both MDN and wikipedia list the client always being the first, followed by any other proxies used)

should be fired

@euri10
Copy link
Member

euri10 commented Nov 12, 2020

Looks good, but yeah, from my point of view an extra PR with :

Takes the first value from the list of addresses rather than the last (while there's no actual standard for this header, both MDN and wikipedia list the client always being the first, followed by any other proxies used)

should be fired

you got it here @cdeler
#591

@euri10 euri10 merged commit 45e6e83 into encode:master Nov 12, 2020
@euri10 euri10 mentioned this pull request Nov 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UnicodeDecodeError when decoding bad headers
4 participants