-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Forwarded, X-Forwarded-Scheme and X-Forwarded-Host for better scheme and host resolution #1881
Conversation
CHANGES.rst
Outdated
@@ -41,6 +41,8 @@ Changes | |||
|
|||
- Do not unquote `+` in match_info values #1816 | |||
|
|||
- Use Forwarded, X-Forwarded-Scheme and X-Forwarded-Host for better scheme and host resolution. #1134 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 here. Please split this to 2 lines or more of up to 79 characters per line. For more information on PEP8 see the www.python.org page on PEP8.
Implementation looks good on fist glaze. |
does c2848ed cover the deprecation of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems alright for now.
Ping @asvetlov |
Sorry, I'm on US PyCon right now and quite busy. |
https://tools.ietf.org/html/rfc7239 says that |
aiohttp/web_request.py
Outdated
forwarded = self._message.headers.get(hdrs.FORWARDED) | ||
if forwarded is not None: | ||
parsed = re.findall( | ||
r'^by=([^;]*); +for=([^;]*); +host=([^;]*); +proto=(https?)$', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect the parser should be more complicated.
The header field value can be
defined in ABNF syntax as:
Forwarded = 1#forwarded-element
forwarded-element =
[ forwarded-pair ] *( ";" [ forwarded-pair ] )
forwarded-pair = token "=" value
value = token / quoted-string
token = <Defined in [RFC7230], Section 3.2.6>
quoted-string = <Defined in [RFC7230], Section 3.2.6>
It means the forwarded-pair
elements could be in random order: by=a; host=b.com
and host=b.com, by=a
are both valid combinations.
Also value could be optionally quoted, aiohttp shoud unquote it automatically.
pair-name is case-insensitive: all 'Host', 'host', and 'HOst` are valid names:
Forwarded: for="_gazonk"
Forwarded: For="[2001:db8:cafe::17]:4711"
Forwarded: for=192.0.2.60;proto=http;by=203.0.113.43
Forwarded: for=192.0.2.43, for=198.51.100.17
Note that as ":" and "[]" are not valid characters in "token", IPv6
addresses are written as "quoted-string".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note: the lase example contains two for
pairs, it's not supported by your regexp now.
😊 I should have been way more thorough. Sorry about this, I will go
fix. Thanks for looking into it!
…On Sat, May 20, 2017 at 7:59 AM, Andrew Svetlov ***@***.***> wrote:
***@***.**** requested changes on this pull request.
------------------------------
In aiohttp/web_request.py
<#1881 (comment)>:
> @@ -151,15 +151,30 @@ def secure(self):
return self.url.scheme == 'https'
@reify
+ def _forwarded(self):
+ forwarded = self._message.headers.get(hdrs.FORWARDED)
+ if forwarded is not None:
+ parsed = re.findall(
+ r'^by=([^;]*); +for=([^;]*); +host=([^;]*); +proto=(https?)$',
I suspect the parser should be more complicated.
The header field value can be
defined in ABNF syntax as:
Forwarded = 1#forwarded-element
forwarded-element =
[ forwarded-pair ] *( ";" [ forwarded-pair ] )
forwarded-pair = token "=" value
value = token / quoted-string
token = <Defined in [RFC7230], Section 3.2.6>
quoted-string = <Defined in [RFC7230], Section 3.2.6>
It means the forwarded-pair elements could be in random order: by=a; host=
b.com and host=b.com, by=a are both valid combinations.
Also value could be optionally quoted, aiohttp shoud unquote it
automatically.
pair-name is case-insensitive: all 'Host', 'host', and 'HOst` are valid
names:
Forwarded: for="_gazonk"
Forwarded: For="[2001:db8:cafe::17]:4711"
Forwarded: for=192.0.2.60;proto=http;by=203.0.113.43
Forwarded: for=192.0.2.43, for=198.51.100.17
Note that as ":" and "[]" are not valid characters in "token", IPv6
addresses are written as "quoted-string".
------------------------------
In aiohttp/web_request.py
<#1881 (comment)>:
> @@ -151,15 +151,30 @@ def secure(self):
return self.url.scheme == 'https'
@reify
+ def _forwarded(self):
+ forwarded = self._message.headers.get(hdrs.FORWARDED)
+ if forwarded is not None:
+ parsed = re.findall(
+ r'^by=([^;]*); +for=([^;]*); +host=([^;]*); +proto=(https?)$',
Please note: the lase example contains two for pairs, it's not supported
by your regexp now.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1881 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACjx1nCRcX1AMTWrK0Ev9NY-YbwoSrxfks5r7oFTgaJpZM4NYgE5>
.
|
Alright, this actually adds a public
Don't you think the place for parsing the Forwarded header is rather a middleware? I'm in doubt myself now. Otherwise, what's a good place for the regex constants? Would be nice to compile the regex only once, but module level doesn't do much for esthetics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love .req.forwarded
property.
No, we don't need a middleware for it.
Global variables for regexps are pretty fine. If you don't like this approach you could put them ad class variables into BaseRequest
aiohttp/web_request.py
Outdated
if hdrs.FORWARDED in self._message.headers: | ||
for forwarded_elm in self._message.headers.getall(hdrs.FORWARDED): | ||
if len(forwarded_elm): | ||
forwarded_pairs = tuple( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't create a tuple but make a generator comprehension. It saves both time and memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please support Forwarded: by=first, by=second
form, it's equal to
Forwarded: by=first
Forwarded: by=second
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is supported by the regexes and I've included tests for it
aiohttp/web_request.py
Outdated
params = {'by': [], 'for': [], 'host': [], 'proto': []} | ||
if hdrs.FORWARDED in self._message.headers: | ||
for forwarded_elm in self._message.headers.getall(hdrs.FORWARDED): | ||
if len(forwarded_elm): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop this check, iteration over empty forwarded_elm
is pretty fine.
aiohttp/web_request.py
Outdated
proto=tuple(...), ) | ||
""" | ||
params = {'by': [], 'for': [], 'host': [], 'proto': []} | ||
if hdrs.FORWARDED in self._message.headers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop the check, in
is not free.
Use for forwarded_elm in self._message.headers.getall(hdrs.FORWARDED, [])
instead
aiohttp/web_request.py
Outdated
r'[bB][yY]|[fF][oO][rR]|[hH][oO][sS][tT]|[pP][rR][oO][tT][oO]') | ||
|
||
_FORWARDED_PAIR = ( | ||
r'^ *({forwarded_params})=({token}|{quoted_string}) *$'.format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC spaces around =
are allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure but as far as I can see that's not the case. Section 4 only says forwarded-pair = token "=" value
and section 7 recalls Note that an HTTP list allows white spaces to occur between the identifiers, and the list may be split over multiple header fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO it's rare case but still allowed
aiohttp/web_request.py
Outdated
continue | ||
param = forwarded_pair[0][0].lower() | ||
value = forwarded_pair[0][1] | ||
if len(value) and value[0] == '"': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a check for value[-1] == '"'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If empty value allowed? I mean len(value) == 0
.
Also if value:
gives the same result as if len(value)
but works a bit faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to check for a closing quote: the regex validates quoted-string
so any value starting with "
must also have a closing "
.
A token
may be empty (but note again that I don't check the syntax in section 6)
aiohttp/web_request.py
Outdated
Returns a dict(by=tuple(...), for=tuple(...), host=tuple(...), | ||
proto=tuple(...), ) | ||
""" | ||
params = {'by': [], 'for': [], 'host': [], 'proto': []} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the property's type should be MultyDict
instead of dict of lists.
It preserves all information about proxies in right order but original host still could be checked as req.forwarded.get('host')
aiohttp/web_request.py
Outdated
value = _QUOTED_PAIR_REPLACE_RE.sub( | ||
r'\1', value[1:-1]) | ||
params[param].append(value) | ||
return params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return MultiDictProxy(params)
-- it's an immutable view.
aiohttp/web_request.py
Outdated
Returns a dict(by=tuple(...), for=tuple(...), host=tuple(...), | ||
proto=tuple(...), ) | ||
""" | ||
params = {'by': [], 'for': [], 'host': [], 'proto': []} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use MultiDict
here instead of dict of pairs. It preserves all information about proxies.
aiohttp/web_request.py
Outdated
""" | ||
params = {'by': [], 'for': [], 'host': [], 'proto': []} | ||
if hdrs.FORWARDED in self._message.headers: | ||
for forwarded_elm in self._message.headers.getall(hdrs.FORWARDED): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop previous line, in
is not free but getall(hdrs.FORWARDED, ())
does the check using single call to data structure.
I forgot Python < 3.5 doesn't like an arbitrary number of unpackings ( |
Yes, we sill support Python 3.4.2+ |
I seem to have missed a couple of things:
Regarding OWS around I looked for examples of whitespace around I'll do my best to add some more commits later this week. When are you planning the next release, @asvetlov? |
Ok, I'm convinced regarding to spaces. |
…r instead of tchar), changed return type of forwarded
@asvetlov it is your call. I am busy with different project in any case |
d1ba86b fixes the last issues, as far as I can see.
As far as the return type goes, I've changed that to a tuple containing MappingProxies. The semantics of the Forwarded header are so that every proxy may add a Forwarding field-value;
If a field-value added by a proxy cannot be parsed, the code will add a Let me know what you think, @asvetlov. |
Awesome work @evertlammerts |
For anyone else chasing their tail trying to figure out where this feature has gone, it has been removed and moved into middleware. See here: |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs. |
What do these changes do?
They enhance the resolution mechanism for scheme and host in
aiohttp.web_request.Baserequest
.The scheme is resolved based on:
secure_proxy_ssl_header
argument ofBaseRequest.__init__
Forwarded
headerX-Forwarded-Proto
headerin that order.
The host is resolved based on:
Forwarded
headerX-Forwarded-Host
headerHost
headerin that order.
Are there changes in behavior for the user?
This PR does not include changes in the contract.
Related issue number
#1134 Support X-Forwarded-* and Forwarded implicitly, deprecate secure_proxy_ssl_header.
Checklist
CONTRIBUTORS.txt
CHANGES.rst
#issue_number
format at the end of changelog message. Use Pull Request number if there are no issues for PR or PR covers the issue only partially.