Inconsistent Handling of Invalid Urls #1832
Comments
Thanks @cancan101 - Appreciate the neatly summarised issue. |
So... Good place to start with something like this is to unpick it into the smallest possible components, in order to figure out exactly what behaviour we want, and that is consistent with the API throughout. Here's where I got to after some first steps...
>>> u = httpx.URL('https://😇')  # Raises `InvalidURL`.
And in contrast...
>>> u = httpx.URL('https:/google.com')
>>> u.scheme
'https'
>>> u.host
''
>>> u.path
'/google.com'
Now. That's not necessarily undesired behaviour at this point. We might consider the first to be a strictly invalid URL, and the second to be a valid URL that just happens to be missing a hostname. We can also confirm that URLs instantiated with explicit portions end up the same way here...
>>> u = httpx.URL(scheme='https', path='/google.com')
>>> u
URL('https:/google.com')
So at this level of the API we are at least consistent. I'm somewhat surprised tho, at why that results in an Let's take a closer look...
>>> c = httpx.Client()
>>> r = c.build_request('GET', 'https:/google.com')
>>> r.url
URL('/google.com')
Hrm. Something's changed here once we've started |
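As a rough illustration of why 'https:/google.com' parses as a scheme plus a path with no host, the same split can be reproduced with the standard library. This sketch uses urllib.parse rather than httpx's own parser, so it is an analogy and not a statement about httpx internals; the components helper is a hypothetical name introduced only for this example.

```python
from urllib.parse import urlsplit


def components(url: str) -> tuple[str, str, str]:
    """Return (scheme, netloc, path) for a URL string.

    Hypothetical helper for illustration; it relies on the standard
    library parser, not on httpx.
    """
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


# A single slash after the scheme means there is no authority section,
# so everything after "https:" is treated as the path.
print(components("https:/google.com"))    # ('https', '', '/google.com')

# Three slashes give an empty authority followed by the path.
print(components("https:///google.com"))  # ('https', '', '/google.com')

# The well-formed URL has a non-empty host.
print(components("https://google.com/"))  # ('https', 'google.com', '/')
```

In both malformed cases the scheme survives parsing and only the host is empty, which is consistent with the u.scheme and u.host values shown above.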
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Still valid thx, @Stale bot. |
@tomchristie Lines 341 to 350 in 5b06aea
In this case, since u.host is '', this check returns false and so the URL is treated as a relative URL. This means that _merge_url tries to prepend self.base_url to the 'relative' URL, but since it was intended to be an absolute URL, no base_url is provided, and so the scheme is removed. Since there is no longer a scheme, we receive UnsupportedProtocol:
Lines 369 to 389 in 5b06aea
Is it sufficient to just check in |
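The chain of events described in this comment can be modelled with a small standalone sketch. Everything here is an assumption made for illustration: the exception classes are stand-ins for the httpx ones, and is_absolute, merge_url, check_scheme and check_has_host are hypothetical simplifications written against urllib.parse, not the referenced httpx source lines.

```python
from urllib.parse import urlsplit


class InvalidURL(Exception):
    """Hypothetical stand-in for httpx.InvalidURL."""


class UnsupportedProtocol(Exception):
    """Hypothetical stand-in for httpx.UnsupportedProtocol."""


def is_absolute(url: str) -> bool:
    # Models the check described above: a URL only counts as absolute
    # when it has both a scheme and a host.
    parts = urlsplit(url)
    return bool(parts.scheme) and bool(parts.hostname)


def merge_url(base_url: str, url: str) -> str:
    # Simplified model of the merge step: a "relative" URL contributes
    # only its path, so any scheme it carried is dropped.
    if is_absolute(url):
        return url
    return base_url.rstrip("/") + urlsplit(url).path


def check_scheme(url: str) -> None:
    # Models the later failure: without an http(s) scheme the request
    # cannot be sent.
    if urlsplit(url).scheme not in ("http", "https"):
        raise UnsupportedProtocol(f"missing or unsupported scheme in {url!r}")


def check_has_host(url: str) -> None:
    # One possible earlier check (an assumption, not the library's code):
    # reject http(s) URLs that have a scheme but no host.
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and not parts.hostname:
        raise InvalidURL(f"URL has a scheme but no host: {url!r}")


merged = merge_url("", "https:/google.com")
print(merged)  # /google.com
```

In this model, calling check_scheme(merged) raises UnsupportedProtocol, matching the reported behaviour, whereas running check_has_host on the original string before merging would raise InvalidURL instead.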
If I call httpx.get('https://😇'), an InvalidURL exception is raised. However, if I call httpx.get('https:/google.com'), I instead get an UnsupportedProtocol exception, which seems inconsistent. I would expect the latter to raise an InvalidURL as well.
For reference, requests raises an InvalidURL in both cases.
This issue also exists for requests.get('https:///google.com') vs httpx.get('https:///google.com').