Add tests to test_invalid_url for InvalidSchema
#2222
I have created urllib3/urllib3#466 to possibly handle the "oddball schemes" in urllib3 directly. EDIT: not their job.
@jvantuyl
@blueyed I elaborated on this change in the issue you opened on shazow/urllib3. This allows for third-party developers to create adapters using the
Basically, it makes requests extensible with other protocols. It doesn't add support, but it makes it at least possible. Two implementations (which I haven't tested in ages): I originally used them for testing and ad-hoc usage of some CLI tools I built for deployment.
Also I'm fairly certain it allowed @asmeurer to use the file protocol in conda/conda.
And, syntactically, "localhost:8000" does parse as a URL. It just uses the unknown scheme localhost. Interestingly, that's exactly the error you're getting (although as an AssertionError, which is probably bad form). Perhaps you could replace that assertion with a different error (say UnrecognizedScheme), check for it, then implement the fallback behavior?
I see. So it's probably expected and sane to get a I've thought about adding a regex that would let it fall through, but it doesn't change much in the end. The expected outcome should get added as test however.
I do not understand this. It appears to be a proper exception already ( I came to this issue through pip and the handling of its proxy setting initially (IIRC, and unrelated).
You are correct. I was just responding to the raising of AssertionError with a made-up error. I forgot that such an exception existed. Looks like that's been noted as #2222.
conda uses https://github.com/conda/conda/blob/e082781cad83e0bc6a41a2870b605f4ee08bbd4d/conda/connection.py#L74-L120. I think I took it from pip.
My problems centered around urllib3.util.url.parse_url doing some interesting splitting when certain characters were involved and then further mangling inside of requests.models.prepare_url. If you have the wrong characters in your URL, interesting things can happen (and all of them are pretty clearly incorrect). I suspect that those same issues wouldn't be any better for anybody else if someone has exciting and interesting file paths. :/
3921bac to c2486a0
I've changed the PR to add more tests, which document/test the current behavior.
localhost:port
.InvalidSchema
This adds tests for the behavior introduced in b149be5, where `PreparedRequest` was made to skip `parse_url` for e.g. `localhost:3128/`.
c2486a0 to d3566ee
Suits me. 🍰
Add tests to test_invalid_url for `InvalidSchema`
@@ -78,6 +79,12 @@ def test_entry_points(self):
    def test_invalid_url(self):
        with pytest.raises(MissingSchema):
            requests.get('hiwpefhipowhefopw')
        with pytest.raises(InvalidSchema):
            requests.get('localhost:3128')
        with pytest.raises(InvalidSchema):
This test and the test below are exactly the same test case since urlparse parses them in the exact same way:
>>> urlparse.urlparse('localhost.localdomain:3128/')
ParseResult(scheme='localhost.localdomain', netloc='', path='3128/', params='', query='', fragment='')
>>> urlparse.urlparse('10.122.1.1:3128/')
ParseResult(scheme='10.122.1.1', netloc='', path='3128/', params='', query='', fragment='')
They should be consolidated.
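For anyone reproducing the parse above on Python 3, here is a minimal sketch using `urllib.parse.urlsplit`. The helper name `split_scheme` is made up for illustration. Note that the result for the IP-address form may differ between Python versions, because newer parsers may require a scheme to begin with a letter, so only the hostname case is treated as stable here.

```python
from urllib.parse import urlsplit


def split_scheme(url):
    """Return (scheme, path) as the standard library parser sees them.

    Hypothetical helper, not part of requests or urllib3.
    """
    parts = urlsplit(url)
    return parts.scheme, parts.path


# The hostname before the colon is taken as the scheme.
print(split_scheme('localhost.localdomain:3128/'))

# The IP-address form may or may not be split the same way,
# depending on the Python version in use.
print(split_scheme('10.122.1.1:3128/'))
```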
@sigmavirus24
I would argue that this is an implementation detail, and the test should be independent of this, and it's better to have more tests than fewer.
It might happen that the test for skipping wouldn't catch IP addresses, but only domains.
For example, given an IP address and/or the dots in the "scheme", it would be possible to not use this as a scheme. These tests are meant to keep this behavior.
They are still functionally equivalent. They're not an implementation detail because as I mentioned, RFC 3986 has no way of identifying that the part before the `:` here is not a scheme. So any specification compliant implementation will do this, ergo it's a specification detail that makes these tests functionally equivalent.
What I meant is that this tests the skipping code: if this was changed, e.g. by using a more sophisticated regex, the behavior might change and this additional test might catch it.
We don't currently have a regular expression that does this. We have 3 options as I see them:
- continue to rely on `urlparse` from the standard library
- use `urllib3`'s URL parser
- use `rfc3986`'s URI parser
They all, to varying degrees, follow the specification and will have very similar, if not exactly the same behaviour. We are far more likely to rely on third party libraries that do things efficiently than on regular expressions that we put together ourselves.
I am referring to this code (https://github.com/kennethreitz/requests/blob/master/requests/models.py#L337-L342):
    if ':' in url and not url.lower().startswith('http'):
        self.url = url
        return
If this were changed, an IP address might get handled differently from a hostname.
By the way, `urllib3`'s URL parser is being used further down here.
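To make the condition quoted above easier to poke at, here is a small standalone sketch that isolates just the check. The function name `skips_url_parsing` is hypothetical and not something defined in requests. It only mirrors the `if` condition, not the rest of `prepare_url`.

```python
def skips_url_parsing(url):
    """Mirror the quoted condition: True when the URL contains a colon
    and does not start with 'http' (case-insensitive).

    Hypothetical helper for illustration; not part of requests.
    """
    return ':' in url and not url.lower().startswith('http')


for candidate in ('localhost:3128',
                  'localhost.localdomain:3128/',
                  '10.122.1.1:3128/',
                  'http://localhost:3128/',
                  'hiwpefhipowhefopw'):
    print(candidate, skips_url_parsing(candidate))
```

With the check as written, hostname and IP-address inputs take the same branch, since it looks only at the colon and the `http` prefix. A more selective check could separate them, which is the case the extra tests are meant to guard.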
Ah, forgive me @blueyed. I was mistaking the discussion here for one of the other issues you've filed recently.
We haven't come across that issue in conda, but assumedly the fix is to first
Document skipping in PreparedRequest; followup to #2222
In b149be5 `PreparedRequest` was made to skip `parse_url` for e.g. `$HOST:$PORT`, which results in `MissingSchema` not being raised for `localhost:3128`.