Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tests to test_invalid_url for InvalidSchema #2222

Merged
merged 1 commit into from
Sep 12, 2014
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 9 additions & 2 deletions test_requests.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,9 @@
from requests.compat import (
Morsel, cookielib, getproxies, str, urljoin, urlparse, is_py3, builtin_str)
from requests.cookies import cookiejar_from_dict, morsel_to_cookie
from requests.exceptions import (InvalidURL, MissingSchema, ConnectTimeout,
ReadTimeout, ConnectionError, Timeout)
from requests.exceptions import (ConnectionError, ConnectTimeout,
InvalidSchema, InvalidURL, MissingSchema,
ReadTimeout, Timeout)
from requests.models import PreparedRequest
from requests.structures import CaseInsensitiveDict
from requests.sessions import SessionRedirectMixin
Expand Down Expand Up @@ -78,6 +79,12 @@ def test_entry_points(self):
def test_invalid_url(self):
with pytest.raises(MissingSchema):
requests.get('hiwpefhipowhefopw')
with pytest.raises(InvalidSchema):
requests.get('localhost:3128')
with pytest.raises(InvalidSchema):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test and the test below are exactly the same test case since urlparse parses them in the exact same way:

>>> urlparse.urlparse('localhost.localdomain:3128/')
ParseResult(scheme='localhost.localdomain', netloc='', path='3128/', params='', query='', fragment='')
>>> urlparse.urlparse('10.122.1.1:3128/')
ParseResult(scheme='10.122.1.1', netloc='', path='3128/', params='', query='', fragment='')

They should be consolidated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sigmavirus24
I would argue that this is an implementation detail, and the test should be independent of this, and it's better to have more tests than less.
It might happen that the test for skipping wouldn't catch IP addresses, but only domains.

For example, given an IP address and/or the dots in the "scheme", it would be possible to not use this as a scheme. These tests are meant to keep this behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are still functionally equivalent. They're not an implementation detail because as I mentioned, RFC 3986 has no way of identifying that the part before the : here is not a scheme. So any specification compliant implementation will do this, ergo it's a specification detail that makes these tests functionally equivalent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I've meant is that this tests the skipping code: if this was changed, e.g. by using a more sophisticated regex, the behavior might change and this additional test might catch it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently have a regular expression that does this. We have 3 options as I see them:

  1. continue to rely on urlparse from the standard library
  2. use urllib3's URL parser
  3. use rfc3986's URI parser

They all, to varying degrees, follow the specification and will have very similar, if not exactly the same behaviour. We are far more likely to rely on third party libraries that do things efficiently to regular expressions that we put together ourselves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am referring to this code (https://github.com/kennethreitz/requests/blob/master/requests/models.py#L337-L342):

    if ':' in url and not url.lower().startswith('http'):
        self.url = url
        return

If this would get changed, an IP address might get handled different from a hostname.

Here btw urllib3s URL parser is being used further down.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, forgive me @blueyed. I was mistaking the discussion here for one of the other issues you've filed recently.

requests.get('localhost.localdomain:3128/')
with pytest.raises(InvalidSchema):
requests.get('10.122.1.1:3128/')
with pytest.raises(InvalidURL):
requests.get('http://')

Expand Down