urllib.parse.splituser has no suitable replacement #80072

jaraco · 2019-02-03T15:11:00Z

BPO	35891
Nosy	@jaraco

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-02-03.15:11:00.002>
labels = ['3.8', 'type-bug', 'library']
title = 'urllib.parse.splituser has no suitable replacement'
updated_at = <Date 2019-02-03.15:11:09.508>
user = 'https://github.com/jaraco'

bugs.python.org fields:

activity = <Date 2019-02-03.15:11:09.508>
actor = 'jaraco'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-02-03.15:11:00.002>
creator = 'jaraco'
dependencies = []
files = []
hgrepos = []
issue_num = 35891
keywords = []
message_count = 1.0
messages = ['334793']
nosy_count = 1.0
nosy_names = ['jaraco']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue35891'
versions = ['Python 3.8']

jaraco · 2019-02-03T15:10:59Z

The deprecation of splituser (bpo-27485) has the undesirable effect of leaving the programmer without a suitable alternative. The deprecation warning says to use urlparse instead, but urlparse doesn't provide access to the credential or address components of a URL.

Consider for example:

>>> import urllib.parse
>>> url = 'https://user:password@host:port/path'
>>> parsed = urllib.parse.urlparse(url)
>>> urllib.parse.splituser(parsed.netloc)
('user:password', 'host:port')

It's not readily obvious how one might get those two values, the credential and the address, from parsed. Sure, you can get username and password. You can get hostname and port. But if what you want is to remove the credential and keep the address, or extract the credential and pass it unchanged as a single string to something like an _encode_auth handler, that's no longer possible without some careful handling--because of possible None values, re-assembling a username/password into a colon-separated string is more complicated than simply doing a ':'.join.
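To make the gap concrete, here is a minimal sketch of what a caller has to write today to recover the two values. The helper name `split_userinfo` is hypothetical and not part of urllib.parse; it splits the netloc on the last '@', which matches the splituser output in the example above.

```python
from urllib.parse import urlsplit

def split_userinfo(netloc):
    """Return (userinfo, address); userinfo is None when there is no '@'.

    Hypothetical helper, not a urllib.parse API.
    """
    # rpartition splits on the LAST '@', so an '@' inside the password
    # stays in the userinfo part.
    userinfo, sep, address = netloc.rpartition('@')
    return (userinfo if sep else None), address

netloc = urlsplit('https://user:password@host:8080/path').netloc
print(split_userinfo(netloc))       # ('user:password', 'host:8080')
print(split_userinfo('host:8080'))  # (None, 'host:8080')
```

Both values come back exactly as they appeared in the URL, so the credential can be handed on as a single string without rebuilding it from username and password.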

This recommendation and limitation led to issues in production code and ultimately the inline adoption of the deprecated function, summarized here.

I believe if splituser is to be deprecated, the netloc should provide a suitable alternative - namely that a urlparse result should supply address and userinfo. Such functionality would make it easier to transition code that currently relies on splituser for more than to parse out the username and password.

Even better would be for the urlparse result to support _replace operations on these attributes... so that one wouldn't have to construct a netloc just to construct a URL that replaces only some portion of the netloc, so one could do something like:

>>> parsed = urllib.parse.urlparse(url)
>>> without_userinfo = parsed._replace(userinfo=None).geturl()
>>> alt_port = parsed._replace(port=443).geturl()

I realize that, because of the nesting of abstractions (a namedtuple for the main parts), this technique may not extend nicely, so maybe the netloc itself should provide this extensibility, for a usage something like this:

>>> parsed = urllib.parse.urlparse(url)
>>> without_userinfo = parsed._replace(netloc=parsed.netloc._replace(userinfo=None)).geturl()
>>> alt_port = parsed._replace(netloc=parsed.netloc._replace(port=443)).geturl()

It's not as elegant, but likely simpler to implement, with netloc being extended with a _replace method to support replacing segments of itself (and still immutable)... and is dramatically less error-prone than the status quo without splituser.
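For comparison, this is roughly what the two operations above look like with the API that exists today. It is only a sketch: `without_userinfo` and `with_port` are hypothetical helper names, and the netloc is taken apart by hand because the parse result exposes no userinfo or address fields to pass to _replace.

```python
from urllib.parse import urlsplit

def without_userinfo(url):
    """Drop 'user:password@' from a URL, keeping host and port as written."""
    parts = urlsplit(url)
    address = parts.netloc.rpartition('@')[2]
    return parts._replace(netloc=address).geturl()

def with_port(url, port):
    """Replace (or add) the port, keeping any userinfo untouched."""
    parts = urlsplit(url)
    userinfo, at, address = parts.netloc.rpartition('@')
    host, colon, tail = address.rpartition(':')
    # No colon means no port; a ']' after the last colon means that colon
    # belongs to a bracketed IPv6 literal such as [::1], not to a port.
    if not colon or ']' in tail:
        host = address
    return parts._replace(netloc=f"{userinfo}{at}{host}:{port}").geturl()

url = 'https://user:password@host:8080/path'
print(without_userinfo(url))  # https://host:8080/path
print(with_port(url, 443))    # https://user:password@host:443/path
```

The IPv6 check is the kind of detail that is easy to miss when each project writes this logic for itself.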

In any case, I don't think it's suitable to leave it to the programmer to have to muddle around with their own URL parsing logic. urllib.parse should provide some help here.

Avasam · 2024-10-30T20:25:09Z

_NetlocResultMixinStr (or the proper superclass) could have properties userinfo and hostinfo that are essentially just:

class _NetlocResultMixinStr(...):
    ...

    @property
    def userinfo(self):
        return ":".join([info for info in self._userinfo if info is not None])

    @property
    def hostinfo(self):
        return ":".join([info for info in self._hostinfo if info is not None])

    # or even

    @property
    def userinfo(self):
        if self.username:
            return self.username + (f":{self.password}" if self.password else "")
        return None

    @property
    def hostinfo(self):
        if self.hostname:
            return self.hostname + ("" if self.port is None else f":{self.port}")
        return None
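The same idea can be tried without patching urllib.parse. The sketch below uses standalone functions over a parse result; the names `userinfo` and `hostinfo` mirror the proposal and are not existing attributes. It makes two assumptions that differ from the properties above: it tests `is None` rather than truthiness, so an empty password (as in 'user:@host') keeps its colon, and it restores the brackets that hostname strips from an IPv6 literal.

```python
from urllib.parse import urlsplit

def userinfo(parts):
    """Rebuild 'user[:password]' from a parsed URL, or None if absent."""
    if parts.username is None:
        return None
    if parts.password is None:
        return parts.username
    return f"{parts.username}:{parts.password}"

def hostinfo(parts):
    """Rebuild 'host[:port]' from a parsed URL, or None if there is no host."""
    if parts.hostname is None:
        return None
    # hostname drops the brackets of an IPv6 literal; put them back so the
    # port separator stays unambiguous.
    host = f"[{parts.hostname}]" if ":" in parts.hostname else parts.hostname
    return host if parts.port is None else f"{host}:{parts.port}"

parts = urlsplit('https://user:password@host:8080/path')
print(userinfo(parts))  # user:password
print(hostinfo(parts))  # host:8080
```

Note that hostname is lowercased by urllib.parse, so `hostinfo` does not necessarily return the address exactly as it was written in the URL.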

jaraco added the stdlib, type-bug, and 3.8 (EOL) labels Feb 3, 2019

ezio-melotti transferred this issue from another repository Apr 10, 2022

Avasam mentioned this issue Oct 30, 2024

Remove unnecessary code paths for 3.9+ (follow up on skeleton changes) pypa/setuptools#4718

Merged

2 tasks
