Support using the subtraction operator to get the relative path between URLs #1340

Merged 27 commits on Oct 22, 2024
Changes from 26 commits

Commits
c27d7d4 Support using the subtraction operator to get the relative path betwe… (oleksbabieiev, Oct 20, 2024)
fe047cc Add CHANGES/1340.feature.rst (oleksbabieiev, Oct 20, 2024)
81ed86d Merge branch 'master' into relpath (bdraco, Oct 21, 2024)
f18ed6b [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 21, 2024)
97820e5 Merge branch 'master' into relpath (bdraco, Oct 21, 2024)
d13bf1d [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 21, 2024)
5037e6b fix conflicting imports (bdraco, Oct 21, 2024)
6478fde Merge branch 'master' into relpath (bdraco, Oct 21, 2024)
3665a02 Merge branch 'master' into relpath (oleksbabieiev, Oct 21, 2024)
3add068 Sign CHANGES/1340.feature.rst (oleksbabieiev, Oct 21, 2024)
38c7f3c Move `relative_path()` to `_path.py` (oleksbabieiev, Oct 21, 2024)
4dd4a69 Add more parameters to test `URL.__sub__()` (oleksbabieiev, Oct 21, 2024)
f5f242b Use `SEPARATOR` constant for `/` (oleksbabieiev, Oct 21, 2024)
2006066 Disallow `relative_path()` between abs and rel paths (oleksbabieiev, Oct 21, 2024)
c717845 Remove the `SEPARATOR` constant (oleksbabieiev, Oct 21, 2024)
827171f Rename `relative_path()` (oleksbabieiev, Oct 21, 2024)
b436b14 Refactor `test_url.py` (oleksbabieiev, Oct 21, 2024)
1b620df Refactor `_path.py` (oleksbabieiev, Oct 21, 2024)
bfb2c4d Add a small demo to `1340.feature.rst` (oleksbabieiev, Oct 21, 2024)
2b87478 Update docs for `URL.__sub__()` (oleksbabieiev, Oct 21, 2024)
05c2147 Add a PEP 257-compliant docstring for `URL.__sub__()` (oleksbabieiev, Oct 21, 2024)
42c75aa Update CHANGES/1340.feature.rst (oleksbabieiev, Oct 21, 2024)
7da1599 Avoid the `os` namespace (oleksbabieiev, Oct 22, 2024)
e252ff7 Introduce the `offset` variable (oleksbabieiev, Oct 22, 2024)
1bebdd0 Merge branch 'master' into relpath (oleksbabieiev, Oct 22, 2024)
39060d1 Replace `PurePath` with `PurePosixPath` (oleksbabieiev, Oct 22, 2024)
d683e16 Refactor `_path.py` (oleksbabieiev, Oct 22, 2024)
22 changes: 22 additions & 0 deletions CHANGES/1340.feature.rst
@@ -0,0 +1,22 @@
Added support for using the :meth:`subtraction operator <yarl.URL.__sub__>`
to get the relative path between URLs.

Note that both URLs must have the same scheme, user, password, host and port:

.. code-block:: pycon

   >>> target = URL("http://example.com/path/index.html")
   >>> base = URL("http://example.com/")
   >>> target - base
   URL('path/index.html')

URLs can also be relative:

.. code-block:: pycon

   >>> target = URL("/")
   >>> base = URL("/path/index.html")
   >>> target - base
   URL('..')

-- by :user:`oleksbabieiev`.
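
The precondition in the note above can be approximated with the standard library alone, which may help readers who do not have a yarl build with this change installed. This is only a sketch: `same_origin` is a hypothetical helper, not part of yarl, and it compares the raw `netloc` string without any of yarl's normalization.

```python
from urllib.parse import urlsplit


def same_origin(target: str, base: str) -> bool:
    """Return True if both URLs share scheme and netloc (hypothetical helper)."""
    t, b = urlsplit(target), urlsplit(base)
    # netloc carries user, password, host and port in a single string.
    return (t.scheme, t.netloc) == (b.scheme, b.netloc)


print(same_origin("http://example.com/path/index.html", "http://example.com/"))  # True
print(same_origin("http://example.com/", "https://example.com/"))  # False
print(same_origin("/", "/path/index.html"))  # True: both parts are empty
```

Two relative URLs pass the check because their scheme and netloc are both empty, which matches the second changelog example.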
15 changes: 15 additions & 0 deletions docs/api.rst
@@ -974,6 +974,21 @@ The path is encoded if needed.
      >>> base.join(URL('//python.org/page.html'))
      URL('http://python.org/page.html')

The subtraction (``-``) operator creates a new URL with
a relative *path* to the target URL from the given base URL.
*scheme*, *user*, *password*, *host*, *port*, *query* and *fragment* are removed.

.. method:: URL.__sub__(url)

   Returns a new URL with a relative *path* between two other URL objects.

   .. doctest::

      >>> target = URL('http://example.com/path/index.html')
      >>> base = URL('http://example.com/')
      >>> target - base
      URL('path/index.html')

Human readable representation
-----------------------------

47 changes: 47 additions & 0 deletions tests/test_url.py
@@ -61,6 +61,53 @@ def test_str():
    assert str(url) == "http://example.com:8888/path/to?a=1&b=2"


@pytest.mark.parametrize(
    ("target", "base", "expected"),
    [
        ("http://example.com/path/to", "http://example.com/", "path/to"),
        ("http://example.com/path/to", "http://example.com/spam", "path/to"),
        ("http://example.com/path/to", "http://example.com/spam/", "../path/to"),
        ("http://example.com/path", "http://example.com/path/to/", ".."),
        ("http://example.com/", "http://example.com/", "."),
        ("http://example.com", "http://example.com", "."),
        ("http://example.com/", "http://example.com", "."),
        ("http://example.com", "http://example.com/", "."),
        ("//example.com", "//example.com", "."),
        ("/path/to", "/spam/", "../path/to"),
        ("path/to", "spam/", "../path/to"),
        ("path/to", "spam", "path/to"),
        ("..", ".", ".."),
        (".", "..", "."),
    ],
)
def test_sub(target: str, base: str, expected: str):
    assert URL(target) - URL(base) == URL(expected)


def test_sub_with_different_schemes():
    expected_error_msg = "Both URLs should have the same scheme"
    with pytest.raises(ValueError, match=expected_error_msg):
        URL("http://example.com/") - URL("https://example.com/")


def test_sub_with_different_netlocs():
    expected_error_msg = "Both URLs should have the same netloc"
    with pytest.raises(ValueError, match=expected_error_msg):
        URL("https://spam.com/") - URL("https://ham.com/")


def test_sub_with_different_anchors():
    expected_error_msg = "'path/to' and '/path' have different anchors"
    with pytest.raises(ValueError, match=expected_error_msg):
        URL("path/to") - URL("/path/from")


def test_sub_with_two_dots_in_base():
    expected_error_msg = "'..' segment in '/path/..' cannot be walked"
    with pytest.raises(ValueError, match=expected_error_msg):
        URL("path/to") - URL("/path/../from")


def test_repr():
    url = URL("http://example.com")
    assert "URL('http://example.com')" == repr(url)
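
Some expected values in the parametrized `test_sub` cases can be cross-checked without yarl: resolving a relative result against its base with the standard library's `urljoin` should give the target back. The sketch below does that for three triples copied from the test table. It is an independent sanity check, not part of the PR. Cases such as `..` are left out because `urljoin` resolves them to a path with a trailing slash, so they would not round-trip exactly.

```python
from urllib.parse import urljoin

# (target, base, relative) triples copied from the parametrized test cases.
CASES = [
    ("http://example.com/path/to", "http://example.com/", "path/to"),
    ("http://example.com/path/to", "http://example.com/spam", "path/to"),
    ("http://example.com/path/to", "http://example.com/spam/", "../path/to"),
]

for target, base, relative in CASES:
    # Resolving the relative reference against the base should reproduce the target.
    resolved = urljoin(base, relative)
    assert resolved == target, (target, base, relative)
    print(f"{base} + {relative} -> {resolved}")
```

The second and third triples differ only in the trailing slash of the base, which is why one result is `path/to` and the other `../path/to`.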
30 changes: 30 additions & 0 deletions yarl/_path.py
@@ -2,6 +2,7 @@

from collections.abc import Sequence
from contextlib import suppress
from pathlib import PurePosixPath


def normalize_path_segments(segments: Sequence[str]) -> list[str]:
@@ -39,3 +40,32 @@ def normalize_path(path: str) -> str:

    segments = path.split("/")
    return prefix + "/".join(normalize_path_segments(segments))


def calculate_relative_path(target: str, base: str) -> str:
    """Return the relative path between two other paths.

    If the operation is not possible, raise ValueError.
    """

    target = target or "/"
    base = base or "/"

    target_path = PurePosixPath(target)
    base_path = PurePosixPath(base)

    if not base.endswith("/"):
        base_path = base_path.parent

    for step, path in enumerate([base_path] + list(base_path.parents)):
        if target_path.is_relative_to(path):
            break
        elif path.name == "..":
            raise ValueError(f"'..' segment in {str(base_path)!r} cannot be walked")
    else:
        raise ValueError(
            f"{str(target_path)!r} and {str(base_path)!r} have different anchors"
        )
    offset = len(path.parts)
    parts = [".."] * step + list(target_path.parts)[offset:]
    return str(PurePosixPath(*parts))
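
The trailing-slash check near the top of `calculate_relative_path` decides which directory the upward walk starts from. A minimal sketch of just that step, using a hypothetical `base_directory` helper that is not in the PR:

```python
from pathlib import PurePosixPath


def base_directory(base: str) -> PurePosixPath:
    """Return the directory a relative path is computed from (sketch)."""
    base = base or "/"
    path = PurePosixPath(base)
    # Without a trailing slash, the last segment names a resource, not a directory.
    return path if base.endswith("/") else path.parent


print(base_directory("/spam"))   # /
print(base_directory("/spam/"))  # /spam
print(base_directory("spam"))    # .
```

This is why `/path/to` relative to `/spam` is `path/to`, while relative to `/spam/` it is `../path/to`.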
30 changes: 29 additions & 1 deletion yarl/_url.py
@@ -12,7 +12,7 @@
from propcache.api import under_cached_property as cached_property

from ._parse import USES_AUTHORITY, make_netloc, split_netloc, split_url, unsplit_result
-from ._path import normalize_path, normalize_path_segments
+from ._path import calculate_relative_path, normalize_path, normalize_path_segments
from ._query import (
    Query,
    QueryVariable,
@@ -476,6 +476,34 @@ def __truediv__(self, name: str) -> "URL":
            return NotImplemented
        return self._make_child((str(name),))

    def __sub__(self, other: object) -> "URL":
        """Return a new URL with a relative path between two other URL objects.

        Note that both URLs must have the same scheme and netloc.
        The new relative URL has only path:
        scheme, user, password, host, port, query and fragment are removed.

        Example:
            >>> target = URL("http://example.com/path/index.html")
            >>> base = URL("http://example.com/")
            >>> target - base
            URL('path/index.html')
        """

        if type(other) is not URL:
            return NotImplemented

        target = self._val
        base = other._val

        if target.scheme != base.scheme:
            raise ValueError("Both URLs should have the same scheme")
        if target.netloc != base.netloc:
            raise ValueError("Both URLs should have the same netloc")

        path = calculate_relative_path(target.path, base.path)
        return self._from_tup(("", "", path, "", ""))

    def __mod__(self, query: Query) -> "URL":
        return self.update_query(query)
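
`__sub__` returns `NotImplemented` for any operand whose type is not exactly `URL`, so Python itself reports the unsupported operand. The toy class below is a sketch of that pattern in isolation. `Offset` is a hypothetical type made up for illustration and is unrelated to yarl.

```python
class Offset:
    """Toy value type (hypothetical) that only supports Offset - Offset."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __sub__(self, other: object) -> "Offset":
        # Exact type check, as in the diff: subclasses and foreign types are rejected.
        if type(other) is not Offset:
            return NotImplemented
        return Offset(self.value - other.value)


print((Offset(5) - Offset(3)).value)  # 2

try:
    Offset(5) - "3"
except TypeError as exc:
    # Python raises TypeError when no operand handles the operation.
    print(type(exc).__name__)  # TypeError
```

Returning `NotImplemented` instead of raising leaves room for the right-hand operand to supply its own `__rsub__`.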