Reduce time complexity of URL subtraction #1388

Closed
bdraco wants to merge 37 commits into master from url_lazy_no_pathlib

Conversation

bdraco
Member

@bdraco bdraco commented Oct 25, 2024

Time complexity was roughly O(base_path_segments * target_path_segments)

Time complexity is now roughly O(base_path_segments + target_path_segments)
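
For readers skimming the thread, here is a rough sketch of what a single-pass subtraction over path segments could look like. It is not the code from this PR: `relative_path` is a made-up name, it works on plain path strings rather than `URL` objects, and it assumes the `(target, base, expected)` convention of the test tuples in this thread, counting every `/`-separated segment (including empty ones).

```python
def relative_path(target: str, base: str) -> str:
    """Hypothetical helper, not yarl's implementation."""
    target_parts = target.split("/")
    base_parts = base.split("/")

    # One pass over the shared prefix: stop at the first differing segment.
    common = 0
    for t, b in zip(target_parts, base_parts):
        if t != b:
            break
        common += 1

    # One ".." per base segment left after the shared prefix, followed by
    # whatever remains of the target. Identical paths yield ".".
    ups = [".."] * (len(base_parts) - common)
    return "/".join(ups + target_parts[common:]) or "."


print(relative_path("/path/to", "/spam"))   # ../path/to
print(relative_path("/path/to", "/spam/"))  # ../../path/to
print(relative_path("/a/b/c", "/a/b"))      # c
```

The loop touches each segment at most once, so the work grows with the sum of the two segment counts rather than their product. How empty segments from repeated slashes should be treated is deliberately left open here.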

codspeed-hq bot commented Oct 25, 2024

CodSpeed Performance Report

Merging #1388 will improve performance by ×5.7

Comparing url_lazy_no_pathlib (3885557) with master (6a2a90a)

Summary

⚡ 2 improvements
✅ 83 untouched benchmarks

Benchmarks breakdown

| Benchmark | master | url_lazy_no_pathlib | Change |
|---|---|---|---|
| test_url_subtract | 13.7 ms | 2.4 ms | ×5.7 |
| test_url_subtract_long_urls | 157.6 ms | 40.6 ms | ×3.9 |

codecov bot commented Oct 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.10%. Comparing base (6a2a90a) to head (3885557).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1388      +/-   ##
==========================================
+ Coverage   96.05%   96.10%   +0.04%     
==========================================
  Files          31       31              
  Lines        5784     5822      +38     
  Branches      361      372      +11     
==========================================
+ Hits         5556     5595      +39     
+ Misses        202      201       -1     
  Partials       26       26              
Flag Coverage Δ
CI-GHA 96.10% <100.00%> (+0.04%) ⬆️
MyPy 46.58% <96.61%> (+0.36%) ⬆️
OS-Linux 99.45% <90.74%> (-0.11%) ⬇️
OS-Windows 99.51% <90.74%> (-0.11%) ⬇️
OS-macOS 99.20% <90.74%> (-0.10%) ⬇️
Py-3.10.11 99.18% <90.74%> (-0.10%) ⬇️
Py-3.10.15 99.41% <90.74%> (-0.11%) ⬇️
Py-3.11.10 99.41% <90.74%> (-0.11%) ⬇️
Py-3.11.9 99.18% <90.74%> (-0.10%) ⬇️
Py-3.12.7 99.41% <90.74%> (-0.11%) ⬇️
Py-3.13.0 99.41% <90.74%> (-0.11%) ⬇️
Py-3.9.13 99.09% <87.03%> (-0.15%) ⬇️
Py-3.9.20 99.32% <87.03%> (-0.15%) ⬇️
Py-pypy7.3.16 99.37% <87.03%> (-0.16%) ⬇️
Py-pypy7.3.17 99.44% <90.74%> (-0.11%) ⬇️
VM-macos-latest 99.20% <90.74%> (-0.10%) ⬇️
VM-ubuntu-latest 99.45% <90.74%> (-0.11%) ⬇️
VM-windows-latest 99.51% <90.74%> (-0.11%) ⬇️
pytest 99.45% <90.74%> (-0.11%) ⬇️


bdraco force-pushed the url_lazy_no_pathlib branch 2 times, most recently from afea6e5 to 9cae1ef, on October 25, 2024 07:59
@bdraco
Member Author

bdraco commented Oct 25, 2024

@commonism Would you please add some xfail tests for this case?

@commonism
Contributor

c.f. #1389 (comment)

@bdraco
Member Author

bdraco commented Oct 25, 2024

#1390

@bdraco
Member Author

bdraco commented Oct 25, 2024

@commonism Would you please check if it looks right now?

@@ -110,7 +113,7 @@ def test_sub(target: str, base: str, expected: str):
     (
         "http://example.com////path/////to",
         "http://example.com/////spam",
-        "..//path/////to",
+        "../path/////to",

Contributor

Using ("/path/to", "/spam/", "../path/to") as reference, this can't be right?

Member Author

TBH, at this point I'm really not sure how it's supposed to work, and I'm tempted to revert the whole feature, as it seems to be turning out to be more trouble than it's worth.

Member Author

Maybe it is right .. I'm not really sure

http://example.com////path/////to
http://example.com/////spam

Member Author

or all the tests are wrong....

Contributor

I'd expect it to follow along with the first test case:

…
("/path/to", "/spam/", "../path/to"),
("//path/to", "/spam/", "..//path/to"),
(
    "http://example.com//path/to",
    "http://example.com/spam",
    "..//path/to",
),
(
    "http://example.com////path/////to",
    "http://example.com/////spam",
    "..//path/////to",
)

Contributor

The trailing / in "spam/" has to be removed.

("/path/to", "/spam", "../path/to"),
("/path/to", "/spam/", "../../path/to"),

And … others to consider

("/path/to", "//spam", "../../path/to"),
(
    "http://example.com/path/to",
    "http://example.com//spam",
    "../../path/to",
),
(
    "http://example.com/////path/////to",
    "http://example.com////spam",
    "../../path/////to",
)
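
Part of why these cases are hard to agree on is that every extra `/` produces an empty segment when a path is split. A quick illustration with plain `str.split` (the helper name `segments` is made up for this note, and nothing here is taken from yarl):

```python
def segments(path: str) -> list[str]:
    """Hypothetical helper: split a path on "/" and keep empty segments."""
    return path.split("/")


print(segments("/spam"))   # ['', 'spam']
print(segments("/spam/"))  # ['', 'spam', '']
print(segments("//spam"))  # ['', '', 'spam']
```

Both `"/spam/"` and `"//spam"` carry one more segment than `"/spam"`, which lines up with the extra `..` in the expectations suggested above. Whether an empty segment should count as a level is the open question in this thread.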

yarl/_path.py (outdated)
path = path[:-1]
if "." in path:
    # Strip '.' segments
    parts = [x for x in path.split("/") if x != "."]

Contributor

Stripping "." segments is wrong for "/./", which is equal to "//".

Contributor

Maybe right here … but I remember there was a special case on this.
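
To make the concern concrete, here is what the quoted list comprehension does to a path containing `/./`. This only wraps the one line shown in the diff; the function name `strip_dot_segments` and the final `"/".join` are assumptions added for illustration, not the PR's surrounding code.

```python
def strip_dot_segments(path: str) -> str:
    """Hypothetical wrapper around the comprehension quoted from the diff."""
    # Same filter as in the diff: drop every segment that is exactly ".".
    parts = [x for x in path.split("/") if x != "."]
    return "/".join(parts)


print(strip_dot_segments("/a/./b"))  # /a/b
print(strip_dot_segments("/./"))     # /
```

Dropping the `"."` segment turns `"/./"` into `"/"`, not `"//"`, so if the two forms are meant to be equivalent, as the comment above suggests, the filter changes the number of segments.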

@bdraco
Member Author

bdraco commented Oct 25, 2024

I think I'm throwing in the towel on this one and going for a revert. We need to start over.

@bdraco
Member Author

bdraco commented Oct 25, 2024

@commonism Thanks for catching the issues before we shipped this.

I opened #1391 to revert so we can start over with a new approach.
