-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deoptimise pathlib ABCs #113528
Labels
Comments
barneygale
added
type-feature
A feature request or enhancement
topic-pathlib
3.13
bugs and security fixes
labels
Dec 28, 2023
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 28, 2023
Replace use of `_from_parsed_parts()` with `with_segments()` in `PurePathBase.relative_to()`, and move the assignment of `_drv`, `_root`, `_tail_cached` and `_str` slots into `PurePath.relative_to()`. Also shuffle a deprecation warning into `PurePath.relative_to()` to ensure it's raised at the right stack level, and do the same in `is_relative_to()` for code consistency.
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 28, 2023
Replace use of `_from_parsed_parts()` with `with_segments()`, and move assignments to `_drv`, `_root`, _tail_cached` and `_str` slots into `PurePath`.
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 28, 2023
Replace usage of `_from_parsed_parts()` with `with_segments()` in `with_name()`, and take a similar approach in `name` for consistency's sake.
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 28, 2023
…h()` Call straight through to `joinpath()` in `PathBase._make_child_relpath()`. Move optimised/caching code to `pathlib.Path._make_child_relpath()`
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 28, 2023
Run expensive tests for walking and globbing from `test_pathlib` but not `test_pathlib_abc`. The ABCs are not as tightly optimised as the classes in top-level `pathlib`, and so these tests are taking rather a long time on some buildbots. Coverage of the main `pathlib` classes should suffice.
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 28, 2023
Run expensive tests for walking and globbing from `test_pathlib` but not `test_pathlib_abc`. The ABCs are not as tightly optimised as the classes in top-level `pathlib`, and so these tests are taking rather a long time on some buildbots. Coverage of the main `pathlib` classes should suffice.
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 28, 2023
barneygale
added a commit
that referenced
this issue
Dec 28, 2023
Run expensive tests for walking and globbing from `test_pathlib` but not `test_pathlib_abc`. The ABCs are not as tightly optimised as the classes in top-level `pathlib`, and so these tests are taking rather a long time on some buildbots. Coverage of the main `pathlib` classes should suffice.
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 29, 2023
Move `_str` cache slot from `PurePathBase` to `PurePath`. As a result, pathlib ABCs no longer cache their string representations.
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Dec 29, 2023
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Jan 4, 2024
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Jan 5, 2024
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Jan 5, 2024
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Jan 5, 2024
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Jan 5, 2024
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Jan 6, 2024
Split test cases for invalid names into dedicated test methods. This will make it easier to refactor tests for invalid name handling in ABCs later. No change of coverage, just a change of test suite organisation.
barneygale
added a commit
to barneygale/cpython
that referenced
this issue
Jan 6, 2024
The `DummyPurePath` and `DummyPath` test classes are simple subclasses of `PurePathBase` and `Pathbase`. This commit adds `__repr__()` methods to the dummy classes, which makes debugging test failures less painful.
kulikjak
pushed a commit
to kulikjak/cpython
that referenced
this issue
Jan 22, 2024
… (again) (python#113882) Restore full battle-tested implementations of `PurePath.[is_]relative_to()`. These were recently split up in 3375dfe and a15a773. In `PurePathBase`, add entirely new implementations based on `_stack`, which itself calls `pathmod.split()` repeatedly to disassemble a path. These new implementations preserve features like trailing slashes where possible, while still observing that a `..` segment cannot be added to traverse an empty or `.` segment in *walk_up* mode. They do not rely on `parents` nor `__eq__()`, nor do they spin up temporary path objects. Unfortunately calling `pathmod.relpath()` isn't an option, as it calls `abspath()` and in turn `os.getcwd()`, which is impure.
kulikjak
pushed a commit
to kulikjak/cpython
that referenced
this issue
Jan 22, 2024
Apply pathlib's normalization and performance tuning in `pathlib.PurePath`, but not `pathlib._abc.PurePathBase`. With this change, the pathlib ABCs do not normalize away alternate path separators, empty segments, or dot segments. A single string given to the initialiser will round-trip by default, i.e. `str(PurePathBase(my_string)) == my_string`. Implementors can set their own path domain-specific normalization scheme by overriding `__str__()` Eliminating path normalization makes maintaining and caching the path's parts and string representation both optional and not very useful, so this commit moves the `_drv`, `_root`, `_tail_cached` and `_str` slots from `PurePathBase` to `PurePath`. Only `_raw_paths` and `_resolving` slots remain in `PurePathBase`. This frees the ABCs from the burden of some of pathlib's hardest-to-understand code.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…on#113534) Run expensive tests for walking and globbing from `test_pathlib` but not `test_pathlib_abc`. The ABCs are not as tightly optimised as the classes in top-level `pathlib`, and so these tests are taking rather a long time on some buildbots. Coverage of the main `pathlib` classes should suffice.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…ython#113777) The `DummyPurePath` and `DummyPath` test classes are simple subclasses of `PurePathBase` and `PathBase`. This commit adds `__repr__()` methods to the dummy classes, which makes debugging test failures less painful.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…n#113776) Split test cases for invalid names into dedicated test methods. This will make it easier to refactor tests for invalid name handling in ABCs later. No change of coverage, just a change of test suite organisation.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…link loop handling (python#113763) Slightly improve `pathlib.Path.glob()` tests for symlink loop handling When filtering results, ignore paths with more than one `linkD/` segment, rather than all paths below the first `linkD/` segment. This allows us to test that other paths under `linkD/` are correctly returned.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…113531) Replace usage of `_from_parsed_parts()` with `with_segments()` in `with_name()`, and take a similar approach in `name` for consistency's sake.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…n#113530) Replace use of `_from_parsed_parts()` with `with_segments()`, and move assignments to `_drv`, `_root`, _tail_cached` and `_str` slots into `PurePath`.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…python#113529) Replace use of `_from_parsed_parts()` with `with_segments()` in `PurePathBase.relative_to()`, and move the assignment of `_drv`, `_root` and `_tail_cached` slots into `PurePath.relative_to()`.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
`PurePathBase` does not define `__eq__()`, and so we have no business checking path equality in `test_eq_common` and `test_equivalences`. The tests only pass at the moment because we define the test class's `__eq__()` for use elsewhere. Also move `test_parse_path_common` into the main pathlib test suite. It exercises a private `_parse_path()` method that will be moved to `PurePath` soon. Lastly move a couple more tests concerned with optimisations and path normalisation.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
- Add `__slots__` to dummy path classes. - Return namedtuple rather than `os.stat_result` from `DummyPath.stat()`. - Reduce maximum symlink count in `DummyPathWithSymlinks.resolve()`.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…h()` (python#113532) Call straight through to `joinpath()` in `PathBase._make_child_relpath()`. Move optimised/caching code to `pathlib.Path._make_child_relpath()`
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…#113782) Replace use of `_from_parsed_parts()` with `with_segments()` in `resolve()`. No effect on `Path.resolve()`, which uses `os.path.realpath()`.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
…#113883) Implement `parts` using `_stack`, which itself calls `pathmod.split()` repeatedly. This avoids use of `_tail`, which will be moved to `PurePath` shortly.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
… (again) (python#113882) Restore full battle-tested implementations of `PurePath.[is_]relative_to()`. These were recently split up in 3375dfe and a15a773. In `PurePathBase`, add entirely new implementations based on `_stack`, which itself calls `pathmod.split()` repeatedly to disassemble a path. These new implementations preserve features like trailing slashes where possible, while still observing that a `..` segment cannot be added to traverse an empty or `.` segment in *walk_up* mode. They do not rely on `parents` nor `__eq__()`, nor do they spin up temporary path objects. Unfortunately calling `pathmod.relpath()` isn't an option, as it calls `abspath()` and in turn `os.getcwd()`, which is impure.
aisk
pushed a commit
to aisk/cpython
that referenced
this issue
Feb 11, 2024
Apply pathlib's normalization and performance tuning in `pathlib.PurePath`, but not `pathlib._abc.PurePathBase`. With this change, the pathlib ABCs do not normalize away alternate path separators, empty segments, or dot segments. A single string given to the initialiser will round-trip by default, i.e. `str(PurePathBase(my_string)) == my_string`. Implementors can set their own path domain-specific normalization scheme by overriding `__str__()` Eliminating path normalization makes maintaining and caching the path's parts and string representation both optional and not very useful, so this commit moves the `_drv`, `_root`, `_tail_cached` and `_str` slots from `PurePathBase` to `PurePath`. Only `_raw_paths` and `_resolving` slots remain in `PurePathBase`. This frees the ABCs from the burden of some of pathlib's hardest-to-understand code.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…on#113534) Run expensive tests for walking and globbing from `test_pathlib` but not `test_pathlib_abc`. The ABCs are not as tightly optimised as the classes in top-level `pathlib`, and so these tests are taking rather a long time on some buildbots. Coverage of the main `pathlib` classes should suffice.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…ython#113777) The `DummyPurePath` and `DummyPath` test classes are simple subclasses of `PurePathBase` and `PathBase`. This commit adds `__repr__()` methods to the dummy classes, which makes debugging test failures less painful.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…n#113776) Split test cases for invalid names into dedicated test methods. This will make it easier to refactor tests for invalid name handling in ABCs later. No change of coverage, just a change of test suite organisation.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…link loop handling (python#113763) Slightly improve `pathlib.Path.glob()` tests for symlink loop handling When filtering results, ignore paths with more than one `linkD/` segment, rather than all paths below the first `linkD/` segment. This allows us to test that other paths under `linkD/` are correctly returned.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…113531) Replace usage of `_from_parsed_parts()` with `with_segments()` in `with_name()`, and take a similar approach in `name` for consistency's sake.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…n#113530) Replace use of `_from_parsed_parts()` with `with_segments()`, and move assignments to `_drv`, `_root`, _tail_cached` and `_str` slots into `PurePath`.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…python#113529) Replace use of `_from_parsed_parts()` with `with_segments()` in `PurePathBase.relative_to()`, and move the assignment of `_drv`, `_root` and `_tail_cached` slots into `PurePath.relative_to()`.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
`PurePathBase` does not define `__eq__()`, and so we have no business checking path equality in `test_eq_common` and `test_equivalences`. The tests only pass at the moment because we define the test class's `__eq__()` for use elsewhere. Also move `test_parse_path_common` into the main pathlib test suite. It exercises a private `_parse_path()` method that will be moved to `PurePath` soon. Lastly move a couple more tests concerned with optimisations and path normalisation.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
- Add `__slots__` to dummy path classes. - Return namedtuple rather than `os.stat_result` from `DummyPath.stat()`. - Reduce maximum symlink count in `DummyPathWithSymlinks.resolve()`.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…h()` (python#113532) Call straight through to `joinpath()` in `PathBase._make_child_relpath()`. Move optimised/caching code to `pathlib.Path._make_child_relpath()`
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…#113782) Replace use of `_from_parsed_parts()` with `with_segments()` in `resolve()`. No effect on `Path.resolve()`, which uses `os.path.realpath()`.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
…#113883) Implement `parts` using `_stack`, which itself calls `pathmod.split()` repeatedly. This avoids use of `_tail`, which will be moved to `PurePath` shortly.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
… (again) (python#113882) Restore full battle-tested implementations of `PurePath.[is_]relative_to()`. These were recently split up in 3375dfe and a15a773. In `PurePathBase`, add entirely new implementations based on `_stack`, which itself calls `pathmod.split()` repeatedly to disassemble a path. These new implementations preserve features like trailing slashes where possible, while still observing that a `..` segment cannot be added to traverse an empty or `.` segment in *walk_up* mode. They do not rely on `parents` nor `__eq__()`, nor do they spin up temporary path objects. Unfortunately calling `pathmod.relpath()` isn't an option, as it calls `abspath()` and in turn `os.getcwd()`, which is impure.
Glyphack
pushed a commit
to Glyphack/cpython
that referenced
this issue
Sep 2, 2024
Apply pathlib's normalization and performance tuning in `pathlib.PurePath`, but not `pathlib._abc.PurePathBase`. With this change, the pathlib ABCs do not normalize away alternate path separators, empty segments, or dot segments. A single string given to the initialiser will round-trip by default, i.e. `str(PurePathBase(my_string)) == my_string`. Implementors can set their own path domain-specific normalization scheme by overriding `__str__()` Eliminating path normalization makes maintaining and caching the path's parts and string representation both optional and not very useful, so this commit moves the `_drv`, `_root`, `_tail_cached` and `_str` slots from `PurePathBase` to `PurePath`. Only `_raw_paths` and `_resolving` slots remain in `PurePathBase`. This frees the ABCs from the burden of some of pathlib's hardest-to-understand code.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Feature or enhancement
pathlib._abc.PurePathBase
andPathBase
have a few slots and methods that facilitate fast path object generation:_drv
,_root
,_tail_cached
: roughly, the result ofos.path.splitroot()
on the path_str
: the normalized path string_make_child_relpath()
: used when walking directories_from_parsed_parts()
: used inparent
,parents
,with_name()
,relative_to()
These carefully-tuned features are useful in
pathlib
, where speed is important, but much less so inpathlib._abc
, where the clarity of the interfaces and interactions should win out.It should be possible to move them to
PurePath
, leavingPurePathBase
with only_raw_paths
and_resolving
slots.This should not affect the behaviour or performance of the public classes in
pathlib
proper.Linked PRs
pathlib.Path.glob()
tests for symlink loop handling #113763pathlib._abc.PurePathBase.name
#113531pathlib._abc.PurePathBase.parent
#113530pathlib._abc.PurePathBase.relative_to()
#113529pathlib._abc.PurePathBase.is_absolute()
#113820pathlib._abc.PathBase._make_child_relpath()
#113532pathlib._abc.PathBase.resolve()
#113782pathlib._abc.PurePathBase
#113559pathlib._abc.PurePathBase.relative_to()
(again) #113882pathlib._abc.PurePathBase.parts
#113883The text was updated successfully, but these errors were encountered: