Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix slowness on Python 3.11 when updating an existing large environment. #12080

Merged
merged 4 commits into from
Jun 16, 2023
Merged

Fix slowness on Python 3.11 when updating an existing large environment. #12080

merged 4 commits into from
Jun 16, 2023

Conversation

mauritsvanrees
Copy link
Contributor

Fixes issue #12079 by adding a simple cache on AlreadyInstalledCandidate.version.

@@ -376,7 +377,9 @@ def name(self) -> str:

@property
def version(self) -> CandidateVersion:
return self.dist.version
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know why the self.dist.version line is slow? And how is it related to Python 3.11?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reiterate for clarity: it is not Python 3.11 per se, but it is because 3.11 uses importlib.metadata by default, instead of pkg_resources. You see the same on Python 3.10 with _PIP_USE_IMPORTLIB_METADATA=1, and 3.11 is fast with _PIP_USE_IMPORTLIB_METADATA=0.

Getting one version is also not slow, but when doing bin/pip install Plone -c https://dist.plone.org/release/6.0.5/constraints.txt it gets called 440,106 times, so even milliseconds add up. I see that with the code from this PR, self.dist.version only gets called one time per package, or 236 times total. For setuptools alone, self.version gets called 48,891 times.

The difference is in the class of the distribution, and how it gets the version. It is either pip._internal.metadata.pkg_resources.Distribution or pip._internal.metadata.importlib._dists.Distribution.

I get confused following the pkg_resources case, but for importlib you get into the standard lib:

> /Users/maurits/.pyenv/versions/3.11.2/lib/python3.11/importlib/metadata/__init__.py(629)version()
-> @property
(Pdb) l
629  ->	    @property
630  	    def version(self):
631  	        """Return the 'Version' metadata for the distribution package."""
632  	        return self.metadata['Version']

That line 632 actually reads the METADATA file, making it slow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so the main difference is probably pkg_resources caches this value in-memory, while importlib.metadata needs to re-parse the file every time. The fix makes sense then.

It would be best to modify the news fragment to reflect that this is about importlib.metadata (and not 3.11 per se), and when pip repeatedly accesses an already-installed distribution (and not all “large environments” are affected, only if that environment has a lot of overlaps with the to-be-installed requirement set).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (and rebased on main).

news/12079.bugfix.rst Outdated Show resolved Hide resolved
news/12079.bugfix.rst Outdated Show resolved Hide resolved
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
@uranusjr uranusjr merged commit 5190ef7 into pypa:main Jun 16, 2023
@mauritsvanrees mauritsvanrees deleted the maurits-issue-12079-slow-importlib branch June 16, 2023 13:07
@mauritsvanrees
Copy link
Contributor Author

@uranusjr Thanks for the review and the guidance!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants