Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] DistutilsMetaFinder breaks distutils submodule imports when old setuptools is higher on path #3073

Open
jjhelmus opened this issue Feb 2, 2022 · 5 comments
Assignees
Labels
bug Needs Implementation Issues that are ready to be implemented.

Comments

@jjhelmus
Copy link
Contributor

jjhelmus commented Feb 2, 2022

setuptools version

setuptools====60.7.0

Python version

Python 3.7

OS

macOS

Additional environment information

Similar to #2957, this occurs when an old setuptools is on path but a newer setuptools (with a distutils shimming .pth file) is installed.

Description

This is a followup to #2957 where the .pth shim broke distutils imports when an older setuptools is higher on path. The fix in #2962 allows distutils to be imported as well as submodules, but those submodules are not usable.

I'm running into this issue when running a PEX which contains an older version of setuptools but the Python interpreter has a newer version of setuptools. In the bootstrap phase pex removes the new setuptools from sys.path and adds the older setuptools. DistutilsMetaFinder is not removed from sys.meta_path during this phase and causes distutils submodules to fail.

Expected behavior

The presence of a new-but-not-loadable setuptools on sys.path doesn't break distutils submodules.

How to Reproduce

As in #2957 a reproducer is:

python3 -m venv /tmp/bang && source /tmp/bang/bin/activate  # give us a clean venv to work in
python -m pip install 'setuptools>60'  # install new setuptools to venv site-packages
python -m pip install 'setuptools<45' -t .  # install old setuptools to cwd- higher than site-packages on path for most systems
python -c "import distutils.util; print(distutils.util)"

Output

Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'distutils' has no attribute 'util'
@jjhelmus jjhelmus added bug Needs Triage Issues that need to be evaluated for severity and status. labels Feb 2, 2022
@jaraco jaraco self-assigned this Feb 11, 2022
@jaraco
Copy link
Member

jaraco commented Feb 11, 2022

Thanks for your patience. I'm getting ramped up on this issue today and I'd like to come up with a robust fix.

Thanks for your investigation and linking to prior issues. I'll continue to explore this issue here.

@jaraco
Copy link
Member

jaraco commented Feb 11, 2022

I'm starting to get a grasp of the underlying behavior here. At first, I was surprised that the technique as currently implemented wouldn't have the intended behavior. I could replicate the reported failure, but when I looked at the importer code, I would have expected the exceptional condition to take place and thus allow a fallback to the traditional behavior.

But then I realized that import setuptools._distutils causes an import of setuptools and upon inspection I can see that earlier versions of setuptools will import distutils as part of importing setuptools.

I don't yet understand Python's import internals enough to understand why that results in a distutils module without its submodule attributes. I might have expected the importer to crash, but I guess the importer has to be robust to lots of circular dependencies.

My only instinct here is that the act of importlib.import_module('setuptools') needs to be somehow less aggressive, perhaps by limiting to only the setuptools relevant to the pth hook.

@jaraco jaraco added Needs Implementation Issues that are ready to be implemented. and removed Needs Triage Issues that need to be evaluated for severity and status. labels Feb 11, 2022
@jaraco
Copy link
Member

jaraco commented Feb 11, 2022

I put together this patch:

setuptools bugfix/3073-old-setuptools $ git diff main
diff --git a/_distutils_hack/__init__.py b/_distutils_hack/__init__.py
index 1f8daf498..14d9bd996 100644
--- a/_distutils_hack/__init__.py
+++ b/_distutils_hack/__init__.py
@@ -96,6 +96,13 @@ class DistutilsMetaFinder:
         import importlib.abc
         import importlib.util
 
+        spec = importlib.util.find_spec('setuptools')
+        setuptools_site = os.path.dirname(os.path.dirname(spec.origin))
+        hook_site = os.path.dirname(os.path.dirname(__file__))
+        if setuptools_site != hook_site:
+            print(setuptools_site, hook_site)
+            return
+
         try:
             mod = importlib.import_module('setuptools._distutils')
         except Exception:

It does have the desired effect of working around the issue, but it has the undesirable effect that it breaks pip upgrades of setuptools from source (because the directory with the .pth file is different from the source directory, so the distutils hack doesn't load, and setuptools assertion to ensure_local_distutils fails).

In other words, some workflows are actually depending on the behavior that distutils is loaded from a setuptools where the hack is not.

I considered perhaps making this exception only when imported from .pth but not when installed as part of ensure_local_distutils, but I'm not confident that can be made to work.

Other options I'm considering:

  • allow alternate copies of setuptools to supply distutils as long as they actually have it (only bypass after detecting very old setuptools).
  • not supporting mixed environments with older setuptools.

I don't have any good solutions yet. The problem landscape is already very complicated (also by things like pip exclusions), so I'm not confident that the problem space is well understood or even has a viable solution.

@nitzmahone
Copy link
Contributor

(original reporter of this issue here): Yeah, we're also still hitting different permutations of this with current setuptools where the "wrong distutils" assert is firing in mixed situations where it shouldn't (usually when a system-site-packages venv with ancient setuptools is involved)- we're doing the envvar hack disable to limp along, but we really need to figure out a broader solution before this gets bigger (and it likely will).

Still willing to pitch in more on that if needed...

@jjhelmus
Copy link
Contributor Author

I explored the environment where this is issue is occurring and I think I have an understanding on the problem.

The root of the issue is that distutils is loaded twice. The second time it is loaded,distutils.util is present in sys.modules and therefore util is not added as an attribute of the distutils module.

A breakdown of the process is:

  • import distutils.util starts the import system. distutils will be imported first followed by distutils.util.
  • distutils is not in sys.modules so the import protocol begins searching using sys.meta_path.
  • DistutilsMetaFinder is the first (or an early) finder in sys.meta_path. The find_spec method is called which calls spec_for_distutils
  • mod = importlib.import_module('setuptools._distutils') starts the import system, setuptools will be imported first followed by setuptools._distutils. The import protocol is still determining if the DistutilsMetaFinder finder is able to handle distutils and distutils.util will be imported once this completes.
  • setuptools is found and then loaded via one of the standard finders and loaders. This is the "old" version of setuptools which does not include _distutils and appears earlier in sys.path than the "new" version.
  • During loading, setuptools imports distutils and distutils.utils. This is the first time distutils is loaded. When distutils.util is loaded, it is added as an attribute of the distutils module. All of these modules (setuptools, distutils, distutils.util) are stored as entries in sys.modules.
  • The import system continues and tries to find setuptools._distutils. This fails because the "old" setuptools does not include this module.
  • The except block inDistutilsMetaFinder.spec_for_distutils is triggered and returns None indicating that the finder is not able to handle distutils. At this point, sys.modules has entries for setuptools, distutils, and distutils.util from when setuptools was loaded.
  • As the DistutilsMetaFinder finder was not able to handle distutils the remaining finders in sys.meta_path are queried.
  • One of the finders does handled distutils and the module is loaded. This is the second time distutils is loaded. The distutils entry in sys.modules is replaced with the distutils module loaded in this step. This module does not have an util attribute as distutils._utils has not been loaded in this context. The distutils variable is bound to this entry.
  • The initial import of distutils is complete so the import system continues to distutils.util.
  • distutils.util is found in sys.modules as it was added when setuptools was loaded. No further action is needed. Since the process ends,util is not added as an attribute to the distutilsmodule in sys.modules nor the distutils variable.

The key take away is that importing setuptools has a side effect on sys.modules.

Reverting the changes to sys.modules when the finder is not handling the module will address this issue. One possibility:

diff --git a/_distutils_hack/__init__.py b/_distutils_hack/__init__.py
index 605a6edc..547dcbfd 100644
--- a/_distutils_hack/__init__.py
+++ b/_distutils_hack/__init__.py
@@ -98,8 +98,10 @@ class DistutilsMetaFinder:
         import importlib.util

         try:
+            sys_modules_before_import = sys.modules.copy()
             mod = importlib.import_module('setuptools._distutils')
         except Exception:
+            sys.modules = sys_modules_before_import
             # There are a couple of cases where setuptools._distutils
             # may not be present:
             # - An older Setuptools without a local distutils is

Another option is to call clear_distutils in the except block to remove only distutil related entries from sys.modules.

I'd be happy to submit a PR with either change.

I found the following toy example helpful while exploring this behavior
# setup packages a and b. b has an x submodule. a imports b.x
mkdir -p a
mkdir -p b/x
touch b/__init__.py
touch b/x/__init__.py
echo "import b.x" > a/__init__.py

This script replicates the behavior:

class Finder:
def find_spec(self, fullname, path, target=None):
  if fullname == "b":
    import a
  return

import sys
sys.meta_path.insert(0, Finder())

import b.x
print(b.x)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Needs Implementation Issues that are ready to be implemented.
Projects
None yet
Development

No branches or pull requests

3 participants