Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-39791 native hooks for importlib.resources.files #20576

Merged
merged 12 commits into from
Jun 8, 2020
Merged

bpo-39791 native hooks for importlib.resources.files #20576

merged 12 commits into from
Jun 8, 2020

Conversation

jaraco
Copy link
Member

@jaraco jaraco commented Jun 1, 2020

Integrate native .files() support for SourceFileLoader and ZipImporter. Remove the 'fallback' behavior and rely entirely on loader.files() for all resources.

https://bugs.python.org/issue39791

@jaraco
Copy link
Member Author

jaraco commented Jun 1, 2020

Still to do in another PR, I'd like to replace the zipimporter Resource Reader with a TraversableAdapter over ZipImporter.files, but I'm deferring that for now to get these changes in before the next 3.9 beta.

@jaraco jaraco added the needs backport to 3.9 only security fixes label Jun 1, 2020
@jaraco
Copy link
Member Author

jaraco commented Jun 3, 2020

I've figured out why the tests are failing.

It's the importlib_metadata tests that are failing, and it's because the fixtures for setting up the example zip packages are no longer named as expected.

This change alters the way that SourceFileLoader.resource_path works, replacing it with a version that unconditionally raises FileNotFoundError, which causes it to defer to the SourceFileLoader.open_resource which creates a temporary file. The temporary file is created correctly and available when expected, but its name is munged to tmpfjko__muexample-21.12-py3.6.egg and when importlib.metadata tries to infer the package name from that filename, it doesn't find example.

So there are really several issues here:

  1. SourceFileLoader.resource_path no longer returns a simple path to an extant file, but relies on open_resource to store a temporary copy.
  2. importlib.resources._path_from_reader will use _tempfile to create a temporary file and that filename won't match the original filename.
  3. importlib.metadata is too reliant on the name of the file for zip files on sys.path to determine their "name".

(1) was intentional knowing it was somewhat inefficient but creates a more uniform interaction with resources.
(3) was intentionally added to improve performance of importlib.metadata.
(2) probably was considered an internal implementation detail and probably can be refined to support this use case.

Regardless, there's more work to be done here.

@jaraco
Copy link
Member Author

jaraco commented Jun 5, 2020

As I thought about it more, I realized that by updating importlib.metadata to rely on importlib.resources.files instead of the importlib.resources.path compatibility layer, it will bypass the issue, so I'm working on a refresh of importlib.metadata to effect that change.

@jaraco
Copy link
Member Author

jaraco commented Jun 5, 2020

This is closer now. Just two failures:

Traceback (most recent call last):
  File "/home/vsts/work/1/s/Lib/test/test_importlib/test_files.py", line 13, in test_read_bytes
    actual = files.joinpath('utf-8.file').read_bytes()
  File "/home/vsts/work/1/s/Lib/zipfile.py", line 2320, in read_bytes
    with self.open('rb') as strm:
  File "/home/vsts/work/1/s/Lib/zipfile.py", line 2304, in open
    stream = self.root.open(self.at, zip_mode, pwd=pwd)
  File "/home/vsts/work/1/s/Lib/zipfile.py", line 1502, in open
    zinfo = self.getinfo(name)
  File "/home/vsts/work/1/s/Lib/zipfile.py", line 1429, in getinfo
    raise KeyError(
KeyError: "There is no item named 'utf-8.file' in the archive"

@jaraco
Copy link
Member Author

jaraco commented Jun 6, 2020

I may have made a grave miscalculation - that a loader could provide a .files() method taking no parameters that would be suitable for loading resources for a package served by that loader. And that's true for the SourceFileLoader, but it's not true for the zipimporter, so it's not true in general. This oversight was caught by the remaining failing tests. A loader must necessarily be supplied the relevant module name to resolve.

I believe this is why get_resource_reader() was the interface on the loader, so it could solicit the package context. I believe this can be made to work, but it's going to take some work.

@jaraco
Copy link
Member Author

jaraco commented Jun 6, 2020

I believe this is ready to go. If Brett or Barry have time to review today or tomorrow, that would be great. Otherwise, I'll submit and continue to iterate. In particular, the docs will need to be updated on the "integration", now that Loader.get_resource_reader is the mechanism through which a loader will provide TraversableResources.

return self.path


class ZipReader(FileReader):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably going to decouple FileReader from ZipReader. The only shared functionality is the one-line files method.

@jaraco jaraco merged commit 843c277 into python:master Jun 8, 2020
@miss-islington
Copy link
Contributor

Thanks @jaraco for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 8, 2020
* Provide native .files support on SourceFileLoader.

* Add native importlib.resources.files() support to zipimporter. Remove fallback support.

* make regen-all

* 📜🤖 Added by blurb_it.

* Move 'files' into the ResourceReader so it can carry the relevant module name context.

* Create 'importlib.readers' module and add FileReader to it.

* Add zip reader and rely on it for a TraversableResources object on zipimporter.

* Remove TraversableAdapter, no longer needed.

* Update blurb.

* Replace backslashes with forward slashes.

* Incorporate changes from importlib_metadata 2.0, finalizing the interface for extension via get_resource_reader.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
(cherry picked from commit 843c277)

Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
@bedevere-bot
Copy link

GH-20703 is a backport of this pull request to the 3.9 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Jun 8, 2020
@jaraco jaraco deleted the feature/39791-resources-files branch June 8, 2020 01:01
miss-islington added a commit that referenced this pull request Jun 8, 2020
* Provide native .files support on SourceFileLoader.

* Add native importlib.resources.files() support to zipimporter. Remove fallback support.

* make regen-all

* 📜🤖 Added by blurb_it.

* Move 'files' into the ResourceReader so it can carry the relevant module name context.

* Create 'importlib.readers' module and add FileReader to it.

* Add zip reader and rely on it for a TraversableResources object on zipimporter.

* Remove TraversableAdapter, no longer needed.

* Update blurb.

* Replace backslashes with forward slashes.

* Incorporate changes from importlib_metadata 2.0, finalizing the interface for extension via get_resource_reader.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
(cherry picked from commit 843c277)

Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
ambv added a commit to ambv/cpython that referenced this pull request Jun 9, 2020
ambv added a commit that referenced this pull request Jun 9, 2020
arun-mani-j pushed a commit to arun-mani-j/cpython that referenced this pull request Jul 21, 2020
* Provide native .files support on SourceFileLoader.

* Add native importlib.resources.files() support to zipimporter. Remove fallback support.

* make regen-all

* 📜🤖 Added by blurb_it.

* Move 'files' into the ResourceReader so it can carry the relevant module name context.

* Create 'importlib.readers' module and add FileReader to it.

* Add zip reader and rely on it for a TraversableResources object on zipimporter.

* Remove TraversableAdapter, no longer needed.

* Update blurb.

* Replace backslashes with forward slashes.

* Incorporate changes from importlib_metadata 2.0, finalizing the interface for extension via get_resource_reader.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
jaraco added a commit to jaraco/cpython that referenced this pull request Feb 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants