Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-29982: Add "ignore_cleanup_errors" param to tempfile.TemporaryDirectory() #24793

Merged
merged 6 commits into from
Mar 14, 2021

Conversation

CAM-Gerlach
Copy link
Member

@CAM-Gerlach CAM-Gerlach commented Mar 8, 2021

Resolves bpo-29982 by adding an ignore_cleanup_errors parameter to tempfile.TemporaryDirectory(), which allows for a best-effort cleanup without stopping on errors, such as PermissionErrors on Windows due to files being open, transient access failures or other runtime exceptions. This behavior matches that when passing ignore_errors=True to shutil.rmtree, and allows using the nicer higher-level interface, context-manager support and more robust removal implementation of tempfile.TemporaryDirectory() to clean up as much of the temporary directory as possible, without stopping and raising an exception mid-cleanup, which can otherwise potentially occur at a non-deterministic time when the directory is garbage-collected, or during interpreter shutdown.

Additionally, as the other part of the issue described in bpo-29982 (and as required for the tests and many real-world use cases), ensures that cleanup() runs correctly on subsequent attempts if the directory was not completely cleaned up after the first call, instead of silently (and surprisingly to the user) failing to do anything at all.

Example use cases for this include regebro/pyroma#28 (which prompted this proposal), the original scenario described in bpo-29982 , and one of the two use cases described by @asottile in bpo-25024 Since I assume everything will be squashed to one commit anyway, I've split this up into several to make it easier to review and modify.

Key implementation questions to draw reviewer attention to:

  • Do we want to call the parameter ignore_cleanup_errors or something else? Its on the long side, but the shorter alternatives I considered (e.g. ignore_errors) were not as clear and descriptive. ignore_errors is what shutil.rmtree uses but given its in the TemporaryDirectory constructor, it could otherwise mislead users into thinking that errors creating the temporary directory would be ignored, when in fact this is only the case for cleanup.
  • Should we add an ignore_errors optional parameter to cleanup() as well? This would be a simple but useful addition so calling code can explicitly specify at manual cleanup time whether they want an error to be returned, rather than being locked in to one or the other after object creation (and would allow for more robust unit tests), but I didn't add it yet to avoid it getting out of scope.
  • Should the new tests be modified (somehow?) to guarantee initial removal will fail on all platforms? Given the multiple mitigations against removal failing for various reasons, the simplest way to reliably cause removal to fail on real files was just leaving it open, but that only actually causes an error on Windows, AFAIK. This means that ignore_errors=True is tested with errors on Windows, and without them on Linux and macOS. Is there a straightforward way to ensure an error will occur on *nix, in a way that the existing cleanup code cannot handle? Maybe OS-specific locks?
  • Does it need a Whats New entry? I didn't add one since it shouldn't have backward compatibility implications and its just adding a single parameter (along with a corresponding fix to the behavior of cleanup), but I do see some other entries related to adding new parameters to callables, so I'm not sure.

Thanks!

https://bugs.python.org/issue29982

@CAM-Gerlach
Copy link
Member Author

Not sure why test_sendfile_close_peer_in_the_middle_of_receiving of test.test_asyncio.test_sendfile.ProactorEventLoopTests failed on Azure on Windows x64 with the ConnectionError not raised; it didn't fail anywhere else, nor can I reproduce it locally, either across a run of the full suite or multiple re-runs of the asyncio tests with relatively high-stress test settings. Therefore, given it has failed for 10% of of the last 200+ runs, I'm inclined to believe its likely a random failure.

@gvanrossum
Copy link
Member

gvanrossum commented Mar 9, 2021

Yeah, rumor has it that test_asyncio is flaky.

@gvanrossum
Copy link
Member

Maybe on non Windows you can force failure by creating a subdirectory and set its permissions to zero? (Or at least x555.)

@CAM-Gerlach
Copy link
Member Author

CAM-Gerlach commented Mar 9, 2021

Thanks @gvanrossum ! Actually, that was my first thought as well, but TemporaryDirectory is too smart for that; since PR #10320 it will reset the permissions to 0o700, clear any read-only flag and retry if deletion fails (which I can confirm works on Linux). The same is true of fcntl.lock(file, fnctl.LOCK_EX) as well.

If there aren't any better ideas, I see two options:

  • Given that post-PR bpo-26660, bpo-35144: Fix permission errors in TemporaryDirectory cleanup. #10320 / Py3.8 , the only real-world reports of cleanup() failing to remove files that I'm aware of (including bpo-29982 itself) have been on Windows, we keep the current tests and just set the expected number of files after cleanup based on the platform (we could make the test Windows only, but this way the ignore_cleanup_errors=True code path still gets tested on all platforms, just without errors).
  • Or, we make this set of tests Windows-only, and add another that monkeypatches the os.unlink() used by shutil to raise OSError when passed a specific subdirectory name that we create (not sure how easy this is to do with Unittest; I'm only really familiar with Pytest).

Thoughts?

@gvanrossum
Copy link
Member

You could use unittest.mock to patch unlink, but that would provide very limited value if it's as hard as you say to trigger this condition. So I am fine with just testing this on Windows.

@CAM-Gerlach
Copy link
Member Author

Thanks; I vaguely recalled mock.patch(), since in the past I've mostly used pytest's own monkeypatch() for legacy reasons.

I went ahead with the first option, setting the expected value of cleanup based on OS. If any other esoteric supported platforms still have errors attempting to clean up files locked by opening, it will show up when tested by the buildbot fleet, and we can revise the test accordingly to include them. Alternatively, we can explicitly limit the entire test to just Windows, which is cleaner but less comprehensive—but like you say, not sure it makes a huge difference either way.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let's go with this.

@CAM-Gerlach
Copy link
Member Author

Thanks @gvanrossum!

mikepqr added a commit to mikepqr/resume.md that referenced this pull request Feb 13, 2022
We were getting errors like `PermissionError: [WinError 5] Access is
denied:
'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\resume.md_k098zhaj\\CrashpadMetrics-active.pma'`
and `Exception ignored in: <finalize object at 0x16d9e039d60; dead> ...
NotADirectoryError: [WinError 267] The directory name is invalid:
'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\resume.md_3aoun3f2\\CrashpadMetrics-active.pma'`
in CI on Windows. See e.g.
https://github.com/mikepqr/resume.md/runs/5172290981 and
https://github.com/mikepqr/resume.md/runs/5172234014.

The root cause was that Chrome creates a file the python process does
not have permission to delete. See
puppeteer/puppeteer#2778. Because
TemporaryDirectory is intended to be used as a context manager there is
no way to prevent it logging an error when cleanup fails.

The fix is to switch to the lower level tempfile.mkdtemp, and make a
good faith attempt to clean it up manually, logging failure at the debug
level (while adding a new --debug option).

A more sophisticated fix would be to backport the new
ignore_cleanup_errors option added in python 3.10
(python/cpython#24793), but this will do.

Fixes #13
mikepqr added a commit to mikepqr/resume.md that referenced this pull request Feb 13, 2022
We were getting errors like `PermissionError: [WinError 5] Access is
denied:
'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\resume.md_k098zhaj\\CrashpadMetrics-active.pma'`
and `Exception ignored in: <finalize object at 0x16d9e039d60; dead> ...
NotADirectoryError: [WinError 267] The directory name is invalid:
'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\resume.md_3aoun3f2\\CrashpadMetrics-active.pma'`
in CI on Windows. See e.g.
https://github.com/mikepqr/resume.md/runs/5172290981 and
https://github.com/mikepqr/resume.md/runs/5172234014.

The root cause was that Chrome creates a file the python process does
not have permission to delete. See
puppeteer/puppeteer#2778. Because
TemporaryDirectory is intended to be used as a context manager there is
no way to prevent it logging an error when cleanup fails.

The fix is to switch to the lower level tempfile.mkdtemp, and make a
good faith attempt to clean it up manually, logging failure at the debug
level (while adding a new --debug option).

A more sophisticated fix would be to backport the new
ignore_cleanup_errors option added in python 3.10
(python/cpython#24793), but this will do.

Fixes #13
icanhasmath pushed a commit to ActiveState/cpython that referenced this pull request Sep 27, 2023
averagewagon pushed a commit to averagewagon/resume that referenced this pull request Jun 27, 2024
We were getting errors like `PermissionError: [WinError 5] Access is
denied:
'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\resume.md_k098zhaj\\CrashpadMetrics-active.pma'`
and `Exception ignored in: <finalize object at 0x16d9e039d60; dead> ...
NotADirectoryError: [WinError 267] The directory name is invalid:
'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\resume.md_3aoun3f2\\CrashpadMetrics-active.pma'`
in CI on Windows. See e.g.
https://github.com/mikepqr/resume.md/runs/5172290981 and
https://github.com/mikepqr/resume.md/runs/5172234014.

The root cause was that Chrome creates a file the python process does
not have permission to delete. See
puppeteer/puppeteer#2778. Because
TemporaryDirectory is intended to be used as a context manager there is
no way to prevent it logging an error when cleanup fails.

The fix is to switch to the lower level tempfile.mkdtemp, and make a
good faith attempt to clean it up manually, logging failure at the debug
level (while adding a new --debug option).

A more sophisticated fix would be to backport the new
ignore_cleanup_errors option added in python 3.10
(python/cpython#24793), but this will do.

Fixes #13
icanhasmath pushed a commit to ActiveState/cpython that referenced this pull request Jul 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants