shutil.copy() inefficient implementation in Windows #88745

Closed
sfmc mannequin opened this issue Jul 7, 2021 · 4 comments
Assignees
Labels
3.12 bugs and security fixes OS-windows performance Performance or resource usage

Comments

@sfmc
Mannequin

sfmc mannequin commented Jul 7, 2021

BPO 44579
Nosy @pfmoore, @tjguk, @zware, @eryksun, @zooba

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-07-07.12:11:29.301>
labels = ['3.8', 'OS-windows', 'performance']
title = 'shutil.copy() inefficient implementation in Windows'
updated_at = <Date 2021-07-07.14:39:41.523>
user = 'https://bugs.python.org/sfmc'

bugs.python.org fields:

activity = <Date 2021-07-07.14:39:41.523>
actor = 'eryksun'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Windows']
creation = <Date 2021-07-07.12:11:29.301>
creator = 'sfmc'
dependencies = []
files = []
hgrepos = []
issue_num = 44579
keywords = []
message_count = 3.0
messages = ['397076', '397077', '397091']
nosy_count = 6.0
nosy_names = ['paul.moore', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower', 'sfmc']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue44579'
versions = ['Python 3.8']


@sfmc
Mannequin Author

sfmc mannequin commented Jul 7, 2021

In Windows, shutil.copy() uses _copyfileobj_readinto, which copies the file in user mode.
In Windows there is a fast API to copy a file in kernel mode: CopyFile (see https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-copyfile).
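
For illustration, a user-mode copy loop of the kind described here can be sketched as follows. This is a simplified approximation, not the actual `shutil._copyfileobj_readinto` source; the function name `copy_with_readinto` and the 1 MiB default buffer size are assumptions made for the example.

```python
def copy_with_readinto(src_path, dst_path, bufsize=1024 * 1024):
    """Copy a file through one reusable user-space buffer.

    Hypothetical helper for illustration; returns the number of bytes copied.
    """
    copied = 0
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        # One preallocated buffer is reused for every read, so the loop
        # avoids allocating a new bytes object per chunk.
        with memoryview(bytearray(bufsize)) as buf:
            while True:
                n = fsrc.readinto(buf)
                if not n:
                    break  # end of file
                # Write only the part of the buffer that was filled.
                with buf[:n] as chunk:
                    fdst.write(chunk)
                copied += n
    return copied
```

Every chunk passes through the Python process: one read call into the buffer, then one write call out of it. Only the primary data stream is copied; no metadata is carried over.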

@sfmc sfmc mannequin added stdlib Python modules in the Lib dir 3.8 only security fixes performance Performance or resource usage labels Jul 7, 2021
@eryksun
Contributor

eryksun commented Jul 7, 2021

> In Windows there is a fast API to copy a file in kernel mode: CopyFile

The possibility of calling CopyFileEx() for shutil.copy2() is discussed in bpo-30044. Note that CopyFileEx() is a high-level Windows API function, not a "kernel mode" copy. It opens the source and destination files and makes multiple system calls in order to copy file data and metadata (e.g. NtOpenFile, NtCreateFile, NtReadFile, NtWriteFile, NtQueryInformationFile, NtSetInformationFile, NtQueryEaFile, NtSetEaFile, and NtQuerySecurityObject). This includes copying the primary data stream, alternate data streams, file attributes, extended file attributes, and security resource attributes.

@zooba
Member

zooba commented Jul 7, 2021

> Note that CopyFileEx() is a high-level Windows API function, not a "kernel mode" copy.

This is true today, but could change whenever Windows feels like changing it. If we switch to the native API then we'll get any advantage there automatically.

The only challenge is in managing changes to semantics (that is, anything extra we do to "match" Unix that isn't normally how copies work on Windows - personally, I'd rather be more native on Windows anyway).

@eryksun eryksun added OS-windows and removed stdlib Python modules in the Lib dir labels Jul 7, 2021
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@iritkatriel iritkatriel added 3.12 bugs and security fixes and removed 3.8 only security fixes labels Sep 7, 2022
@zooba zooba self-assigned this May 27, 2023
@zooba
Member

zooba commented May 27, 2023

I think it's time to do this one. I've assigned it to myself, but if someone else gets an implementation together first (it could use a CopyFileEx implementation in _winapi) I'm happy to review and merge.
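
As a rough sketch of what calling the native API looks like, the snippet below reaches `CopyFileExW` through `ctypes`. It is not the proposed `_winapi` implementation, which would presumably be written in C. The helper name `copy_file_native` and the fallback to `shutil.copy2()` on non-Windows platforms are assumptions made so the example runs anywhere.

```python
import ctypes
import os
import shutil
import sys

COPY_FILE_FAIL_IF_EXISTS = 0x00000001  # documented CopyFileEx flag


def copy_file_native(src, dst, fail_if_exists=False):
    """Copy src to dst with CopyFileExW on Windows (hypothetical helper)."""
    src, dst = os.fspath(src), os.fspath(dst)
    if sys.platform != "win32":
        # Assumption for portability of this sketch: use copy2 elsewhere.
        return shutil.copy2(src, dst)

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    copy_file_ex = kernel32.CopyFileExW
    copy_file_ex.argtypes = (
        wintypes.LPCWSTR,               # lpExistingFileName
        wintypes.LPCWSTR,               # lpNewFileName
        ctypes.c_void_p,                # lpProgressRoutine (unused here)
        ctypes.c_void_p,                # lpData (unused here)
        ctypes.POINTER(wintypes.BOOL),  # pbCancel (unused here)
        wintypes.DWORD,                 # dwCopyFlags
    )
    copy_file_ex.restype = wintypes.BOOL

    flags = COPY_FILE_FAIL_IF_EXISTS if fail_if_exists else 0
    if not copy_file_ex(src, dst, None, None, None, flags):
        # Turn the Win32 error code into an OSError.
        raise ctypes.WinError(ctypes.get_last_error())
    return dst
```

Unlike `shutil.copy2()`, this sketch requires `dst` to be a file path and does not accept a directory on Windows. That is one example of the semantic differences a real implementation would have to manage.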

The performance enhancement can be backported to 3.12. The behaviour of copy2 should be totally unchanged by this, except in obscure edge cases (where it will likely be more consistent with every other Windows app as a result of the change).

We should also clearly document that copy2 may be significantly faster on Windows than the other copy* functions. Probably also worth looking for optimisations on other platforms as well, particularly if they preserve stat, so that we can say that copy2 is fastest on most platforms.
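
One way to check such a claim on a given machine is a small timing harness like the sketch below. The function name `time_copy_functions`, the 8 MiB file size, and the repeat count are arbitrary choices for illustration. Results will vary with the OS, filesystem, and caching, so this is not a rigorous benchmark.

```python
import os
import shutil
import tempfile
import time


def time_copy_functions(size_bytes=8 * 1024 * 1024, repeats=3):
    """Return the best wall-clock time in seconds for each shutil copy function."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(size_bytes))

        for func in (shutil.copyfile, shutil.copy, shutil.copy2):
            best = float("inf")
            for i in range(repeats):
                # Use a fresh destination per run so no run overwrites another.
                dst = os.path.join(tmp, f"{func.__name__}_{i}.bin")
                start = time.perf_counter()
                func(src, dst)
                best = min(best, time.perf_counter() - start)
            results[func.__name__] = best
    return results
```

Calling `time_copy_functions()` returns a dict keyed by `copyfile`, `copy`, and `copy2`. Comparing the values before and after a change to `copy2` gives a first impression of any speedup.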
