Use CopyFileEx for fs::copy on Windows #26751
My only worry here is that the semantics of |
The size is kind of a lie (it's racy) although |
Another option is to use |
A further advantage to using |
I wrote up an alternative version that uses |
(actually, I was kind of hoping the devs would chime in and say "we don't care" so I could submit a patch that uses the BTRFS clone ioctl on Linux which doesn't provide any way of getting this information. IMO, |
@Stebalien |
@retep998 I know. However, the rust docs don't actually say what the return value of |
try!(cvt(unsafe {
    c::CopyFileW(pfrom.as_ptr(), pto.as_ptr(), libc::FALSE)
}));
stat(to).map(|attr| attr.size())
I'm a little worried about this implementation because it means that the file could be successfully copied but then this later call to `stat` could fail, causing the entire operation to be considered as failing. Basically the atomicity of this operation means that it can be somewhat significantly different from the unix implementation.
Hm well no nevermind, there's a few steps taken in the unix implementation (e.g. copy then set permissions), so this may not be so bad after all.
Can you also add some tests for the various error modes here? For example copying from a file onto a directory, copying from nothing onto something, etc. It'd be good to just ensure that some common error cases are consistent across platforms.
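A rough sketch of what such error-mode checks could look like, using only `std::fs`. The helper names, the scratch-directory naming, and the file names are hypothetical and are not taken from the patch; the checks only assert that an error is returned, not which error kind, since that may differ per platform.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Creates a fresh scratch directory under the system temp dir.
/// The `fs_copy_demo_*` naming is an assumption made for this sketch.
fn scratch_dir(tag: &str) -> io::Result<PathBuf> {
    let dir = std::env::temp_dir()
        .join(format!("fs_copy_demo_{}_{}", tag, std::process::id()));
    let _ = fs::remove_dir_all(&dir); // ignore "not found" from earlier runs
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

/// Copying from a path that does not exist should fail
/// and should not leave a destination file behind.
fn copy_from_missing_source_fails() -> io::Result<bool> {
    let dir = scratch_dir("missing")?;
    let from = dir.join("does_not_exist.txt");
    let to = dir.join("out.txt");
    let failed = fs::copy(&from, &to).is_err();
    let untouched = !to.exists();
    fs::remove_dir_all(&dir)?;
    Ok(failed && untouched)
}

/// Copying a regular file onto an existing directory should fail.
fn copy_onto_directory_fails() -> io::Result<bool> {
    let dir = scratch_dir("ontodir")?;
    let from = dir.join("in.txt");
    let target_dir = dir.join("subdir");
    fs::write(&from, b"hello")?;
    fs::create_dir(&target_dir)?;
    let failed = fs::copy(&from, &target_dir).is_err();
    fs::remove_dir_all(&dir)?;
    Ok(failed)
}
```

Both helpers return `Ok(true)` when `fs::copy` rejects the operation, so a cross-platform test can simply assert on the result.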
04df1da to 8d7fe27
Implementation switched to the |
3ccf636 to 19a3a84
@@ -1814,6 +1814,18 @@ mod tests {
        check!(fs::set_permissions(&out, attr.permissions()));
    }

    #[cfg(windows)]
    #[test]
    fn copy_file_preserves_streams() {
This seems surprising! Could you elaborate on what's going on here? E.g. how come this test is only enabled for Windows?
In Windows (NTFS), every file name maps to multiple actual files called streams. By default, reading/writing/deleting operates on the "anonymous" stream but you can operate on other streams. Alternate streams are kind of like extended attributes but more flexible.
https://technet.microsoft.com/en-us/sysinternals/bb897440.aspx
I made the test specific to Windows since it depends on file streams which are specific to NTFS, and Windows is the only OS we support that consistently uses NTFS. Also, it depends on the implementation properly copying over all the file streams, which the manual implementation does not do, so even if another OS did support NTFS file streams, this test would fail on such an OS at the moment.
To elaborate further on what's going on, I'm creating the file `in.txt` but instead of writing to the default data stream, I'm writing to the `bunny` stream.
If I copy the file to a new location it brings all the streams with it, and the number of bytes copied (what I assume the return value from `fs::copy` means) is the same as the number of bytes I wrote to the `bunny` stream, since it's the only stream with data.
When getting the size of the file, I'm getting that information on the default data stream, which is empty so its size is 0. If I wanted information on the other streams I'd have to explicitly specify them by name.
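For readers unfamiliar with the `file:stream` syntax, here is a minimal sketch of the scenario described above. It is not the test from the patch: the helper names and the `carrot` payload are illustrative assumptions, and the Windows-only function is only meaningful on an NTFS volume. It deliberately does not check the value returned by `fs::copy`, since what that value counts is exactly the point under discussion.

```rust
use std::path::{Path, PathBuf};

/// Builds the path of a named stream using the `file:stream` syntax.
/// (Hypothetical helper for this sketch.)
fn stream_path(file: &Path, stream: &str) -> PathBuf {
    let mut s = file.as_os_str().to_os_string();
    s.push(":");
    s.push(stream);
    PathBuf::from(s)
}

/// Writes only to the `bunny` stream of `in.txt`, copies the file, and
/// reports whether the copy has an empty default stream and the same
/// `bunny` contents. Assumes `dir` exists on an NTFS volume.
#[cfg(windows)]
fn copy_preserves_stream(dir: &Path) -> std::io::Result<bool> {
    use std::fs;
    let src = dir.join("in.txt");
    let dst = dir.join("out.txt");
    // Creating the named stream also creates `in.txt` with an empty default stream.
    fs::write(stream_path(&src, "bunny"), b"carrot")?;
    fs::copy(&src, &dst)?;
    // `metadata` reports the default data stream only.
    let default_len = fs::metadata(&dst)?.len();
    // The alternate stream has to be opened explicitly by name.
    let bunny = fs::read(stream_path(&dst, "bunny"))?;
    Ok(default_len == 0 && bunny == b"carrot")
}
```

On other platforms `stream_path` just yields an ordinary file name containing a colon, which is one reason the test in this PR is gated on `#[cfg(windows)]`.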
Cool, thanks for the explanations!
⌛ Testing commit 19a3a84 with merge af09f40...
💔 Test failed - auto-mac-32-opt
8b8a562 to 1d20269
Adds a couple more tests for fs::copy Signed-off-by: Peter Atashian <retep998@gmail.com>
Okay, fixed the build errors so it builds according to @dotdash |
Using the OS mechanism for copying files allows the OS to optimize the transfer using stuff such as [Offloaded Data Transfers (ODX)](https://msdn.microsoft.com/en-us/library/windows/desktop/hh848056%28v=vs.85%29.aspx). Also preserves a lot more information, including NTFS [File Streams](https://msdn.microsoft.com/en-us/library/windows/desktop/aa364404%28v=vs.85%29.aspx), which the manual implementation threw away. In addition, it is an atomic operation, unlike the manual implementation which has extra calls for copying over permissions. r? @alexcrichton
👍 |
Made `fs::copy` return the length of the main stream

On Windows with the NTFS filesystem, `fs::copy` would return the sum of the lengths of all streams, which can be different from the length reported by `metadata` and thus confusing for users unaware of this NTFS peculiarity. This makes `fs::copy` return the same length `metadata` reports, which is the value it used to return before PR #26751. Note that alternate streams are still copied; their length is just not included in the returned value.

This change relies on the assumption that the stream with index 1 is always the main stream in the `CopyFileEx` callback. I could not find any official document confirming this but empirical testing has shown this to be true, regardless of whether the alternate stream is created before or after the main stream.

Resolves #44532
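The invariant this follow-up change aims for can be illustrated with a small portable check: for a file with no alternate streams, the value returned by `fs::copy` should equal the length `metadata` reports for the destination. This is only a sketch; the helper name and the scratch-directory naming are hypothetical.

```rust
use std::fs;
use std::io;

/// Copies a freshly written file and returns
/// (bytes reported by `fs::copy`, length reported by `metadata`).
fn copy_len_and_metadata_len(payload: &[u8]) -> io::Result<(u64, u64)> {
    // Scratch location chosen for this sketch only.
    let dir = std::env::temp_dir()
        .join(format!("fs_copy_len_demo_{}", std::process::id()));
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir)?;

    let from = dir.join("in.txt");
    let to = dir.join("out.txt");
    fs::write(&from, payload)?;

    let copied = fs::copy(&from, &to)?;
    let reported = fs::metadata(&to)?.len();

    fs::remove_dir_all(&dir)?;
    Ok((copied, reported))
}
```

Calling `copy_len_and_metadata_len(b"hello")` should yield two equal values. The Windows-specific case with alternate streams would need an NTFS volume and is not covered here.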