Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Improve File.Copy by using the block copy operation for ReFS on Windows #88695

Closed
wants to merge 19 commits into from

Conversation

hamarb123
Copy link
Contributor

@hamarb123 hamarb123 commented Jul 11, 2023

Fixes #86681.

Tasks:

The logic is based on https://github.com/microsoft/CopyOnWrite, but without the caching of volume information as it can change at any time, with the inclusion of logic to handle ADS correctly, and with inclusion of logic to emulate CopyFile's weird behaviour with symlinks.

The logic is currently is as follows (simplified)
  1. Open handle to source file.
  2. Read the volume information from the source file to check that it supports FILE_SUPPORTS_BLOCK_REFCOUNTING.
  3. Open handle to the destination file, creating it if needed.
  4. Check destination file's volume is probably the same (I don't see a cheap way to confirm this for sure, and it will fail later in a sensible way anyway if it's different, so I'm not worried about it), and that it supports FILE_SUPPORTS_BLOCK_REFCOUNTING.
  5. Read the source file's name (not elidable), and check it's not an ADS
  6. Read the destination file's name (elidable on Windows version TBD+), and check it's not an ADS
  7. Get the volume of the source file (from the filename we read, we can't use the one we're given in case it follows a symlink)
  8. Read the cluster size on the source volume.
  9. Ensure the destination file's size is 0.
  10. Make sure the destination file is not readonly.
  11. Make sure the destination file is sparse.
  12. Copy integrity information.
  13. Read the attributes of the source file.
  14. Set the file size of the destination file.
  15. Copy all of the blocks.
  16. Match the source sparseness.
  17. Update attributes and times to match the source.

This logic is yet again complicated by microsoft/CopyOnWrite#24, which means that we must avoid copying from/to an ADS at all costs. It will be fixed on a TBD version of windows, but we probably won't be able to remove this logic for a number of years, since we will presumably support Windows 10, and earlier versions of Windows 11 for the foreseeable future (we could special case it though).

There are currently 2 additional syscalls on the golden path to failure. This is not ideal, we need to measure its impact. I don't see how they are really avoidable, since 1 would be for opening the source file handle, and 1 would be to get the volume information which is where we check that its volume supports the operation.

This PR IS NOT ready for merging. It currently fails on non-ReFS volumes. This is intentional so the functionality can be checked properly locally (to make sure we don't just always error out on something silently), but will obviously be fixed before merging. Ideally we would add tests which run on ReFS, and tests to ensure that we have actually done a block copy operation as opposed to a normal copy, but I'm not sure how to do either of these.

There are also a number of Todo items, which I would like feedback on, along with a general code cleanup (which will be done at some point when the logic is more set in stone).

cc @danmoseley

- Initial code for ReFS block copy operation, currently throws exceptions whenever it fails to support checking if it's actually working (obviously needs to be changed before merging)
- Contains a number of TODO items to be resolved
- Works locally
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jul 11, 2023
@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jul 11, 2023
@hamarb123 hamarb123 marked this pull request as draft July 11, 2023 19:58
@hamarb123 hamarb123 changed the title [WIP] Implement block copy operation for ReFS on Windows [WIP] Improve File.Copy by using the block copy operation for ReFS on Windows Jul 11, 2023
@ghost
Copy link

ghost commented Jul 12, 2023

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #86681.

The logic is based on https://github.com/microsoft/CopyOnWrite, but without the caching of volume information as it can change at any time, with the inclusion of logic to handle ADS correctly, and with inclusion of logic to emulate CopyFile's weird behaviour with symlinks.

The logic is currently is as follows (simplified):

  1. Open handle to source file, and read its attributes.
  2. Read the volume information from the source file to check that it supports FILE_SUPPORTS_BLOCK_REFCOUNTING.
  3. Open handle to the destination file, creating it if needed.
  4. Check destination file's volume is probably the same (I don't see a cheap way to confirm this for sure, and it will fail later in a sensible way anyway if it's different, so I'm not worried about it), and that it supports FILE_SUPPORTS_BLOCK_REFCOUNTING.
  5. Read the cluster size on the source volume.
  6. Ensure the destination file's size is 0.
  7. Make sure the destination file is not readonly.
  8. Make sure the destination file is sparse.
  9. Copy integrity information.
  10. Set the file size of the destination file.
  11. Copy all of the blocks.
  12. Match the source sparseness.
  13. Update attributes and times to match the source.

This logic is yet again complicated by microsoft/CopyOnWrite#24, which means that we must avoid copying from/to an ADS at all costs. It will be fixed on a TBD version of windows, but we probably won't be able to remove this logic for a number of years, since we will presumably support Windows 10, and earlier versions of Windows 11 for the foreseeable future (we could special case it though).

There are currently 3 additional syscalls on the golden path to failure. This is not ideal, we need to measure its impact. I don't see how they are really avoidable, since 1 would be for opening the source file handle, 1 would be reading the atrributes to double check we haven't opened a symlink (which could hide an ADS) (we need the real file path of what we have opened later sadly, so even without checking for ADS we still need to do this, unless we wanted to read the path from the handle), and 1 would be to get the volume information which is where we check that its volume supports the operation.

This PR IS NOT ready for merging. It currently fails on non-ReFS volumes. This is intentional so the functionality can be checked properly locally (to make sure we don't just always error out on something silently), but will obviously be fixed before merging. Ideally we would add tests which run on ReFS, and tests to ensure that we have actually done a block copy operation as opposed to a normal copy, but I'm not sure how to do either of these.

There are also a number of Todo items, which I would like feedback on, along with a general code cleanup (which will be done at some point when the logic is more set in stone).

cc @danmoseley

Author: hamarb123
Assignees: -
Labels:

area-System.IO, community-contribution, needs-area-label

Milestone: -

@huoyaoyuan huoyaoyuan removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jul 12, 2023
/*InlineData(":a", ""),*/
InlineData("", ":a")/*,
InlineData(":a", ":a")*/]
//todo: is copying from an ADS meant to fail?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anyone know if this is meant to fail. And if so, why?
Specifically, copying from an ADS fails, even today. But copying onto one doesn't for some reason.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeremyKuhne maybe knows

BTW randomly linking here to the old issue requesting we support alternate data streams. #49604 Not currently planned.

Copy link
Contributor Author

@hamarb123 hamarb123 Jul 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do seem to currently "support" ADS btw. Most APIs work fine with them. Specifically copying from one doesn't seem to work, but you seem to be able to read and write to them fine as if they were files for example. I would think if we didn't support them, it would throw an exception instead for all operations (since it's pretty easy to detect), as opposed to only some of them.

That API would be more useful on platforms like macOS where it's done differently (ie. there isn't really a filename you can give the ADS). It would also be useful on Windows probably, but you can already do a number of things with them with .NET already on Windows.

result.Dispose();
//try
//{
/* */string? target = GetFinalLinkTarget(lpFileName, false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Todo: try/catch around this? Maybe even a version which won't throw if we care that much.

}

// Ensure the destination file is not readonly.
// Todo: are there other flags we need to exclude here?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anyone know if there are more flags we need to set. Does anyone know if we should be clearing the readonly flag, presumably this should potentially actually just fail. This also probably also needs to go before setting the file length to 0 if we are going to clear readonly.

@@ -418,5 +419,92 @@ public void EnsureThrowWhenCopyToNonSharedFile()
using var stream = new FileStream(file1, FileMode.Open, FileAccess.Read, FileShare.None);
Assert.Throws<IOException>(() => File.Copy(file2, file1, overwrite: true));
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to make it so tests (at least some of them) run on a virtual ReFS drive also? That would be ideal, since it runs in the temporary directory currently, and thus we would never be checking if the code works on ReFS in CI or when running it locally. It would also be useful for other OSes like macOS if we could to this, so we can check logic there properly also.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the issue here "can we make tests temporarily create a ReFS volume"? I assume one could call Windows API to try to shrink the current volume and create a ReFS volume. Another alternative that's not entirely unreasonable is to create a test that needs manual setup. There are some manual console tests for example. Those only run on a developer's machine, and presumably only when they change relevant code.

On Mac/Linux possibly they could be created in memory? For example, here's a manual test we have that creates an EXFAT partition in memory:

sudo mkdir /mnt/ramdisk
sudo mount -t ramfs ramfs /mnt/ramdisk
sudo dd if=/dev/zero of=/mnt/ramdisk/exfat.image bs=1M count=512
sudo mkfs.exfat /mnt/ramdisk/exfat.image
sudo mkdir /mnt/exfatrd
sudo mount -o loop /mnt/ramdisk/exfat.image /mnt/exfatrd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the issue here "can we make tests temporarily create a ReFS volume"?

It's not just that. I think it would be good to run some of these tests in CI automatically (and potentially locally also). Would that be possible? It would also be useful to test exFAT stuff properly on macOS for example. My understanding is that this stuff is not really tested currently, which isn't great imo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could potentially ask infra folks to set up a ReFS partition on the machines. Or we could try and do it at execution time, with shrinking as mentioned (seems slow though)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but I'd defer worrying about that until we have code we know is good to go...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the pipeline YAML and unit tests for https://github.com/microsoft/CopyOnWrite - there's a pattern there for (a) a dev to specify an env var pointing to a ReFS volume to test on, and (b) when running as admin, create a ReFS VHD and run tests on it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it helps, our test automation always runs as admin. Except potentially on a dev machine.
We conventionally mark tests as outer loop if they are potentially disruptive/messy/slow like creating a VHD likely is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reached out to internal eng folks to ask how they'd feel about either updating images to have a ReFS partition, or allowing us to resize as in the yml above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea for the VHD would be to use the dotnet/runtime-assets repo to store the vhd file, which would at least cut the cost of creating the VHD and only leave the cost of mounting and unmounting it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test, and the test about checking it's actually cloned should be moved to a new issue. That way we could add it for macOS at the same time.

@@ -418,5 +419,92 @@ public void EnsureThrowWhenCopyToNonSharedFile()
using var stream = new FileStream(file1, FileMode.Open, FileAccess.Read, FileShare.None);
Assert.Throws<IOException>(() => File.Copy(file2, file1, overwrite: true));
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would ideally have a test to ensure it actually does a block copy operation on ReFS. Does anyone know what APIs you can call to check this on Windows?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd have to ask someone like @erikmav ? We don't in general have special knowledge of ReFS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. on macOS, there is fcntl with F_LOG2PHYS or F_LOG2PHYS_EXT, which can be used to check this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been poking around looking for this as e.g. a mode under fsutil. Still looking or will ask around.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you have a small test VHD and attempt to duplicate a file that is so large it already occupies more than half of the disk.

@danmoseley
Copy link
Member

This logic is yet again complicated by microsoft/CopyOnWrite#24, which means that we must avoid copying from/to an ADS at all costs. It will be fixed on a TBD version of windows, but we probably won't be able to remove this logic for a number of years, since we will presumably support Windows 10, and earlier versions of Windows 11 for the foreseeable future (we could special case it though).

Yeesh! Glad you reported it. I see they say they'll backport it -- in general, I'd expect we only support running on fully patched OS -- but a BSOD is uncommonly serious. I wonder whether there is an API call we can use to determine whether we are on a patched machine, and if not, avoid the API call under all circumstances.

@danmoseley
Copy link
Member

Wow -- this is way more complexity than I had expected. I only looked a bit and at part of it. You've done a lot of work.

cc @JeremyKuhne in case he has meta thoughts.

@hamarb123
Copy link
Contributor Author

hamarb123 commented Jul 12, 2023

Wow -- this is way more complexity than I had expected. I only looked a bit and at part of it. You've done a lot of work.

Thanks :)
(it's a shame the windows filesystem team haven't just added a flag to CopyFileEx or something like that, it's much easier on macOS lol)

@danmoseley
Copy link
Member

haven't just added a flag to CopyFileEx or something like that

Sometimes this is where work like this in .NET can really help, exposing a simple easy API over something that's quite complex and hard to get right at the syscall level. (And even making it work xplat)

- Split the WindowsCheckSparseness test to a new file, since it's windows specific, and relies on Windows Interop APIs
- Implement new and improved logic, which saves 1 syscall in the ideal case (described in a comment on dotnet#88695)
- Fix missing DELETE access when creating the destination file, in case it needs to be deleted
- Add overload of GetFinalPathNameByHandle, which does creates a string and returns it
- Fix a missing throw statement in GetVolumePathName's impl
- Changes to make tests compile
- Remove unneeded files for tests
//return false;
}

// Check we haven't opened an alternate data stream - this may cause a BSOD on Windows versions lower than TBD (todo) if we don't exit before block copy.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to confirm which version of windows at some point.

@hamarb123
Copy link
Contributor Author

hamarb123 commented Jul 16, 2023

Changes to logic in a7a74a0 (maked by strikeout and (new))
  • Open handle to source file, and read its attributes (included checking for ADS) (1 syscall saved here in general exit case).
  • Read the volume information from the source file to check that it supports FILE_SUPPORTS_BLOCK_REFCOUNTING (2 syscalls up to here, this is where it would probably fail if not possible).
  • Open handle to the destination file, check it's not an ADS, creating it if needed.
  • Check destination file's volume is probably the same (I don't see a cheap way to confirm this for sure, and it will fail later in a sensible way anyway if it's different, so I'm not worried about it), and that it supports FILE_SUPPORTS_BLOCK_REFCOUNTING.
  • (new) Read the source file's name (not elidable), and check it's not an ADS
  • (new) Read the destination file's name (elidable on Windows version TBD+), and check it's not an ADS
  • (new? I think this was here before?) Get the volume of the source file (from the filename we read, we can't use the one we're given in case it follows a symlink)
  • Read the cluster size on the source volume.
  • Ensure the destination file's size is 0.
  • Make sure the destination file is not readonly.
  • Make sure the destination file is sparse.
  • Copy integrity information.
  • (new) Read the attributes of the source file.
  • Set the file size of the destination file.
  • Copy all of the blocks.
  • Match the source sparseness.
  • Update attributes and times to match the source.

@hamarb123
Copy link
Contributor Author

hamarb123 commented Jul 16, 2023

I think I'm personally pretty happy with the overall logic now (there are probably still issues here and there to fix, but I mean the general way it works). We also should work out how we want to handle errors properly (ignoring the silly throw statements for now still); most errors are handled by return false / throw, we should work out if we would want any of these to be different. There are also some todos to resolve.

if (!Interop.Kernel32.GetVolumeInformationByHandle(sourceHandle, null, 0, &sourceSerialNumber, null, &sourceVolumeFlags, null, 0))
{
throw new Exception("C");
//return false;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the final code where you might want to trace the result (or why it could not clone) consider changing return type to string? where null means cloned successfully, non-null means clone failure, and the non-null string is the operation that failed.

@danmoseley
Copy link
Member

Thanks for continuing to work on this @hamarb123 . Given how complex the logic has to be, and this is on the path of file copying, and the bluescreen to work around -- I'm thinking this is probably too risky to try to rush into .NET 8 (which will finally fork off main in about 1 month). It does seem worth continuing though, so we can get it into early .NET 9 previews.

null,
0,
out _,
IntPtr.Zero))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder for future me: could we use asynchronous IO here to copy all the sections simultaneously?

@ghost ghost closed this Sep 25, 2023
@ghost
Copy link

ghost commented Sep 25, 2023

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@danmoseley
Copy link
Member

lost track of this one. @hamarb123 is it still something you're interested in continuing?

I imagine dev drive usage will become more mainstream during .NET 9 development.

@hamarb123
Copy link
Contributor Author

hamarb123 commented Sep 25, 2023

@hamarb123 is it still something you're interested in continuing?

Yes, not sure how much time I will be able to spend on this over the next month and a bit though (exams), but after that there shouldn't be any problem with that.

Can you please re-open the PR :)

@danmoseley danmoseley reopened this Sep 25, 2023
@danmoseley
Copy link
Member

reopened (I think you can do this too actually)

yes, certainly this can wait until after exams! good luck.

@hamarb123
Copy link
Contributor Author

reopened (I think you can do this too actually)

It doesn't let me reopen it when the bot closes it.

yes, certainly this can wait until after exams! good luck.

Thanks! :)

@ghost ghost closed this Oct 26, 2023
@ghost
Copy link

ghost commented Oct 26, 2023

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@ghost ghost locked as resolved and limited conversation to collaborators Nov 25, 2023
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support file block copy on write on Windows
8 participants