Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This replaces the serial rename based implementation with a parallel work-stealing implementation, which for large datasets with large core-count CPUs shows substantial improvements. For small datasets or single core CPUs the performance should be similar to the prior implementation, though with a different success tradeoff in the corner case of in-use files. Specifically, in-use files will now error rather than remaining in the parent directory indefinitely. The tradeoff is that this function can now delete a directory in a parent directory when no other write permissions to the parent directory exist, which was previously impossible. Generally speaking, being able to remove all files isn't guaranteed, and even being able to rename all files isn't guaranteed - permissions may prevent deleting files on unix platforms, and exclusive locks can prevent opening and renaming files. Applications will typically have control over whether the files in the directories they are working on are in use (e.g. they can arrange for a signal to kill off processes), so this tradeoff, which allows working with more restrictive permissions on containing directories, allows slightly more broad use cases and is faster due to less work required overall. The existing implementation could be parallelised too of course, or potentially we could do both; but this seems like a solid place to be. Rough benchmark based on use in rustup: before: time rustup toolchain remove stable info: uninstalling toolchain 'stable-x86_64-pc-windows-msvc' info: toolchain 'stable-x86_64-pc-windows-msvc' uninstalled real 0m24.470s user 0m0.000s sys 0m0.047s after: time RUSTUP_FORCE_ARG0=rustup ./target/release/rustup-init.exe toolchain remove stable info: uninstalling toolchain 'stable-x86_64-pc-windows-msvc' info: toolchain 'stable-x86_64-pc-windows-msvc' uninstalled real 0m2.560s user 0m0.015s sys 0m0.000s
- Loading branch information