Skip to content

Commit

Permalink
Threaded deletion on windows
Browse files Browse the repository at this point in the history
This replaces the serial rename based implementation with a
parallel work-stealing implementation, which for large datasets with
large core-count CPUs shows substantial improvements. For small datasets
or single core CPUs the performance should be similar to the prior
implementation, though with a different success tradeoff in the corner
case of in-use files.

Specifically, in-use files will now error rather than remaining in the
parent directory indefinitely. The tradeoff is that this function can
now delete a directory in a parent directory when no other write
permissions to the parent directory exist, which was previously
impossible.

Generally speaking, being able to remove all files isn't guaranteed, and
even being able to rename all files isn't guaranteed - permissions may
prevent deleting files on unix platforms, and exclusive locks can
prevent opening and renaming files. Applications will typically have
control over whether the files in the directories they are working on
are in use (e.g. they can arrange for a signal to kill off processes),
so this tradeoff, which allows working with more restrictive permissions
on containing directories, allows slightly more broad use cases and is
faster due to less work required overall.

The existing implementation could be parallelised too of course, or
potentially we could do both; but this seems like a solid place to be.

Rough benchmark based on use in rustup:

before:
time rustup toolchain remove stable
info: uninstalling toolchain 'stable-x86_64-pc-windows-msvc'
info: toolchain 'stable-x86_64-pc-windows-msvc' uninstalled

real    0m24.470s
user    0m0.000s
sys     0m0.047s
after:
time RUSTUP_FORCE_ARG0=rustup ./target/release/rustup-init.exe toolchain remove stable
info: uninstalling toolchain 'stable-x86_64-pc-windows-msvc'
info: toolchain 'stable-x86_64-pc-windows-msvc' uninstalled

real    0m2.560s
user    0m0.015s
sys     0m0.000s
  • Loading branch information
rbtcollins committed Aug 25, 2020
1 parent 81bd8cd commit 61c03eb
Show file tree
Hide file tree
Showing 4 changed files with 122 additions and 244 deletions.
11 changes: 11 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,18 @@ repository = "https://github.com/XAMPPRocky/remove_dir_all.git"
version = "0.5.3"

[target.'cfg(windows)'.dependencies]
log = "0.4.11"
num_cpus = "1.13"
rayon = "1.4"
winapi = {version = "0.3", features = ["std", "errhandlingapi", "winerror", "fileapi", "winbase"]}

[dev-dependencies]
doc-comment = "0.3"
env_logger = "0.7.1"
tempfile = "3.1"

# There is a circular dependency: tempfile depends on remove_dir_all, depends on
# tempfile for testing; so to be able to debug tests in remove_dir_all, we have
# to tell cargo that this remove_dir_all is canonical.
[patch.crates-io]
remove_dir_all = {path = "."}
Loading

0 comments on commit 61c03eb

Please sign in to comment.