Change std::fs::remove_dir_all to be idempotent #410

Closed
tbu- opened this issue Jul 10, 2024 · 11 comments
Labels
api-change-proposal (A proposal to add or alter unstable APIs in the standard libraries), T-libs-api

Comments

@tbu-

tbu- commented Jul 10, 2024

Proposal

Problem statement

It is currently impossible to properly check for errors with std::fs::remove_dir_all, especially with concurrent calls to it. This doesn't have to be the case, as we can e.g. see in std::fs::create_dir_all which can be called multiple times concurrently and succeed.

The current documentation of std::fs::remove_dir_all guarantees that the function is not idempotent: it returns an error if the directory does not exist.

Motivating examples or use cases

The Rust repository itself has skipped error checking for std::fs::remove_dir_all (as suggested by the documentation itself). We see snippets like the following all over the codebase:

let _ = fs::remove_dir_all(dir);

The problem is bad enough that cargo-miri contains its own, incorrect, attempt at the behavior proposed here:

https://github.com/rust-lang/rust/blob/d81987661a06ae8d49a5f014f81824c655e87768/src/tools/miri/cargo-miri/src/util.rs#L289-L298

/// An idempotent version of the stdlib's remove_dir_all
/// it is considered a success if the directory was not there.
fn remove_dir_all_idem(dir: &Path) -> std::io::Result<()> {
    match std::fs::remove_dir_all(dir) {
        Ok(_) => Ok(()),
        // If the directory doesn't exist, it is still a success.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}

This version doesn't account for the fact that these functions might return io::ErrorKind::NotFound also when any file deletion fails with this error code (on Unix-like systems) or when any of the parent directories of dir do not exist.
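To illustrate the ambiguity, here is a minimal sketch that calls std::fs::remove_dir_all on two different kinds of missing path and collects the resulting error kinds. The helper name probe_not_found and the directory names are made up for this example. The third case (a file vanishing mid-deletion) needs a concurrent process and is not reproduced here.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns the error kinds reported for
/// (a) a missing target whose parent exists, and
/// (b) a target whose parent directory is missing as well.
fn probe_not_found(base: &Path) -> (Option<io::ErrorKind>, Option<io::ErrorKind>) {
    // `base` is assumed to exist; neither of the paths below does.
    let missing_target = base.join("missing");
    let missing_parent = base.join("no_such_parent").join("child");

    let target_kind = fs::remove_dir_all(&missing_target).err().map(|e| e.kind());
    let parent_kind = fs::remove_dir_all(&missing_parent).err().map(|e| e.kind());

    (target_kind, parent_kind)
}
```

Both calls fail with io::ErrorKind::NotFound, so a caller that matches on the error kind alone cannot tell "already removed" apart from "the path was wrong to begin with".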

Note that these are just two examples from the Rust repository itself; the wider ecosystem probably has more like them.

I believe that despite changing documented behavior, this will not break existing programs, since the end result of the directory not existing stays the same.

Solution sketch

Make concurrent calls to std::fs::remove_dir_all succeed. If the passed path contains more than one component, still fail with io::ErrorKind::NotFound when any component but the last one does not exist. I.e. fs::remove_dir_all("foo/bar") succeeds if foo exists but foo/bar does not. It'll fail if even foo does not exist.
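For readers who want to see the intended observable behavior, the following is only a rough user-space approximation, not the proposed std change. The name remove_dir_all_lenient is hypothetical, and the extra existence checks it performs after the fact are inherently racy, so it does not provide the guarantees a change inside the standard library could.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical wrapper: succeeds if `path` is gone afterwards and its
/// parent directory exists. The post-hoc checks are subject to races.
fn remove_dir_all_lenient(path: &Path) -> io::Result<()> {
    let err = match fs::remove_dir_all(path) {
        Ok(()) => return Ok(()),
        Err(err) if err.kind() == io::ErrorKind::NotFound => err,
        Err(err) => return Err(err),
    };

    // If the target still exists, NotFound came from somewhere inside
    // the tree, so the removal did not complete: report the error.
    match fs::symlink_metadata(path) {
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        _ => return Err(err),
    }

    // The target is gone. Only treat that as success when the parent
    // exists, i.e. `foo/bar` is fine if `foo` exists.
    let parent = match path.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        Some(_) => Path::new("."), // bare relative name such as "bar"
        None => return Err(err),   // root or empty path: keep the error
    };
    match fs::metadata(parent) {
        Ok(meta) if meta.is_dir() => Ok(()),
        _ => Err(err),
    }
}
```

With this wrapper, calling it twice on the same directory returns Ok(()) both times, while a path with a missing parent still fails with io::ErrorKind::NotFound.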

Alternatives

The proposed solution cannot be written using existing APIs, except by copying the implementation from the standard library and modifying it.

  1. Another possible solution would be to also not fail when parent directories of the passed path do not exist.

  2. A third possible solution would be to create a new function with the behavior described in this API change proposal. I don't see a use case for the original function, which is why I didn't propose this.

Links and related work

Not aware of any.

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.
@tbu- tbu- added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Jul 10, 2024
@ChrisDenton
Member

To be ultra clear, this is proposing changing a documented behaviour.

@tbu-
Author

tbu- commented Jul 10, 2024

Sorry, I should have mentioned that. I'll elaborate on it in the first post.

@tbu-
Author

tbu- commented Jul 11, 2024

Oh, another alternative would obviously be to create a new function doing what the initial post describes. I added it to the first post.

@tgross35

Crosslinking: discussion rust-lang/rust#127576, and a proposed fix rust-lang/rust#127623

@ChrisDenton
Member

That's a different thing imho. It concerns the internal behaviour rather than the effect on the top-level directory.

@the8472
Member

the8472 commented Jul 12, 2024

Not propagating internal not-found errors does seem like a decent solution, because it solves the problem quoted from the first post:

This version doesn't account for the fact that these functions might return io::ErrorKind::NotFound also when any file deletion fails with this error code (on Unix-like systems) or when any of the parent directories of dir do not exist.

@ChrisDenton
Member

On Windows we ignore not found errors when trying to delete files (and I think we pretty much have to if we want to avoid spurious errors). However, we don't do that for directories.

@tgross35

Ah right, I guess that this ACP is a superset of the change I linked.

They should probably be considered together since it seems changing rust-lang/rust#127576 resolves the part of this that can't easily be adjusted by the user. It is trivial to ignore the top-level error if that is the desired behavior.

@pitaj

pitaj commented Jul 12, 2024

Some prior discussion in #170

@tbu-
Author

tbu- commented Jul 13, 2024

Another relevant thread: rust-lang/rust#105745.

@Amanieu
Member

Amanieu commented Jul 20, 2024

We discussed this in last week's libs-api meeting. The consensus was to:

  • Continue reporting an error if the root directory didn't exist at the time remove_dir_all was called. This happens on the first read_dir on that directory.
  • Ignore ErrorKind::NotFound errors when deleting or read_diring any files or directories that were found through read_dir.
  • Ignore ErrorKind::NotFound errors when deleting the root directory after everything in it has been deleted.

This is essentially the solution implemented in rust-lang/rust#127623.

While we appreciate that it can be desirable for remove_dir_all to be idempotent, we don't think that this is worth a compatibility break. This is particularly concerning since this is the kind of breaking change for which we can't use crater to evaluate the impact on the ecosystem.
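The three rules from the meeting can be sketched roughly as follows. This is a simplified illustration built on path-based std::fs calls, not the code from rust-lang/rust#127623 and not how the standard library implements remove_dir_all. The function names are made up for the example, and platform details such as directory symlinks on Windows are ignored.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Turns a NotFound error into success; passes everything else through.
fn ignore_not_found(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        other => other,
    }
}

/// Removes everything inside `dir`, but not `dir` itself.
fn remove_contents(dir: &Path) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let file_type = match entry.file_type() {
            Ok(ft) => ft,
            // The entry vanished after read_dir listed it: skip it.
            Err(err) if err.kind() == io::ErrorKind::NotFound => continue,
            Err(err) => return Err(err),
        };
        if file_type.is_dir() {
            // Rule 2: NotFound while reading or deleting a directory
            // that was found through read_dir is ignored.
            ignore_not_found(remove_contents(&path))?;
            ignore_not_found(fs::remove_dir(&path))?;
        } else {
            // Rule 2: same for files (and symlinks, in this sketch).
            ignore_not_found(fs::remove_file(&path))?;
        }
    }
    Ok(())
}

/// Hypothetical stand-in for the agreed behavior.
fn remove_dir_all_sketch(root: &Path) -> io::Result<()> {
    // Rule 1: the first read_dir on the root still reports NotFound.
    remove_contents(root)?;
    // Rule 3: NotFound when finally deleting the root is ignored.
    ignore_not_found(fs::remove_dir(root))
}
```

Under these rules a second call on an already removed directory still returns an error, which is the compatibility-preserving part of the decision.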

@Amanieu Amanieu closed this as completed Jul 20, 2024

6 participants