Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rm: r4_gnu_test #6621

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Conversation

AnirbanHalder654322
Copy link
Contributor

Added regex detection to various combination of ".." and "." followed by slashes .

@AnirbanHalder654322 AnirbanHalder654322 changed the title rm: rm_4_gnu_test rm: r4_gnu_test Aug 7, 2024
Copy link

github-actions bot commented Aug 7, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Aug 7, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

1 similar comment
Copy link

github-actions bot commented Aug 7, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

@just-an-engineer
Copy link
Contributor

Why not make it work on Windows as well?

@just-an-engineer
Copy link
Contributor

#6620

src/uu/rm/Cargo.toml Outdated Show resolved Hide resolved
src/uu/rm/src/rm.rs Outdated Show resolved Hide resolved
src/uu/rm/src/rm.rs Outdated Show resolved Hide resolved
tests/by-util/test_rm.rs Outdated Show resolved Hide resolved
tests/by-util/test_rm.rs Show resolved Hide resolved
Copy link
Sponsor Contributor

@sylvestre sylvestre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs some updates

@AnirbanHalder654322
Copy link
Contributor Author

needs some updates

Sorry for the lack of communication, had some assignment deadlines to fulfill. I will push out updates today.

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)

Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/rm/r-4 is no longer failing!

@AnirbanHalder654322
Copy link
Contributor Author

@sylvestre This should be ready for a review now

Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

@just-an-engineer
Copy link
Contributor

just-an-engineer commented Aug 14, 2024

Is the utf-8 crate from Cargo.toml used? I could just be missing it. If not, can you remove it?

Also, for your rm.rs, lines 584 & 585, I imagine we only want to check for /. or /.. exactly. I might be wrong here, but I can do touch hi., making a file that ends in either . or .., in which case your code would currently classify it as the parent or current dir.
Also also, on lines 586 and 587, wouldn't you want it to be format!("{}.", ...) and similar to detect that?

I do like how you remove trailing slashes, but would you be able to propagate the new cleaned_path var on line 334 in rm.rs to the rest of that function (handle_dir)? I assume that would be nice to use for the rest since you've already processed it a bit. Also, would something like str.replace('//', '/') (and the related slash for Windows) work instead? (I'm just throwing out stuff, so if I'm wrong, then feel free to say so). I assume that will collapse double slashes both at the start, end, and middle. I assume the performance would be similar, but it might be more concise. But it looks like you just take a slice of the original string, so I assume yours doesn't work on multiple slashes in the middle of a path.

Lastly, for your tests, I think you could save yourself 30ish lines and just use the conditional variable assignments for *nix/Windows (/ vs \\) and just use a bunch of format all over the place. Although maybe it may look too cluttered, so perhaps what you have here is better than that method. Idk, @sylvestre, any thoughts on that? One worry would be drift if someone updates a test but forgets to do the other, but they're smallish and close to each other.

EDIT: I realize your function is for trailing slashes, not necessarily double. You could always turn it into a more general cleaning function with the replace that I mentioned, and remove trailing, but I just misread it, my apologies. But instead of a loop at line 633, wouldn't you be able to use a built-in method for the path to get the first index and use that directly in the path slice?

@AnirbanHalder654322
Copy link
Contributor Author

Is the utf-8 crate from Cargo.toml used? I could just be missing it. If not, can you remove it?

I believe my PR does remove it.

Also, for your rm.rs, lines 584 & 585, I imagine we only want to check for /. or /.. exactly. I might be wrong here, but I can do touch hi., making a file that ends in either . or .., in which case your code would currently classify it as the parent or current dir. Also also, on lines 586 and 587, wouldn't you want it to be format!("{}.", ...) and similar to detect that?

The helper functions get called only when the stat() call yields a directory. So your file example won't really trigger the code.

There is no point in doing format!("{}.") as ends_with('.') would cover the point of the condition anyway. And we would not replace the former with the latter since format!("{}.") wouldn't be true when you do rm -rf . or rm -rf ....

Lastly, for your tests, I think you could save yourself 30ish lines and just use the conditional variable assignments for *nix/Windows (/ vs \\) and just use a bunch of format all over the place. Although maybe it may look too cluttered, so perhaps what you have here is better than that method. Idk, @sylvestre, any thoughts on that? One worry would be drift if someone updates a test but forgets to do the other, but they're smallish and close to each other.

We usually just add new tests instead of changing old tests unless it asserts wrong behavior and therefore fail, the test would be caught in the CI if people forget to change anyway.

EDIT: I realize your function is for trailing slashes, not necessarily double. You could always turn it into a more general cleaning function with the replace that I mentioned, and remove trailing, but I just misread it, my apologies. But instead of a loop at line 633, wouldn't you be able to use a built-in method for the path to get the first index and use that directly in the path slice?

We only care about cleaning the end of the string from slashes to print to stderr. There is no point cleaning the middle, doing a stat() call on something like ..//../ will refer to the directory at ../../ anyway. We are only cleaning to be compatible with gnu.

Actually I was under the idea that converting windows utf-16 OsStr to rust str type might fail that's why the I was very hesitant to use strings.

@just-an-engineer
Copy link
Contributor

just-an-engineer commented Aug 14, 2024

I believe my PR does remove it.

My apologies, I just saw it an my mind went to add. But yeah, I see you're actually removing it

The helper functions get called only when the stat() call yields a directory. So your file example won't really trigger the code.

Yeeaah, I remember that now. However, on Linux, I can also make a directory that's named test. or test... Would you be able to add some tests to make sure a valid directory that ends with . or .. don't get improperly ignored? Because if it's the case that a real directory that ends with . or .. makes it more necessary to change how you check it. Because I agree if we didn't care about that, the ends_with function from your reply:

There is no point in doing format!("{}.") as ends_with('.') would cover the point of the condition anyway. And we would not replace the former with the latter since format!("{}.") wouldn't be true when you do rm -rf . or rm -rf ....

Would work just as well. But I think we need to go through the extra hassle of ensuring that's the 'entire' name of the program. There might be other ways to do that, like see if it ends with . and is of length 1 (outside of the rest of the path), but it may be simpler to do the format and keep most of the structure of what you have

We usually just add new tests instead of changing old tests unless it asserts wrong behavior and therefore fail, the test would be caught in the CI if people forget to change anyway.

Yeah, that's a good point. It may not cover the case of someone making a test stricter and forgetting the other one, but I think it's valid enough. And like I said, the test looks clean enough, doing a bunch of formats may make it look not so pretty.

Actually I was under the idea that converting windows utf-16 OsStr to rust str type might fail that's why the I was very hesitant to use strings.

I have no clue about that, so I'll just say that sounds good if so and leave it to you and Sylvestre.

@AnirbanHalder654322
Copy link
Contributor Author

The helper functions get called only when the stat() call yields a directory. So your file example won't really trigger the code.

Yeeaah, I remember that now. However, on Linux, I can also make a directory that's named test. or test... Would you be able to add some tests to make sure a valid directory that ends with . or .. don't get improperly ignored? Because if it's the case that a real directory that ends with . or .. makes it more necessary to change how you check it. Because I agree if we didn't care about that, the ends_with function from your reply:

Good catch, i will push a fix for that.

@AnirbanHalder654322
Copy link
Contributor Author

It seems like there is some niche behavior with GNU which does seem like a bug.

rm -rf ...
rm: refusing to remove '.' or '..' directory: skipping '../..'
rm -rf ....
rm: refusing to remove '.' or '..' directory: skipping '../../..' 

It seems like gnu converts the path into ../.. , seems they are processing it as overlapping instances of .. instead of disjoint instances of .. . Basically turning the path into the parent of parent directory in the case of rm -rf ... , parsing it as .. and .. which the 2nd . is overlapping. And same for .... , it essentially referring to k = (n!) / (2!*( n-2)!) , kth parent directory above the cwd in directory tree.

Linux allows both files and directories to be named ... or .... or combinations of ...... barring .. and . which refers to current dir and parent dir respectively. It feels a bit wrong to arbitrarily err on valid file names, which can be trivially created.

// Start up calls 

getrandom("\xd5\x1d\xd5\x6f\xc8\xd3\xcd\xe7", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x5d157286c000
brk(0x5d157288d000)                     = 0x5d157288d000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=15751120, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 15751120, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7abe77800000
close(3)                                = 0
ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0755, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "../..", {st_mode=S_IFDIR|0775, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0 
// Seems like the path is converted in userspace and then a syscall is made to ``../..`` for ``rm -rf ...`` 

// User locale related syscalls 

 
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=613, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 613, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7abe78a82000
close(3)                                = 0
write(2, "rm: ", 4rm: )                     = 4
write(2, "refusing to remove '.' or '..' d"..., 58refusing to remove '.' or '..' directory: skipping '../..') = 58
write(2, "\n", 1
)                       = 1
lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(1)                           = ?
+++ exited with 1 +++

@sylvestre Need some advice on how to handle this . Do we just do the same and read the path as overlapping instances of .. ?

@just-an-engineer
Copy link
Contributor

What setup are you on? I just made files and directories with a bunch of '.'s, and couldn't replicate it. For GNU, I mean. I'm on Mint 20.1, x86

@AnirbanHalder654322
Copy link
Contributor Author

What setup are you on? I just made files and directories with a bunch of '.'s, and couldn't replicate it. For GNU, I mean. I'm on Mint 20.1, x86

Have you checked the hidden files to see if they exist ? Doing std::fs::File::create should work. Try doing ls -a after creating the file.

@just-an-engineer
Copy link
Contributor

Yeah, I just ran it, making /tmp/tests/ my current directory, and referenced the rm binary by path, and got the same behavior

@AnirbanHalder654322
Copy link
Contributor Author

Yeah, I just ran it, making /tmp/tests/ my current directory, and referenced the rm binary by path, and got the same behavior

Well, i also installed coreutils 9.4 to check, it gives the same error. I will chalk it up to something on my machine which i can't figure out. A person on the discord has also confirmed few days ago they weren't facing any issues. So its very likely something on my machine which i cannot figure out.

Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

@just-an-engineer
Copy link
Contributor

Hey, is this still something you're working on?
If so, check out my PR (#6623), and try using std::path::MAIN_SEPARATOR that Ben showed me.
If not, I'll start updating my PR

@AnirbanHalder654322
Copy link
Contributor Author

Hey, is this still something you're working on? If so, check out my PR (#6623), and try using std::path::MAIN_SEPARATOR that Ben showed me. If not, I'll start updating my PR

Ideally this should just need a review but the conversation here became too long to extract relevant context. You can start updating your pr. Feel free to pull things from here.

Copy link
Collaborator

@BenWiederhake BenWiederhake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for tackling these issues! Can you fix the formatting issues and the crash? Bonus points if you also decide to handle non-utf8-filenames correctly, but I could understand if you decide that's out of scope.

src/uu/rm/src/rm.rs Outdated Show resolved Hide resolved
src/uu/rm/src/rm.rs Outdated Show resolved Hide resolved
src/uu/rm/src/rm.rs Show resolved Hide resolved
src/uu/rm/src/rm.rs Outdated Show resolved Hide resolved
@just-an-engineer
Copy link
Contributor

Hey, is this still something you're working on? If so, check out my PR (#6623), and try using std::path::MAIN_SEPARATOR that Ben showed me. If not, I'll start updating my PR

Ideally this should just need a review but the conversation here became too long to extract relevant context. You can start updating your pr. Feel free to pull things from here.

Alright, but I may not be able to get to it for a day or two. But I do like how you clean trailing slashes, and I think your tests are more clean. I think if you fix the above things Ben mentioned, and make it agnostic to *nix/Windows, it'll be exactly what I was thinking

@AnirbanHalder654322
Copy link
Contributor Author

Hey, is this still something you're working on? If so, check out my PR (#6623), and try using std::path::MAIN_SEPARATOR that Ben showed me. If not, I'll start updating my PR

Ideally this should just need a review but the conversation here became too long to extract relevant context. You can start updating your pr. Feel free to pull things from here.

Alright, but I may not be able to get to it for a day or two. But I do like how you clean trailing slashes, and I think your tests are more clean. I think if you fix the above things Ben mentioned, and make it agnostic to *nix/Windows, it'll be exactly what I was thinking

Ok, i will work on it then.

Copy link

github-actions bot commented Sep 8, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

1 similar comment
Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link

github-actions bot commented Sep 9, 2024

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Copy link
Collaborator

@BenWiederhake BenWiederhake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR now touches --no-preserve-root, and I'm not yet convinced that the code is correct. Are you sure you want to handle this in the same PR?

src/uu/rm/src/rm.rs Outdated Show resolved Hide resolved
src/uu/rm/src/rm.rs Outdated Show resolved Hide resolved
tests/by-util/test_rm.rs Outdated Show resolved Hide resolved
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/rm/r-4 is no longer failing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants