-
-
Notifications
You must be signed in to change notification settings - Fork 317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dirwalk ignores non-regular files #1727
Conversation
287f7c2
to
a33b624
Compare
a33b624
to
2090f7d
Compare
…ned instead. That way, algorithms relying on dirwalking can still see them if they want to, but would have a hard time to use them (accidentally). Note that this replaces the `From` implementation with `entry::Kind::try_from_file_type()`, which makes this a breaking change.`
2090f7d
to
a49c960
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, and gix clean
benefits for sure.
In particular, as shown in this gist, this makes it so that when gix clean
directly sees a FIFO (named pipe) on a Unix-like system, it disregards it as git clean
does, rather than deleting it.
Before this change:
ek in 🌐 catenary in gitoxide on main is 📦 v0.39.0 via 🦀 v1.83.0
❯ cargo run --bin=gix -- clean -n
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.77s
Running `target/debug/gix clean -n`
WOULD remove pipe
Skipped 1 expendable entry - show with -x
After this change:
ek in 🌐 catenary in gitoxide on main is 📦 v0.39.0 via 🦀 v1.83.0 took 14s
❯ cargo run --bin=gix -- clean -n
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.73s
Running `target/debug/gix clean -n`
Nothing to clean (Skipped 1 expendable entry - show with -x)
There are some things relevant to #1629 and conceptually relevant to this PR and its goals that are not covered here, but that may be out of scope here. Some may even be outside the broader originally intended scope of #1629 itself. (In the following, I always mean the same thing by "FIFO" and "named pipe," leaning toward using the latter when talking about Windows only because it is the term used in the Microsoft documentation.)
-
While the bug this fixes is not specific to the way
cargo
usesgix
, the behavior ofcargo publish
was the stated motivation for #1629, as well as possibly being the main motivation for this fix, going by how #1695 classifies it. But to fix the behavior ofcargo publish
in the presence of FIFOs, it may be necessary forcargo
to make more changes than just using a newergix-dir
:- It is possible to publish a crate from a directory that is not a Git repository, and when that is done, the bug still occurs, because the separate directory walking logic
cargo
implements also opens non-regular files including FIFOs. - In general,
cargo publish
seems to attempt to archive non-regular files, other than directories, as regular files, by opening them for read. This includes even symlinks, which it seems to dereference. (This also includes symlinks that resolve to non-regular files including FIFOs--when I do a dry-run publish of a crate that has a symlink to a FIFO, it blocks reading from the FIFO.)
- It is possible to publish a crate from a directory that is not a Git repository, and when that is done, the bug still occurs, because the separate directory walking logic
-
Both Git and gitoxide attempt to read from
.gitignore
even when it is a FIFO, in at least some situations. I'm not totally sure what the preferred behavior is for that, since it's a very strange situation and maybe there are use cases for.gitignore
files to be kinds of filesystem entries thatgit
cannot track. But it seems to me that blocking indefinitely should be avoided even here, and that this is a bug in Git and gitoxide.git status
andgit clean -n
block in a repository where.gitignore
is a FIFO.gix status
andgix clean -n
blocked in such a repository prior to the fix in this PR. Now, when there are no untracked files,gix status
no longer blocks (which surprised me), butgix clean -n
still does. But when there is a new unstaged regular file in addition to the FIFO, bothgix status
andgix clean -n
still block. Both before and after this PR,when aI have to press Ctrl+C only once withgix
command blocks, it blocks twice, i.e., it tries to open the.gitignore
FIFO for read twice instead of just once, while in contrastgit
commands block just once. Or, to state it more carefully,git
but twice withgix
,which I think, but do not know for sure, is due to the entry being opened again. -
The tests here are rightly skipped on Windows. The reason this is the right thing to do, at least as they are currently written, is that they are all testing with FIFOs, and they use a fixture whose creation script uses the
mkfifo
command.Windows does have named pipes, and a
mkfifo
command does exist in Git Bash and other MSYS2 environments. But the MSYS2mkfifo
command does not actually create named pipes! Instead, it creates.lnk
shortcut files that MSYS2 programs--programs that link tomsys-2.0.dll
and use it for POSIX system call emulation--treat as named pipes. I think Cygwin programs--programs that usecygwin1.dll
--can also use them. Native Windows programs just see them.lnk
shortcut files--which, at least typically, cannot be dereferenced and, in any case, are not named pipes.Although Git for Windows ships a MSYS2 environment, it is a native Windows program. More precisely:
- Although
git
somewhat misleadingly places its Windows-specific implementations of various functions that are natively available on Unix-like systems in amingw.c
file, this is really an implementation detail of a native Windows program. - All directly Git-related binaries shipped in Git for Windows--the
git.exe
wrapper incmd/
as well as the more importantgit.exe
in a directory usually calledmingw64/bin/
ormingw32/bin/
that it wraps, as well as all binaries inside thegit-core
directory whose locationgit --exec-path
gives--are native Windows programs rather than MSYS2 programs. - Interpreters such as
bash
andperl
are MSYS2 programs, however, so parts of Git that are implemented as scripts for them are exceptions in that they are effectively non-native. (Likewise, scripts provided by the user, such as hooks, if they are shell scripts, will also run in interpreters that perform POSIX functions usingmsys-2.0.dll
.)
The
git
command on Windows therefore--and I very much believe intentionally and rightly--gives no special semantics to any.lnk
files, even if they happen to be treated specially by MSYS2. If one is to create named pipes on Windows for tests, it cannot be done withmkfifo
in Git Bash or another MSYS2 or (as far as I know) any other Cygwin-like environment. - Although
-
At least with respect to FIFOs (named pipes), it is possible that the issue this is fixing should be considered not to affect Windows. My view is that probably we should consider Windows as affected and eventually try to test for the much weirder and harder to produce situation with named pipes that appear in a repository on Windows. But this is not obvious, and in order to decide, it may be necessary to consider this alongside other kinds of unusual filesystem entries on Windows, some of which are entirely Windows-specific.
Unlike on Unix-like systems where they can be created just about anywhere (which the MSYS2
mkfifo
situation with specially treated.lnk
is meant to emulate), on Windows named pipes are only allowed to exist in a named pipe filesystem. On Windows, named pipes live under\\.\pipe\
or, more broadly, under paths of the form\ServerName\pipe\
. They are not directly present elsewhere.Such paths resist being exposed in a way that allows a Git repository to include them. For example, a directory symlink can point to
\\.\pipe\
and a file symlink can point to any particular named pipe, and these symlinks are valid and dereferenceable. But symlinks in repository working trees--regardless of whether named pipes are involved and regardless of whether the symlinks point outside the volume that holds them--do not cause their target entries to be part of a Git repository.When an NTFS junction (see #1354 (comment)), which can usually be viewed as similar to a recursive bind mount on Unix-like systems that support them (
mount --rbind
), is present in a working tree, Git does deference it and regard the contents of the junctioned-to location as part of the repository. However, junctions do not work across filesystems; one cannot make a junction from an ordinary volume to a path like\\.\pipe\
(nor to an individual pipe, though that is also because they can only point to directory-like entries).It is nonetheless possible to make this elusive scenario occur, so named pipes (not named in a way that makes them special to git--this is independent of the
.gitignore
behavior described above) are seen as part of a repository, and sogit
commands even try to read form them and block. The way I have found to make this happen is to place a junction in a repository working tree that points to a directory symlink on the same volume, where the directory symlink itself points to\\.\pipe\
. I would guess that there are other ways.By the way, if performing this experiment, it is probably best to use a virtual machine that is not being used for anything else, since attempts are made to read from a bunch of named pipes on the system--anything under
\\.\pipe\
may be read from!--including some where weird things happen. However, this is an area where I did not follow my own advice. The main effect I noticed, and reproduced, was that all consoles that were started before the experiment failed to be recognized as consoles by native Windows programs afterwards that check device type--for example, I could not runvim
except in a new console--even though they seemed otherwise to continue working.One reason I mention the risk is to clarify what I mean when I say the situation is hard to produce. The experiment, as I performed it, is not too hard to set up, having figured out how. But it might be tricky to find an approach that wouldn't risk subtle havoc when run in automated tests.
(I found that the behavior of MSYS2 programs when real named pipes are exposed to them using this technique is also very interesting, but I omit it here for, uh, brevity?)
There are a few more gix clean
related things I plan to try. I'll open issues for any bugs I do find (if any), and also probably for the situation when .gitignore
is a FIFO.
Edit: I've opened #1729 about a new gix clean
panic.
Thanks so much for sharing!
That's true! There are non-git codepaths and probably they would fall for named pipes, even though I didn't validate it. Interestingly I am not aware of an existing issue in Cargo even, so it seems to be rare.
What happens is that
That was a very interesting read, even though I kept thinking that Windows seems to be quite broken if sideeffects like these are even possible. My feeling is that something in the console might have started reading, and blocking on, the named pipe, and now isn't able to respond to events anymore that help classify it as console. It's a denial of service for the console in question, probably a primtiive for further exploitation.
😁
Wonderful - this actually tells me that the natural behaviour of ignoring these named pipe entries that I just accepted for my convenience isn't actually correct, and that they should be officially pruned after all. Then all other tools should handle them correctly as well. |
Supersedes #1629.
Let's try to keep the change minimal, but consider what's in the superseded PR.
Tasks