-
-
Notifications
You must be signed in to change notification settings - Fork 824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] 🐌 fd can be much slower than GNU find in some cases #1614
Comments
It's not too surprising that $ fd . --max-depth=1 --type=executable --format '{/}' |
@tavianator thanks! It's still slower, but the gap is smaller $ hyperfine -w2 -n "BSD find" "FIND_PROG=find FIND_ARGS='-maxdepth 1 -perm +111 -type f,l -exec basename {} ;' fzf-menu --list-programs" \
-n "GNU find" "FIND_PROG=gfind FIND_ARGS='-maxdepth 1 -executable -printf %f\n' fzf-menu --list-programs" \
-n fd "FIND_PROG='fd .' FIND_ARGS='--max-depth=1 --type=executable --format \"{\}\" ' fzf-menu --list-programs"
Benchmark 1: BSD find
Time (mean ± σ): 5.074 s ± 0.077 s [User: 0.806 s, System: 2.332 s]
Range (min … max): 4.988 s … 5.203 s 10 runs
Benchmark 2: GNU find
Time (mean ± σ): 138.9 ms ± 4.6 ms [User: 32.0 ms, System: 69.9 ms]
Range (min … max): 130.2 ms … 151.2 ms 21 runs
Benchmark 3: fd
Time (mean ± σ): 249.2 ms ± 8.4 ms [User: 77.3 ms, System: 108.8 ms]
Range (min … max): 235.4 ms … 262.9 ms 11 runs
Summary
GNU find ran
1.79 ± 0.08 times faster than fd
36.52 ± 1.32 times faster than BSD find |
I see |
Thanks! I had $ hyperfine -w2 -n "BSD find" "FIND_PROG=find FIND_ARGS='-maxdepth 1 -perm +111 -type f,l -exec basename {} ;' fzf-menu --list-programs" \
-n "GNU find" "FIND_PROG=gfind FIND_ARGS='-maxdepth 1 -executable -type f,l -printf %f\n' fzf-menu --list-programs" \
-n fd "FIND_PROG='fd .' FIND_ARGS='--max-depth=1 --type=executable --format \"{/}\" ' fzf-menu --list-programs"
Benchmark 1: BSD find
Time (mean ± σ): 5.236 s ± 0.251 s [User: 0.848 s, System: 2.431 s]
Range (min … max): 4.899 s … 5.689 s 10 runs
Benchmark 2: GNU find
Time (mean ± σ): 156.6 ms ± 5.2 ms [User: 35.3 ms, System: 79.0 ms]
Range (min … max): 147.7 ms … 171.2 ms 18 runs
Benchmark 3: fd
Time (mean ± σ): 259.8 ms ± 7.9 ms [User: 82.7 ms, System: 111.5 ms]
Range (min … max): 248.4 ms … 276.1 ms 11 runs
Summary
GNU find ran
1.66 ± 0.07 times faster than fd
33.43 ± 1.95 times faster than BSD find |
@tavianator are you able to reproduce sharkdp/fd-benchmarks#5 ? |
My results look like this: $ hyperfine -w2 \
'fd . $(tr ":" " " <<<"$PATH") --follow --max-depth=1 --type=executable --format "{/}"' \
{find,bfs}' -L $(tr ":" " " <<<"$PATH") -maxdepth 1 -type f -executable -printf "%f\n"'
Benchmark 1: fd . $(tr ":" " " <<<"$PATH") --follow --max-depth=1 --type=executable --format "{/}"
Time (mean ± σ): 18.3 ms ± 1.1 ms [User: 29.1 ms, System: 39.0 ms]
Range (min … max): 16.1 ms … 20.9 ms 163 runs
Benchmark 2: find -L $(tr ":" " " <<<"$PATH") -maxdepth 1 -type f -executable -printf "%f\n"
Time (mean ± σ): 20.3 ms ± 1.9 ms [User: 3.6 ms, System: 16.7 ms]
Range (min … max): 18.9 ms … 36.0 ms 138 runs
Benchmark 3: bfs -L $(tr ":" " " <<<"$PATH") -maxdepth 1 -type f -executable -printf "%f\n"
Time (mean ± σ): 24.7 ms ± 0.4 ms [User: 10.1 ms, System: 89.5 ms]
Range (min … max): 23.7 ms … 25.9 ms 116 runs
Summary
fd . $(tr ":" " " <<<"$PATH") --follow --max-depth=1 --type=executable --format "{/}" ran
1.11 ± 0.12 times faster than find -L $(tr ":" " " <<<"$PATH") -maxdepth 1 -type f -executable -printf "%f\n"
1.35 ± 0.08 times faster than bfs -L $(tr ":" " " <<<"$PATH") -maxdepth 1 -type f -executable -printf "%f\n" What does a similar benchmark look like for you? It's possible that |
I had a short look at this and could reproduce your results (I guess it depends a lot on the number of directories in PATH). I'm currently away from a machine to test this but the way you use fd here is probably the worst case. You're basically just listing (executable) files in a single directorywithout traversing anything. These two facts together essentially mean that we can't make use of parallelism. In this case, it's no surprise that fd is slower. This is a case we haven't optimized for. You could try with --threads=1. Maybe that makes fd a bit faster, but I wouldn't be surprised if it's still slower than find. But what I would rather do (if you want to optimize the whole thing) is to change that shell function. It shouldn't spawn the find program N times, once for each directory in PATH. Instead, it should pass all N directories to a single instance of the find program. Edit: I saw only now that this is exactly what @tavianator did in his benchmark. |
Indeed, I just tried @vegerot's tavianator@tachyon $ hyperfine -N -w2 "fd -1" "find -quit"
Benchmark 1: fd -1
Time (mean ± σ): 5.4 ms ± 0.4 ms [User: 2.0 ms, System: 8.9 ms]
Range (min … max): 4.8 ms … 7.3 ms 457 runs
Benchmark 2: find -quit
Time (mean ± σ): 636.9 µs ± 69.3 µs [User: 484.7 µs, System: 78.8 µs]
Range (min … max): 563.1 µs … 1473.5 µs 3548 runs
Summary
find -quit ran
8.47 ± 1.11 times faster than fd -1 5 ms doesn't usually matter but when you run If I fix local paths=$(tr ':' '\n' <<<"$PATH" | awk '!x[$0]++')
$find_prog $paths $find_args 2>/dev/null \
| awk '!x[$0]++' I get these results: Benchmark 1: FIND_PROG='find -L' FIND_ARGS='-maxdepth 1 -executable -type f,l -printf %f\n' get_programs_in_path
Time (mean ± σ): 22.4 ms ± 0.8 ms [User: 8.1 ms, System: 19.9 ms]
Range (min … max): 21.3 ms … 26.8 ms 125 runs
Benchmark 2: FIND_PROG='fd .' FIND_ARGS='--hidden --max-depth=1 --type=executable --follow --format {/}' get_programs_in_path
Time (mean ± σ): 20.5 ms ± 1.1 ms [User: 34.3 ms, System: 43.7 ms]
Range (min … max): 18.5 ms … 23.3 ms 144 runs
Summary
FIND_PROG='fd .' FIND_ARGS='--hidden --max-depth=1 --type=executable --follow --format {/}' get_programs_in_path ran
1.09 ± 0.07 times faster than FIND_PROG='find -L' FIND_ARGS='-maxdepth 1 -executable -type f,l -printf %f\n' get_programs_in_path See #1203, #1362, #1408, #1412, #1414, #1422, #1431 for some previous discussion and optimization of startup time. |
A quick analysis with |
Checks
Describe the bug you encountered:
(I'm not sure if we count performance issues as a bug or not)
See sharkdp/fd-benchmarks#5 for repro
I am building the world's fastest application launcher for GNU+Linux. I thought
fd
would be a good choice for finding applications, but after benchmarking I found that GNU find can be 9 times faster than fd. Am I usingfd
wrong, or in this case is GNU find faster?Describe what you expected to happen:
Reproduction steps:
Run
./warm-cache-exe-paths.sh
in sharkdp/fd-benchmarks#51. download https://github.com/vegerot/dotfiles/blob/a9a50230d808572173d3eeec057739d0fe8d4470/bin/fzf-menu2. Run(note: this is on macOS. I reprod on GNU+Linux as well but didn't measure BSD find)
Expected:
fd
to be the fastest.Actual:
What version of
fd
are you using?fd 10.2.0
Which operating system / distribution are you on?
Is there a way I can use
fd
that will make it faster than GNU find, or for my particular application is GNU find just better?Please see sharkdp/fd-benchmarks#5 for repro
The text was updated successfully, but these errors were encountered: