Chained usage fd ... -X fd ... [-X] -- examples welcome in README? deeper support for chaining? #1450

Closed
1 of 2 tasks
mcint opened this issue Dec 12, 2023 · 4 comments
mcint commented Dec 12, 2023

Using fd version: fd 8.7.0

I find myself using fd in a chained manner.

Chained-style

For example, for quickly viewing python packages, today I find myself searching for
fd brotab -td ~/.local/lib/python3.10/ -X fd api -e py, since api is a common file and package name-component, and I'm just looking for the one today. I find myself wanting to run commands on the result, or chain a third (or more) times. I'm looking to document that pattern for other users of fd.

@mcint mcint added the question label Dec 12, 2023
@mcint
Author

mcint commented Dec 12, 2023

I've rewritten this chained query in an extendable way (though it costs more keystrokes):
<<< ~/.local/lib/python3.10/ xargs fd brotab -td | xargs fd api -e py

Would a word-match flag PR be welcome? Something like grep's -w/--word-regexp. I understand that ^[pattern]$ can match a full component, but I would like a syntax that can be added on to a query. For my small example (which doesn't strongly justify this request):

  • the -w equivalent of start & end regex anchors:
$ <<< ~/.local/lib/python3.10/ xargs fd ^brotab$ -td -1
/home/mcint/.local/lib/python3.10/site-packages/brotab/

Where other searches, with early termination, -1, don't always yield what I want:

$ <<< ~/.local/lib/python3.10/ xargs fd brotab -td -1
/home/mcint/.local/lib/python3.10/site-packages/brotab-1.4.2.dist-info/

from this small set:

$ <<< ~/.local/lib/python3.10/ xargs fd brotab -td
/home/mcint/.local/lib/python3.10/site-packages/brotab/
/home/mcint/.local/lib/python3.10/site-packages/brotab-1.4.2.dist-info/

@tavianator
Collaborator

> For example, for quickly viewing python packages, today I find myself searching for
> fd brotab -td ~/.local/lib/python3.10/ -X fd api -e py, since api is a common file and package name-component, and I'm just looking for the one today.

fd ... -X fd ... is not something that should be recommended. The main problem is it can drastically explode the result set:

$ fd foo
foo
foo/foo
foo/foo/foo
$ fd foo -X echo fd bar # To see what would be executed
fd bar ./foo ./foo/foo ./foo/foo/foo
$ fd foo -X fd bar # What would actually happen
./foo/bar
./foo/foo/bar
./foo/foo/foo/bar
./foo/foo/bar
./foo/foo/foo/bar
./foo/foo/foo/bar

You could pass --prune to the first fd to avoid this, but still I don't think we should recommend this pattern at all.
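
To make the blow-up concrete, here is a minimal sketch that rebuilds the foo/foo/foo layout in a scratch directory. It uses find as a stand-in for the child fd (an assumption, so that it runs without fd installed) and counts the results when all three nested matches are passed as search roots, versus only the outermost match, which is what a parent run with --prune would hand over.

```shell
# Recreate the nested layout from the example above in a scratch directory.
demo=$(mktemp -d)
mkdir -p "$demo/foo/foo/foo"
touch "$demo/foo/bar" "$demo/foo/foo/bar" "$demo/foo/foo/foo/bar"

# Unpruned: the child receives all three nested matches as search roots,
# so each file is reported once per enclosing root.
unpruned=$(( $(cd "$demo" && find ./foo ./foo/foo ./foo/foo/foo -name bar | wc -l) ))

# Pruned: only the outermost match is passed on, so each file appears once.
pruned=$(( $(cd "$demo" && find ./foo -name bar | wc -l) ))

echo "unpruned=$unpruned pruned=$pruned"

# With fd installed, the corresponding commands from this thread are:
#   fd foo -X fd bar
#   fd --prune foo -X fd bar
```

The duplication grows with nesting depth: a file under n nested matches is reported n times.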

For your case, it's probably best to do all the filtering in the same fd command:

$ fd --full-path 'brotab.*api.*\.py$' ~/.local/lib/python3.10/

> Would a word-match flag PR be welcome? Something like grep's -w/--word-regexp. I understand that ^[pattern]$ can match a full component, but I would like a syntax that can be added on to a query.

You can write word boundaries in the regex like this:

$ fd '\bpattern\b'

I don't think we'd add a flag to do this for you, fd already has too many flags :)
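
One caveat for the brotab case: a word boundary also sits between "brotab" and "-", so a word-boundary match does not separate brotab from brotab-1.4.2.dist-info, while anchoring does. The sketch below shows this with grep as a stand-in (grep -w for word matching, grep -x for a whole-name match). It assumes fd's regex engine treats \b in the usual way, with "-" as a non-word character.

```shell
# The two directory names from the example earlier in this thread.
names='brotab
brotab-1.4.2.dist-info'

# Word-style match: "-" is not a word character, so it counts as a
# boundary and both names match.
word_hits=$(( $(printf '%s\n' "$names" | grep -c -w 'brotab') ))

# Whole-name match, the analogue of ^brotab$: only the exact name matches.
exact_hits=$(( $(printf '%s\n' "$names" | grep -c -x 'brotab') ))

echo "word=$word_hits exact=$exact_hits"
```

So for picking out the exact package directory, the ^brotab$ form is the one that narrows the result to a single entry.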

@mcint
Author

mcint commented Dec 12, 2023

Hm, thank you, interesting suggestions.

I will consider --prune in my workflows, might try -P for that locally, and PR. Thank you!

It looks like, in practice, I can use -g/--glob, #692 (in place of my -w suggestion, #1450 (comment)).

Sounds like no objections to submitting other use examples for the readme or docs, might PR later.


Extraneous thinking aloud, about chaining queries

I've chewed on variations where I can keep appending [pattern] or [depth] [pattern] for a while.

To build the motivation a bit more, I query things like this:

  • fd -d3 -td [pkg] /
    • | xargs fd -d3 -td [lib]
      • | xargs fd -d3 -tf . -e ini

Compressed to: fd-chain -d3 [pkg] / -- -d3 [lib] -- -d3 -e ini .
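
A rough sketch of what such a helper could look like as a POSIX shell function. fd_chain is hypothetical and not part of fd. To keep the sketch short, each stage is passed as one quoted string rather than separated by --, and each stage's results become the search paths of the next stage through xargs, as in the pipelines above.

```shell
# fd_chain: hypothetical helper, not part of fd. Each argument is one stage,
# written as a quoted string of fd arguments. The results of a stage become
# the search paths of the next one, as in `fd ... | xargs fd ...`.
fd_chain() {
  results=
  stage_no=0
  for stage in "$@"; do
    stage_no=$((stage_no + 1))
    if [ "$stage_no" -eq 1 ]; then
      # $stage is deliberately unquoted so it splits into separate arguments.
      results=$(fd $stage)
    else
      # Stop if the previous stage found nothing; otherwise fd would fall
      # back to searching the current directory.
      [ -n "$results" ] || return 1
      results=$(printf '%s\n' "$results" | xargs fd $stage)
    fi
  done
  [ -n "$results" ] && printf '%s\n' "$results"
}

# Usage, with PKG and LIB as placeholder patterns:
#   fd_chain "-d3 -td PKG /" "-d3 -td LIB" "-d3 -tf . -e ini"
```

Limitations of this sketch: paths containing whitespace break the xargs hand-off, unquoted stage strings are subject to shell globbing, and nested matches are duplicated unless the earlier stages use --prune.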

Here are some real snippets of recent history, or for tasks I perform commonly:

fd -d4 ^php / -td | grep -ve -
fd -d4 ^php / -td | grep -ve - | xargs fd ini
fd -d4 ^php / -td | grep -ve - | xargs fd fpm
sudo apt install fzf
fd completion / -d4
fd completion / -d4 -X fd fzf -d4
fd completion / -d4 -td -X fd fzf -d4
fd fzf / -d4 -td -X fd completion -d4
. /usr/share/doc/fzf/examples/completion.bash
less /usr/share/doc/fzf/examples/completion.bash
. /usr/share/doc/fzf/examples/key-bindings.bash

Although, these examples each only use 2 steps.

Nit about full-path matching

> fd ... -X fd ... is not something that should be recommended. The main problem is it can drastically explode the result set:

Thank you for a considered response. I agree that blindly performing nested queries might blow up the traversal, the time required, and the result size. However, I must insist that full-path matching seems ill-advised: file systems have a really high branching factor, and searching them quickly and effortlessly (few keystrokes, forgiving argument order, concatenative/append-only use supported) is what makes fd such a delight to use. Full path matching makes this searching much more expensive. For argument's sake, model the number of files as exponential in depth: about 10^D files within D levels of the tree. I've used fd on systems where -d4 returns in acceptable time and -d5 takes a full minute or more. Chaining queries is quite useful for limiting the haystack size.
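
As a quick worked version of that model (assuming, purely for illustration, that every directory level multiplies the number of entries by 10), the total number of entries visited up to depth D is 10 + 100 + ... + 10^D:

```shell
# Assumption from the model above: depth d alone holds 10^d entries.
# entries_up_to D prints the total number of entries in levels 1..D.
entries_up_to() {
  depth=$1
  total=0
  level=1
  d=1
  while [ "$d" -le "$depth" ]; do
    level=$((level * 10))
    total=$((total + level))
    d=$((d + 1))
  done
  echo "$total"
}

echo "-d4 visits about $(entries_up_to 4) entries"
echo "-d5 visits about $(entries_up_to 5) entries"
```

Under this model each extra level costs roughly ten times more, which is in line with -d4 feeling acceptable while -d5 takes a minute.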

From painful experience, I can report that searching chained from partial matches helps a lot on low-resource systems.

Nested matching names are not entirely contrived, but re-querying with a more limited depth, or using glob matching, is what I'll try next.

Fiddling with the shell cursor to modify queries is also frustrating in practice.

Thank you for your work maintaining -- answering random usage questions, and considering design space around the tool!

@mcint mcint closed this as completed Dec 13, 2023
@tavianator
Collaborator

> Nit about full-path matching
>
> > fd ... -X fd ... is not something that should be recommended. The main problem is it can drastically explode the result set:
>
> Thank you for a considered response. I agree that blindly performing nested queries might blow up the traversal, the time required, and the result size. However, I must insist that full-path matching seems ill-advised: file systems have a really high branching factor, and searching them quickly and effortlessly (few keystrokes, forgiving argument order, concatenative/append-only use supported) is what makes fd such a delight to use.

One thing that may help concatenative use is --search-path and --and, e.g.

$ fd --full-path --search-path ~/.local/lib/python3.10/ /brotab/ --and api -e py

> Full path matching makes this searching much more expensive.

Does it? I see how it could, but I expect I/O and syscall overhead to dominate pattern matching. Let's check:

tavianator@tachyon $ hyperfine -w2 "fd -u brotab ~" "fd -u --full-path brotab ~"
Benchmark 1: fd -u brotab ~
  Time (mean ± σ):      1.151 s ±  0.014 s    [User: 18.505 s, System: 33.398 s]
  Range (min … max):    1.134 s …  1.180 s    10 runs
 
Benchmark 2: fd -u --full-path brotab ~
  Time (mean ± σ):      1.151 s ±  0.008 s    [User: 20.426 s, System: 32.466 s]
  Range (min … max):    1.142 s …  1.164 s    10 runs
 
Summary
  fd -u --full-path brotab ~ ran
    1.00 ± 0.01 times faster than fd -u brotab ~

And here's a more representative benchmark for your use case. I changed it up because I don't have any copies of brotab lying around.

tavianator@tachyon $ hyperfine "fd -u --search-path ~ --full-path /requests/ --and api -e py" "fd -u -td --prune --search-path ~ requests -X fd -u api -e py"
Benchmark 1: fd -u --search-path ~ --full-path /requests/ --and api -e py
  Time (mean ± σ):      1.126 s ±  0.014 s    [User: 14.427 s, System: 37.160 s]
  Range (min … max):    1.110 s …  1.149 s    10 runs
 
Benchmark 2: fd -u -td --prune --search-path ~ requests -X fd -u api -e py
  Time (mean ± σ):      1.156 s ±  0.012 s    [User: 16.962 s, System: 35.575 s]
  Range (min … max):    1.139 s …  1.181 s    10 runs
 
Summary
  fd -u --search-path ~ --full-path /requests/ --and api -e py ran
    1.03 ± 0.02 times faster than fd -u -td --prune --search-path ~ requests -X fd -u api -e py

Both queries return the same set of 110 files.

> For argument's sake, model the number of files as exponential in depth: about 10^D files within D levels of the tree. I've used fd on systems where -d4 returns in acceptable time and -d5 takes a full minute or more. Chaining queries is quite useful for limiting the haystack size.

First off, you may be interested in #28 and possibly https://github.com/tavianator/bfs :)

Secondly, the total work is roughly the same for both approaches anyway. With one fd command, it has to explore the whole tree. With --prune ... -X fd ..., the parent fd explores the whole tree except under the brotab directories, and the child fd(s) explore just the brotab subtrees. In both cases, each path is examined by exactly one fd process. You just have more total processes with -X fd.

(Without --prune, -X fd does a lot more total work, because the parent fd is also searching the brotab trees along with the children.)
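
The accounting above can be checked on a toy tree. This sketch uses find as a stand-in for fd (an assumption, so it runs without fd installed) and counts visited entries rather than measuring time. The directory names are made up for illustration.

```shell
# Toy tree: root/a/pkg/{x,y} and root/b/z. "pkg" plays the role of brotab.
tmp=$(mktemp -d)
root="$tmp/root"
mkdir -p "$root/a/pkg" "$root/b"
touch "$root/a/pkg/x" "$root/a/pkg/y" "$root/b/z"

# Normalise wc output to a bare integer.
count() { echo $(( $(wc -l) )); }

# One traversal of the whole tree.
single=$(find "$root" | count)

# Parent that does not descend into directories named pkg (like --prune)...
parent_pruned=$(find "$root" -name pkg -prune -print -o -print | count)
# ...plus a child that walks only what lies below the match.
child=$(find "$root/a/pkg" ! -path "$root/a/pkg" | count)

# Without pruning, the parent also walks the subtree the child will walk.
parent_unpruned=$single

echo "single=$single pruned_chain=$((parent_pruned + child)) unpruned_chain=$((parent_unpruned + child))"
```

In this model the pruned chain visits the same number of entries as the single traversal, and the unpruned chain visits the matched subtree twice.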

> From painful experience, I can report that searching chained from partial matches helps a lot on low-resource systems.

I'm kind of surprised that -X fd chaining would ever be beneficial without --prune. I believe you, I'm just struggling to think of why that would happen.

> Fiddling with the shell cursor to modify queries is also frustrating in practice.

True. One handy thing is that most shells support Emacs-style keybindings for line editing, e.g. C-a (Ctrl+A) for beginning-of-line, C-e for end-of-line, M-b (Alt+B) to jump back a word, M-f to jump forward a word, etc. Often Ctrl+Left/Right will work too. You can also switch to vi-style keybindings with set -o vi.

> Thank you for your work maintaining -- answering random usage questions, and considering design space around the tool!

You're welcome! :)
