How to **whitelist** specific file extensions for listing with '--files' #333

Closed
n00bmind opened this issue Jan 18, 2017 · 33 comments
Labels
question An issue that is lacking clarity on one or more points.

Comments

@n00bmind

n00bmind commented Jan 18, 2017

Hi.
I'm using ripgrep primarily inside vim/ctrlp.
I'm currently working on a very big repository (tens of thousands of files) with a, let's say "relaxed" policy about what should be in version control and what should not, meaning I have to filter out a lot of binaries, build byproducts and whatnot.
I'm of course using a .ignore file at the root, which helps with speeding up the listings, but I need an additional layer of filtering to weed out so much "noise", so I'm trying to find an adequate syntax to use for the 'ctrlp_user_command' variable in vim which would whitelist just what I'm interested in (only certain types of source files).
So far, I've obtained the best results by using the 'type-add' switch using the curly braces notation to include several file extensions, like this:
rg . --files --type-add "source:*.{h,cs,c}" -tsource

This almost works: most of the files listed have the extensions I specified, but there are also many unwanted files, like these two for instance:
tools\webscarab\src\org\owasp\webscarab\plugin\scripted\script.bsh
tools\win32\python27\Lib\site-packages\data\themes\tools\icons48.code.tga

For the first one I have no explanation. For the second one, it looks as if the '.code' part of the name was matching my '.c'. Wild speculation, of course.

Any ideas?
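
For reference, here is what a plain shell-style reading of that glob would predict for the two paths above. This is only a sketch: it is not ripgrep's matcher, `expand_braces` and `matches` are hypothetical helpers built on Python's `fnmatch`, and the assumption that the pattern is tested against the file name only is mine.

```python
import re
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand {a,b,c} groups into separate patterns (no nested braces)."""
    m = re.search(r"\{([^{}]*)\}", pattern)
    if m is None:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    expanded = []
    for alt in m.group(1).split(","):
        expanded.extend(expand_braces(head + alt + tail))
    return expanded


def matches(pattern, path):
    """Return True if the file name of `path` matches any expanded pattern."""
    # Assumption: only the last path component is compared with the glob.
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    return any(fnmatchcase(name, p) for p in expand_braces(pattern))


paths = [
    r"tools\webscarab\src\org\owasp\webscarab\plugin\scripted\script.bsh",
    r"tools\win32\python27\Lib\site-packages\data\themes\tools\icons48.code.tga",
    "src/main.c",  # hypothetical path that should be whitelisted
]
for p in paths:
    print(matches("*.{h,cs,c}", p), p)
```

Under this reading neither `script.bsh` nor `icons48.code.tga` matches, so a '.code' vs '.c' collision would not be explained by the glob alone.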

@BurntSushi
Owner

What happens if you try:

rg . --files --type-add "source:*.{h,cs,c}" -tsource --debug

and

rg . --files -g '*.{h,cs,c}' --debug

@BurntSushi
Owner

Note that the --debug flag might cause a lot of output to stderr. If you can stick that in a gist or a pastebin, that would be great.

BurntSushi added the question label on Jan 18, 2017
@n00bmind
Author

I don't feel too comfortable sharing details about this particular project, so I'll try to find a suitable example and provide the info you requested..

@n00bmind
Author

n00bmind commented Jan 18, 2017

Ok, I'm using the Android SDK for this..
https://gist.github.com/chopsueysensei/bd0d7908f8d1b628cbb47b33a9b551b5

(I hope you can see all of it, it's pretty long)
I can see that it whitelists several things like images, jars and other things.
I'd say that paths that include a '.' somewhere are causing some trouble..

@BurntSushi
Owner

@chopsueysensei Could you please include the command you ran? Could you also tell me how to clone the repo you're searching?

@BurntSushi
Owner

I need enough information to reproduce the problem.

@n00bmind
Author

n00bmind commented Feb 22, 2017

The commands are the ones you asked me to run.
"rg_g_debug.txt" contains all the output from the command rg . --files -g '*.{h,cs,c}' --debug, while "rg_t_debug.txt" contains all the output from rg . --files --type-add "source:*.{h,cs,c}" -tsource --debug.

The tree is just my current Android SDK folder.

@BurntSushi
Owner

> The tree is just my current Android SDK folder.

Could you please tell me how to get it?

@n00bmind
Author

Just install the Android SDK anywhere on your hard drive. Maybe also open "SDK Manager.exe", located in the root folder, and download a couple of platform versions / optional components to add some more content to it.

@BurntSushi
Owner

@chocolateboy I'm not on Windows. I've never used the Android SDK before. Can you please link me to some instructions on how to acquire it? I need to be able to reproduce your problem.

@n00bmind
Author

https://developer.android.com/studio/index.html#downloads

However, if you're not on Windows, you probably won't get the same output, right?

@n00bmind
Author

Download the one under 'get just the command line tools'.
It should be a matter of unzipping then running.

@cheater

cheater commented Jan 26, 2018

The Android SDK is a red herring. You should be able to just create a file called foo.code.tga which is, e.g., an ASCII file with C inside it (for example), and you should be able to instruct rg not to find it.
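
A self-contained reproduction along these lines might look like the following sketch. Only `foo.code.tga` and the `--files` / `-g` flags come from this thread; the other file names, the file contents, and the helpers `make_repro_tree` and `list_with_rg` are assumptions made for illustration. No expected rg output is claimed here, since that is exactly what the reproduction is meant to find out.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def make_repro_tree(root):
    """Create a tiny tree with one wanted file and two unwanted ones."""
    c_source = "int main(void) { return 0; }\n"
    files = {
        "main.c": c_source,        # should be listed by the whitelist
        "foo.code.tga": c_source,  # C source hiding behind a .tga name
        "script.bsh": 'print("hello");\n',
    }
    for name, body in files.items():
        (Path(root) / name).write_text(body)
    return sorted(files)


def list_with_rg(root):
    """Run the glob whitelist from this thread; None if rg is not installed."""
    rg = shutil.which("rg")
    if rg is None:
        return None
    result = subprocess.run(
        [rg, "--files", "-g", "*.{h,cs,c}", "."],
        cwd=root, capture_output=True, text=True, check=False,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("created:", make_repro_tree(tmp))
        print("rg listed:", list_with_rg(tmp))
```

If `foo.code.tga` or `script.bsh` shows up in the listing, the printed output plus the rg version would be a small enough example to attach to the issue.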

@cheater

cheater commented Jan 26, 2018

(by C i mean C source of course)

IMO the perfect resolution would be to add something like GNU find syntax for specifying file names: at least -path, -name, -ipath, and -iname, as well as -not, -o, -a, -(, and -).

Or at least -ipath and -iname for starters.

@okdana
Contributor

okdana commented Jan 26, 2018

  • rg's --glob and --iglob are effectively the same thing as find's -name/-path and -iname/-ipath (though the semantics* are slightly different obviously)
  • !-prefixed patterns are effectively the same thing as -not
  • as in find, multiple glob patterns are combined with an implicit -a by default

* The pattern functionality is mostly as described here: https://git-scm.com/docs/gitignore

@cheater

cheater commented Jan 26, 2018

Then I guess the issue can be closed as resolved? With the caveat that the original reporter should be able to reopen it if they are unhappy with the features rg provides.

@n00bmind
Author

Is --glob the same as -g?
In that case, I already tried that, as mentioned earlier in this thread, and it still had some issues. It almost worked, but some files gave false positives.
My intention when opening this was more in the direction of bug catching and fixing. I haven't used vim on large codebases since, so I can't attest to how rg currently behaves in that scenario.

@BurntSushi
Owner

I am going to close this because it's not reproducible. If someone can come up with a contained example that uses something more accessible than the entire Android code base, then I can take a look and re-open this.

@alper

alper commented Nov 12, 2020

The manpage says:

Only search files matching TYPE.

That does not help me to figure out what TYPE can be. I figured out in my case I have to say -tgo but that's fairly counterintuitive.

@BurntSushi
Owner

@alper How is -tgo counter intuitive? Please consider reading the guide's section on filtering with file types.

@alper

alper commented Nov 12, 2020

Oh cool. The guide has examples so that makes it a lot easier. The man page does not.

Values concatenated onto the flag is not something I see a lot? I would expect: -t go but that could be just me.

@BurntSushi
Owner

> Values concatenated onto the flag is not something I see a lot? I would expect: -t go but that could be just me.

It's standard and idiomatic in UNIX command line since... forever. I don't know precisely when the convention started, but it is specified by POSIX, which likely suggests the convention pre-dated POSIX. So it's probably been around for at least 32 years.

And -t go works as well.
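
The two spellings can be tried in any getopt-style parser. As an illustration only, the sketch below uses Python's `argparse` (not ripgrep's own argument parser); the `demo` program name and the `file_type` destination are made up for the example.

```python
import argparse

# A stand-in parser with one short/long option that takes a value.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("-t", "--type", dest="file_type")

attached = parser.parse_args(["-tgo"])          # value glued to the short flag
separate = parser.parse_args(["-t", "go"])      # value as its own argument
long_form = parser.parse_args(["--type", "go"])  # long spelling

for parsed in (attached, separate, long_form):
    print(parsed.file_type)
```

All three calls store the same value, which is why `-tgo` and `-t go` can be treated as interchangeable in parsers that follow this convention.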

@disconnect3d

@BurntSushi Can we at least extend the "USAGE" displayed when rg is invoked with no arguments or with --help so that it shows the --type flag, like:

USAGE:
    rg [OPTIONS] PATTERN [PATH ...]
-    rg [OPTIONS] [-e PATTERN ...] [-f PATTERNFILE ...] [PATH ...]
+    rg [OPTIONS] [--type TYPE ...] [-e PATTERN ...] [-f PATTERNFILE ...] [PATH ...]
    rg [OPTIONS] --files [PATH ...]
    rg [OPTIONS] --type-list
    command | rg [OPTIONS] PATTERN

Ideally, it would be nice to provide an example like --type markdown but yeah.

@BurntSushi
Owner

That's not really what the usage is for. The usage is there to show the forms of allowable commands, not just prominent flags. Notice how the second form indicates that the only positional arguments are file paths, whereas the first form indicates that the first positional argument is a pattern.

@disconnect3d

disconnect3d commented Mar 26, 2024

> That's not really what the usage is for. (...)

Yet, the usage/help fails to immediately show/explain to the user how to filter by filepaths/extensions.

I am pretty sure that many many people had the same issue and were annoyed that the usage doesn't show/explain -g or -t, but oh well.

It would be nice to address this somehow.

EDIT: I mean, sure, it's in --help, but I still feel it's hard to discover. Random thought: people may not search for 'glob' (or know what it does) and instead look for 'filepath', 'file extension', etc.

@BurntSushi
Owner

> I am pretty sure that many many people had the same issue and were annoyed that the usage doesn't show/explain -g or -t, but oh well.

There are a ton of things it doesn't show.

The --help page is very long, and if you start prioritizing things to fix issues like this, then you end up with the opposite problem: things that really do need to be prioritized end up getting de-prioritized. Therefore, your suggestion is not just "please prioritize this important feature," but it's also "please also de-prioritize this other thing at the same time." In other words, a balance must be struck.

The user guide has several prominent sections on filtering. I think that's probably good enough IMO.

@cheater

cheater commented Mar 26, 2024 via email

@BurntSushi
Owner

The first result for me is the GUIDE. And the GUIDE talks extensively about filtering. I don't understand how that isn't relevant. It is literally exactly the thing you would want to see.

@cheater

cheater commented Mar 26, 2024 via email

@cheater

cheater commented Mar 26, 2024 via email

@BurntSushi
Owner

BurntSushi commented Mar 26, 2024

OK, well I don't control what snippet the search engine shows you.

I'm not splitting the guide into arbitrarily small pieces just so search engine results snippets are better.

The size of the document is large, which is why there is a table of contents. The table of contents is quick to scan and even includes the phrase "file types."

@cheater

cheater commented Mar 26, 2024 via email

@BurntSushi
Owner

I think we'll have to agree to disagree.
