Revisit contiguous matches #131431
Comments
Hi @TylerLeonhardt , Is it really necessary to use “quotes” to indicate that I prefer a contiguous match even for single-word searches? I mean, if I type a single word, I expect any item that contains that exact match to rank higher; otherwise, the fuzzy search could do its magic. And, because you said you are not fully convinced about the UI, I would say, please don’t 😬 . You are probably aware of some of the issues related to the fuzzy search, but I remember one in particular, originally created by @kentcdodds on my Project Manager extension, which I asked him to file here (#14879). He suggested using one of his libs, https://github.com/kentcdodds/match-sorter, which seemed to work really well. I’m not saying this lib is the solution (I didn’t look into all of the lib's details or how it would fit with other parts of VS Code), but because that issue was later closed in favor of another one, this history and these details could be missed. Hope this helps |
Notes from standup:
I will take this to the UX sync. |
In this case, I expected the highlight to be 2021-5- 26. md instead, but yes, an edge case because Sorry but, personally, I feel the VS Code fuzzy search really needs improvement. Most of the time I rely on the recent items (when available) instead, because the search rank doesn’t work for me. I type almost the perfect match to find what I want, but still, sometimes I have to scroll down to select the expected result. I have memories of great results in Sublime (the first time I used a
I’m someone who doesn’t like “quotes”, except for phrase (full content) searches. Maybe that comes from other search engines I have used in the past. So, an “advanced search” option to toggle on/off would be a good alternative. Great to see you are working on this. Improvements are welcome 😁 Thank you |
The highlighting is wrong but the order is what I was trying to demonstrate. Also the |
That's great! Eager to try this one. |
It would be good to collect these cases. It is always useful to have a collection of less-than-ideal rankings so that we can understand why they happen and, if we make a fix, see whether our test cases still hold. Over time, whenever I made changes to the scorer and ranking, I tried to write a test case for that scenario so that when we make changes we see what other cases fall apart (if any). Btw, unfortunately there is not just 1 fuzzy ranking/scorer algorithm used; it depends on which quick open you are talking about:
I think a first good step would be to align the various pickers to use the same fuzzy scoring, except maybe command palette: we try to preserve muscle memory of users and not break it. We put commands in alphabetic order when showing results to group commands that logically belong together close to each other. Some fuzzy ranker might break this muscle memory easily. |
Totally agree, I'll try to replicate/identify those cases, and if I don't find an already open issue to comment on, I'll create a new one with the details. Just to complement my previous comment about comparisons with other tools I used before, I wasn't saying Atom (with mixed results) had better results than VS Code. Only Sublime was better back then. On the other hand, also talking about other tools, I have JetBrains Rider available at the company I'm working at, and I would say VS Code fuzzy search is way better. I see myself using VS Code instead quite often, because I feel it is much easier to search/navigate the source, congrats 👏 . Thank you |
@bpasero isn't there a different implementation for symbols too? Or is that covered in your list somewhere? |
Yeah. I'm one of those users. My workflow is that I tend to type 3 (sometimes 4) characters of distinct words in the files I'm looking for. Here's an example where what I really want is Sorry the screenshots are going to be limited. I don't want to show too much of my directory structure. As you can see, the 7th item on the list is what I really want. I assume matches in the filename are preferred over the directory path, but does finding a "u", "s", and "e" or "d", "at", and "a" randomly in the filename really help that much? That's a really cool thing VSCode does in autocompletion, but not so cool to me in the file open dialog. I could see maybe allowing a one-character difference in case of a typo. If I type the "r" in "user", the situation is better, but still not great: So for me, allowing quotes was a way to get around those issues without breaking anyone else's workflow (e.g. by completely disabling it). However, if there was a setting that said file opening would default to exact word matches, I'd turn that on instead. I still would want it so the word order doesn't matter. That way "data user" and "user data" would both work. On a related topic from #128924/#128923, I found a concrete example where exclusions would be really helpful. See the below heavily redacted screenshot: |
@TylerLeonhardt very good point, I forgot about symbols: we use yet another fuzzy matcher implemented by Jo, which is the same one used to filter down IntelliSense results: vscode/src/vs/base/common/filters.ts Line 546 in 77905c8
It is another LCS variant as far as I know. The one I wrote is heavily optimised for file paths, that is also why it is not used for symbols actually. And this explains some of the seemingly bad results from @ssigwart. To explain what is going on in that screenshot:
I think that latter behaviour is probably the explanation for many results that are less than ideal. But changing that rule will also mean that for any query you might see results appear in the top ranks where matches are only coming from parent folders, not the file itself. I think this is a matter of improving the ranking to better account for such cases. |
Thanks for the explanation, @bpasero. It makes sense that a match on the filename is preferred over a directory. However, I feel like it would result in better results if the top results are always the ones that include an exact match of the separated words. Then the remainder would be sorted by the closeness to the full word (e.g.
Another thought I had is that maybe the length could be valuable for scoring too. So if you search for If you want, I'm willing to try to work on an update if given direction. |
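For illustration only, here is a minimal sketch of the ranking idea described in the comment above: results where every space-separated word appears contiguously come first, word order is ignored, and shorter paths break ties. The function names `basename` and `rankByExactWords` and the three-tier scheme are assumptions made for this example; this is not VS Code's scorer.

```typescript
// Hypothetical helper: last path segment, assuming "/" separators.
function basename(path: string): string {
  const i = path.lastIndexOf("/");
  return i === -1 ? path : path.slice(i + 1);
}

// Hypothetical ranking sketch (NOT the VS Code implementation).
// Tier 0: every query word is a contiguous substring of the file name.
// Tier 1: every query word is a contiguous substring of the full path.
// Tier 2: anything else (a real scorer would fall back to fuzzy matching).
function rankByExactWords(query: string, paths: string[]): string[] {
  const words = query
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 0);

  const scored = paths.map((path) => {
    const lower = path.toLowerCase();
    const name = basename(lower);
    const allInName = words.every((w) => name.indexOf(w) !== -1);
    const allInPath = words.every((w) => lower.indexOf(w) !== -1);
    const tier = allInName ? 0 : allInPath ? 1 : 2;
    return { path: path, tier: tier };
  });

  // Lower tier first; within a tier, shorter paths first (length as a tiebreaker).
  scored.sort((a, b) => a.tier - b.tier || a.path.length - b.path.length);
  return scored.map((s) => s.path);
}
```

Because each word is checked independently, "data user" and "user data" produce the same order in this sketch.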
My 2 cents. I agree the functionality should precisely match that of web search engines (e.g. Google, Bing, Yahoo), all of which default to wrapping contiguous (exact) searches with "quotes". However... The scope of implementation and result set in this case is naturally more limited, i.e. filename search, in a specific workspace, (probably) with Given this. While I approve of the change, it's more important to not break DevX (expectations / muscle memory). To achieve that, I request no matter what's done here with fuzzy search algorithmically, when opened with |
I filed #164352 but i can't tell if it dupes this ticket. I think it's more specifically about this issue:
Which means filenames with spaces are never exactly matched, and the files aren't highlighted properly (ignore the first two files in the below screenshot, they are recent matches) this screenshot shows the same thing, an exact file match is ignored: I don't understand the reasoning behind having two words start two separate matches. Maybe it would make more sense if the file list let you select multiple files to open at once. But you can only open one file, so starting a second search is counterintuitive. The intuitive searching behavior I'm used to: always fuzzy matching, on the whole search (not broken up into searches by spaces), and sorted by Levenshtein distance (or similar). In other fuzzy search tools, if there are ambiguous file name matches, I'm used to typing a letter or two from each subdirectory I know the file is in, so the fuzzy matcher will match that letter against the directory. Whatever Ctrl-P for Vim uses is so intuitive that I've never had an issue with the fuzzy matching nor really even thought about it, because it matches my mental model. VSCode's matching doesn't surface expected results, and I have to think about it a lot. Maybe instead of a toggle button, there could be a setting (or an API we can plug into to override it) to allow for more "natural" fuzzy matching? |
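As a rough illustration of the "whole query, sorted by Levenshtein distance" behavior requested above, the sketch below computes the classic edit distance and orders candidates by it, treating spaces in the query as ordinary characters. The names `levenshtein` and `sortByEditDistance` are made up for this example, and nothing here reflects how VS Code or Ctrl-P for Vim actually rank results.

```typescript
// Classic Levenshtein edit distance using two rolling rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) {
    prev.push(j);
  }
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1, // deletion
          curr[j - 1] + 1, // insertion
          prev[j - 1] + cost // substitution (or match)
        )
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: sort names by distance to the whole query,
// without splitting the query on spaces. Comparison is case-insensitive.
function sortByEditDistance(query: string, names: string[]): string[] {
  const q = query.toLowerCase();
  return names
    .slice()
    .sort((x, y) => levenshtein(q, x.toLowerCase()) - levenshtein(q, y.toLowerCase()));
}
```

Under this sketch, a query such as `my notes.md` ranks a file literally named `my notes.md` first, since the space is matched like any other character. Plain edit distance penalizes long paths heavily, so a practical ranker would likely combine it with other signals.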
I recently switched from the IntelliJ world to VSCode and the fuzzy match algorithm of VSCode is slowly driving me insane. In addition to the cases here, one thing that IntelliJ gets really right is case sensitive matching of partial words and priority for local symbols. Check out this example. I am trying to match The result of this is that I frequently wind up with random variables and imports as a result of the correct result disappearing after I type more letters that should only increase its weight. |
@gpeal note that this issue is about the "quick open" picker component while the suggest widget uses a different implementation, so I think a separate issue would probably be warranted to improve suggest scoring. |
@bpasero What is the "quick open", specifically? |
The picker, see #131431 (comment) |
@bpasero I encounter the exact same issue with quick open. From what I can tell, they use the same fuzzy match algorithm. I just used a variable intellisense as an example but the same principle holds for quick open. |
Maybe similar algorithms (LCS) but entirely different implementations at least for file search vs suggest for symbols. |
FYI, these are the IntelliJ settings (and the default values are selected). Notably, it defaults to "first letter only" and also case sensitivity. Case sensitivity alone would go a long way to improve VSCode's algorithm here. People rarely capitalize a letter by mistake so matching a capital letter should have much higher weight than a lowercase letter in the middle of a word when a capital letter is typed. |
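To make the case-sensitivity suggestion above concrete, here is a small sketch in which a typed capital letter that matches a capital in the candidate earns a larger bonus than a case-insensitive match. The function name `caseAwareScore` and the weights (5 and 1) are arbitrary assumptions for illustration; this is neither IntelliJ's nor VS Code's algorithm.

```typescript
// Hypothetical scoring sketch: greedy left-to-right subsequence match.
// Returns -1 when the query is not a subsequence of the candidate.
function caseAwareScore(query: string, candidate: string): number {
  const lowerCandidate = candidate.toLowerCase();
  let score = 0;
  let pos = 0;

  for (let i = 0; i < query.length; i++) {
    const ch = query.charAt(i);
    const typedUpper = ch !== ch.toLowerCase();

    // A typed capital first looks for the same capital in the candidate.
    let found = typedUpper ? candidate.indexOf(ch, pos) : -1;
    if (found !== -1) {
      score += 5; // assumed bonus for an exact-case capital match
    } else {
      // Otherwise fall back to a case-insensitive match with a small weight.
      found = lowerCandidate.indexOf(ch.toLowerCase(), pos);
      if (found === -1) {
        return -1;
      }
      score += 1;
    }
    pos = found + 1;
  }
  return score;
}
```

With these assumed weights, the query `UD` scores higher against `UserData` than against `unused_data`, while an all-lowercase query gets no capital bonus and is treated as case-insensitive.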
I've added the "quotes" support in #131292 and am planning on releasing it in stable as "experimental" because I'm not fully convinced it's the correct UI to do this. I want to try 2 other options:
?
menu.