Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contributable SearchProvider #32549

Merged
merged 7 commits into from
Aug 28, 2017
Merged

Contributable SearchProvider #32549

merged 7 commits into from
Aug 28, 2017

Conversation

jrieken
Copy link
Member

@jrieken jrieken commented Aug 15, 2017

This PR contains an experiment to apply the "provider-pattern" to our search logic (find in and for files). This is part of the larger remote development scenario.

[edited with details and more ideas]

The API should cover two purposed, searching for and in files and IMO there should be two APIs/providers for this. Let's gather some requirements:

Search For Files

Ways to express what you are looking for

  • simple patterns ala *.png
  • glob patterns ala {**/*.test.ts,src/**/*.js}
  • regular expressions
  • subsequent character matches, like foa matches i Find characters in the Order in which they Appear in the query
  • string substring match, like foa only matches foa the people

FYI - today the extension API already allows to findFiles and it allows to define queries using glob-patterns;

Results

I think it's fair to define the result as a stream of Uris. They are easy to create and VS Code handles them well. Questions might be

  • Sorting/Ranking? Each provider might be smart and return results by it's own sort order. However, that is always local and it's not sure how they compare against each other.
  • Upper bounds? In VS Code we usually don't return more then 1000 results

Search In Files

Ways to express what you are looking for

  • regular expressions
  • substring match

Unsure if we need to define the exact regex feature set or if this should be more pragmatic and just be a best effort approach where each provider can decide how to interpret a query

Results

  • We need a way to preview them in context. Today we use the full line of a match but that doesn't work well with very long lines. Maybe we spec the preview to be at least the matching range but ideally plus/minus N characters
  • Offsets or position bases? A search match inside a file must denote the location in that file. Our API always talks about positions (line and character) and not offsets. It might be a challenge for some providers to provide that information and we should be fit for that, e.g allow to fill in that information later.

@jrieken jrieken self-assigned this Aug 15, 2017
@jrieken jrieken added this to the August 2017 milestone Aug 15, 2017
@jrieken jrieken added the engineering VS Code - Build / issue tracking / etc. label Aug 15, 2017
@jrieken
Copy link
Member Author

jrieken commented Aug 15, 2017

@bolinfest Trying to pick your brain on this, esp. on how your would a search query like to be. Today, in VS Code queries are using regular expressions, those from JS and preferably those from ripgrep/rust. Is that an option for your index-based search solution or do need something more conservative?

@dlebu
Copy link

dlebu commented Aug 22, 2017

@jrieken The API proposal looks fine for our scenarios. Here's our feedback regarding the outstanding questions:

Search for files:

  • Regarding sorting/ranking the results, we don't have a strong preference towards how they should be sorted. It can be either by basename, URI, or just as simple as making a call and always ranking local higher than the rest.
  • Upper bound of 1000 is more than enough.

Search in files:

  • We like the best effort proposal where each provider decides how to interpret a query. If the IDE needs to know what type of queries are supported, we could introduce the concept of capabilities.
    Ideally, search providers should offer the full-set or a sub-set of the VSCode regex feature set.
  • For search results, we might end up using ripgrep as well, so whatever they return for preview and offset/position bases is fine.

A couple of questions:

  • Would it make sense to also add a corresponding workspace.searchInFiles method for extensions to consume in addition to the search provider?
  • Will the replace command in the IDE eventually trigger a writeContents in the file system provider?

CC @lostintangent

@jrieken
Copy link
Member Author

jrieken commented Aug 28, 2017

Would it make sense to also add a corresponding workspace.searchInFiles method for extensions to consume in addition to the search provider?

Yeah, we track that in #9049. Once we know how queries and results look we can tackle this

Will the replace command in the IDE eventually trigger a writeContents in the file system provider?

Yep, that's the goal

@jrieken jrieken merged commit ccef28b into master Aug 28, 2017
@jrieken jrieken deleted the joh/searchp branch August 28, 2017 15:21
@bolinfest
Copy link
Contributor

@jrieken Apologies, just saw this.

In general, I would favor giving flexibility to the provider. For example, VS Code should not do the sorting because we are interested in playing with "personalized search" where the files are ranked according to how much you have interacted with them rather than an alphabetical sort, so we wouldn't want VS Code to override the order we provide.

For providers like ours where we expect to do the sorting, it doesn't make a difference to us whether we can stream results or not because we have to aggregate all of the results before we do the sort, anyway. I could certainly imagine that this is not the case for all providers, though.

In terms of "Ways to express what you are looking for," how would VS Code expose this in its UI? For example, the simplest thing is for us to register two providers: one that interprets the query as a regex and one that interprets it as plaintext. Then it is up to the user to choose the appropriate provider.

However, if there were a structured way to express these options and then VS Code presented them to the user (for example, as a checkbox in the regex vs. plaintext example), then this becomes more interesting.

I think I'm missing the UI piece in this proposal: what were you envisioning?

@jrieken
Copy link
Member Author

jrieken commented Aug 29, 2017

I think I'm missing the UI piece in this proposal: what were you envisioning?

We wanna stick to our 'data provider'-pattern with this, so no UI proposal but the plan is to use the existing UI and merge results from different providers. In return, that means our current UI isn't set in stone and might have to change for new requirements/scenarios.

Today, the "find in files"-UI has query-toggles for 'whole word', 'case-sensitive', and 'regex'-search. It also has input fields for include and exclude (glob) patterns.

screen shot 2017-08-29 at 15 16 53

The UI for "find files" is more simple and just the quick pick box as shown below. However, some project excludes also apply, e.g. don't search in node_modules, don't find generated JS files but their TS files etc...

screen shot 2017-08-29 at 15 20 58

Ideally search result providers support those things too and ideally we can treat all results equal and merge them. That also touches on sorting/ranking. Since one provider has a local view of the world it cannot tell how its results should compare against results from other providers. That's why usually we sort and rank things, e.g completion items or reference search results.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
engineering VS Code - Build / issue tracking / etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants