Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a variety constraint to querying and hit collection #500

Closed
apatrida opened this issue Nov 9, 2010 · 2 comments
Closed

Add a variety constraint to querying and hit collection #500

apatrida opened this issue Nov 9, 2010 · 2 comments

Comments

@apatrida
Copy link
Contributor

apatrida commented Nov 9, 2010

Ok, this is a new one as not sure I've seen it in another product but it is a good alternative to just pure scoring and pushing docs to the top.

When searching, I should be able to ask that I get back a "variety" of results by type, or a field value and a min/max count of each. So for example, I get the best ranked results but I want 5 books, 5 pictures, 5 songs, and 5 movies total rather than 15 books, 5 songs eating up my first page. NOTE: I am not asking that they be grouped together, but that there is a limit to how many of each can make that response to avoiding flooding out the other docs of various types (this is a perfect job for a VarietyConstrainedHitCollector which I'm sure doesn't exist).

Basically a variety constraint on what is returned.

You can implement this client-side but it would be more efficient down in the server. And doing multiple queries by types is wasteful as well and doesn't allow them to be scored together by relevancy.

I'm not sure yet what to do about paging or scrolling or even if those would be supported. This is more of a first-page type of feature, but if you CAN solve paging and scrolling it would be interesting. I'm thinking on those more now but recording this before I forget.

It's also similar to some searches we built in the past that show a summary page of the first 10 and last 10 of the sort order (show me the least and most expensive of matching items) which is a similar variety constraint and more efficiently done in the engine while it has things in memory; rather than doing follow-on queries.

@karussell
Copy link

Duplicate of #256 !?

Solved via field collapsing / results grouping in Solr ... but not sure if for the distributed case too ...

@clintongormley
Copy link
Contributor

Closed by #6124

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants