Add AND operator for joining terms #264
Comments
I'm also looking for similar functionality.. I like the |
I was able to achieve a hacky AND implementation by inspecting the metadata of the results before returning them to the user: function search(query) {
// Split on runs of whitespace so repeated spaces don't produce empty terms.
var terms = query.trim().split(/\s+/);
return index.search(query).filter(function(result) {
// Keep a result only if it matched as many distinct terms as were searched for.
return Object.keys(result.matchData.metadata).length === terms.length;
});
}
search('hello world'); This is a trimmed-down version of the actual implementation I am using. You can always take into account whether or not one of the terms is contained within the |
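A variation on the filter above, as a rough sketch only: instead of comparing counts, check that each query term appears as a key in `matchData.metadata`. `matchesEveryTerm` is a hypothetical helper, not part of lunr, and it assumes the terms passed in are already in the form stored in the index (with lunr's default pipeline the metadata keys are the stemmed terms, so unstemmed input may not match).

```javascript
// Hypothetical helper: `result` has the shape of a lunr search result,
// i.e. { ref: ..., matchData: { metadata: { <term>: { <field>: {...} } } } }.
function matchesEveryTerm(result, terms) {
  var metadata = result.matchData.metadata;
  return terms.every(function (term) {
    // Assumption: `term` is already lowercased/stemmed like the index terms.
    return Object.prototype.hasOwnProperty.call(metadata, term);
  });
}
```

It could be dropped into the filter in place of the length comparison, e.g. `index.search(query).filter(function (r) { return matchesEveryTerm(r, terms); })`, where `index` is assumed to be an already-built lunr index.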
Would be really great to get this feature. Tried out elasticlunr. But this has no wildcards. |
Replying to @olivernn's comment here: #261 (comment)
I can imagine the query interface working something like this: var results = index.query(
function(q) {
q.term(mainTerm);
searchParams.forEach(function(param) {
var k = param[0];
var v = param[1];
q.term(v, { fields: [k] });
});
}, {
bool: 'AND'
}); This is similar to how elasticlunr.js defines boolean behavior between fields: http://elasticlunr.com/example/index.html So, now I'm digging into the |
See also the API for search-index: q.query = [ // Each array element is an OR condition
{
AND: {
'title': ['reagan'], // 'reagan' AND 'ussr'
'body': ['ussr']
},
NOT: {
'body': ['usa'] // but NOT 'usa' in the body field
}
},
{ // OR this condition
AND: {
'title': ['gorbachev'], // 'gorbachev' AND 'ussr'
'body': ['ussr']
},
NOT: {
'body': ['usa'] // NOT 'usa' in the body field
}
}
] Another example from the above documentation: query: {
AND: {
'description': ['swiss', 'watch'],
'price': [{
gte: '1000',
lte: '8'
}]
}
} |
@olivernn is boolean search part of the roadmap? |
This is still something that I want to add, and will be the next large feature that I work on in Lunr, probably landing in 2.2, at some point. My current thinking is to implement this at a slightly lower level to begin with, and rather than implement I want to add a single property to a search term, let's call it "presence" for now. The default value of this property is Fitting this into the current API is simpler, e.g. for the search string I think it would be a prefix, similar to Lucene, e.g.:
I think taking this approach means that we can punt on grouping of query terms for now, without restricting their implementation in the future. There is some interesting discussion on the Lucene wiki about some of the confusion that can arise when using boolean operators - https://wiki.apache.org/lucene-java/BooleanQuerySyntax#Changing_Your_Mindset Thoughts? |
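To make the "presence" idea concrete, here is a small self-contained model of how per-term presence could decide which documents match. It is a sketch under assumptions, not lunr's implementation: the `Presence` names, the `+`/`-` prefixes (borrowed from Lucene's syntax), the choice of "optional" as the default, and the toy `docs` structure are all illustrative.

```javascript
// Illustrative presence values; the names are assumptions for this sketch.
var Presence = { OPTIONAL: 'optional', REQUIRED: 'required', PROHIBITED: 'prohibited' };

// Parse a Lucene-style string such as "+foo -bar baz" into clauses.
function parseClauses(queryString) {
  return queryString.trim().split(/\s+/).filter(Boolean).map(function (token) {
    if (token.charAt(0) === '+') return { term: token.slice(1), presence: Presence.REQUIRED };
    if (token.charAt(0) === '-') return { term: token.slice(1), presence: Presence.PROHIBITED };
    return { term: token, presence: Presence.OPTIONAL };
  });
}

// docs maps a document id to the list of terms it contains.
function matchDocs(docs, clauses) {
  var byPresence = function (p) {
    return clauses.filter(function (c) { return c.presence === p; });
  };
  var required = byPresence(Presence.REQUIRED);
  var prohibited = byPresence(Presence.PROHIBITED);
  var optional = byPresence(Presence.OPTIONAL);

  return Object.keys(docs).filter(function (id) {
    var has = function (c) { return docs[id].indexOf(c.term) !== -1; };
    if (!required.every(has)) return false;   // every required term must be present
    if (prohibited.some(has)) return false;   // no prohibited term may be present
    // With no required terms, at least one optional term has to match
    // (unless the query consists only of prohibited terms).
    if (required.length === 0 && optional.length > 0) return optional.some(has);
    return true;
  });
}
```

With `docs = { a: ['hello', 'world'], b: ['hello'], c: ['world', 'bye'] }`, the query `+hello +world` behaves like an AND, and `+hello -world` like AND NOT, which is why grouping can be deferred without blocking it later.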
The workaround of @danjarvis is interesting, but as soon as one of the terms appears in multiple fields it "artificially" increases matchData.metadata.length. I fear there is no other workaround than issuing one query per term, then intersecting the various result sets, and finally averaging the scores. |
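The intersect-and-average idea can be sketched roughly as follows. `intersectResults` is a hypothetical helper that only relies on each result having a `ref` and a `score`, as lunr search results do; it keeps the refs found in every per-term result set and averages their scores.

```javascript
// Hypothetical helper: takes one array of { ref, score } results per term
// and returns the refs present in every array, with averaged scores.
function intersectResults(resultSets) {
  if (resultSets.length === 0) return [];
  var totals = new Map(); // ref -> { count, scoreSum }

  resultSets.forEach(function (results) {
    var seen = new Set(); // guard against a ref repeated within one set
    results.forEach(function (r) {
      if (seen.has(r.ref)) return;
      seen.add(r.ref);
      var entry = totals.get(r.ref) || { count: 0, scoreSum: 0 };
      entry.count += 1;
      entry.scoreSum += r.score;
      totals.set(r.ref, entry);
    });
  });

  var merged = [];
  totals.forEach(function (entry, ref) {
    // Keep only refs that appeared in every per-term result set.
    if (entry.count === resultSets.length) {
      merged.push({ ref: ref, score: entry.scoreSum / resultSets.length });
    }
  });
  // Highest average score first.
  return merged.sort(function (a, b) { return b.score - a.score; });
}
```

Assuming `index` is a built lunr index, usage might look like `intersectResults(terms.map(function (t) { return index.search(t); }))`. Note that the merged results drop `matchData`; carrying it over would need extra merging logic.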
I have a working implementation of term presence queries. I haven't implemented the search string parser yet, but it is supported in the programmatic query interface. I'll try and put up a PR with the changes in the next couple of days if anyone is interested in doing some testing. |
For what it's worth, here is my AND search workaround: _searchAnd(terms) {
var result_per_term = [];
terms.forEach((t) => {
result_per_term.push(lunr.query((q) => {
if (t.indexOf(':') > 0) {
var [key, val] = t.split(':');
q.term(val, { boost: 100, fields: ["fields" /*key*/] });
}
else if (t.indexOf('=') == 0) {
q.term(t.replace(/^=/, ''), { boost: 100, wildcard: 0 });
}
else if (stopwords.has(t)) {
return;
}
else {
q.term(t, { boost: 100 });
q.term(t, { boost: 10, usePipeline: true, wildcard: lunr.Query.wildcard.LEADING | lunr.Query.wildcard.TRAILING });
// q.term(t, { boost: 1, usePipeline: false, editDistance: 1 });
}
}));
});
// map doc id
var ids_per_term = result_per_term.map((e) => { return e.map(f => f.ref); });
// keep trace of terms not found
this.terms_not_found.splice(0);
for (const k in ids_per_term) if (ids_per_term[k].length == 0) this.terms_not_found.push(terms[k]);
// if a term is not found don't account for it (ignored from search query)
ids_per_term = ids_per_term.filter(n => n.length > 0);
var common_ids = new Set( _.intersection(...ids_per_term) );
var last_search = new Map();
for (const a of result_per_term) {
for (const result of a) {
if (! common_ids.has(result.ref)) continue;
if (last_search.has(result.ref)) {
var res = last_search.get(result.ref);
Object.assign(res.matchData.metadata, result.matchData.metadata);
res.score += result.score;
last_search.set(result.ref, res);
} else {
last_search.set(result.ref, result);
}
}
}
return Array.from(last_search.values());
} |
The
Seems like this would work with brackets for grouping as well:
I also like the suggestion in #310 for |
Any updates on this? would love to see this feature! |
Sorry for the lack of updates on this (and other) issues, my son was born at the end of last year so I haven't had a lot of free time recently 😅 I did push a branch with the changes to support this, it's all there apart from the additions to the search string parser. So you can programmatically construct queries with I'll put together a WIP PR with a bit more detail to help people try out what is there currently, getting some feedback would be good. Thanks for the patience on this feature too! |
I've just opened a PR with the changes to support term presence queries. There is an alpha release available on npm too (lunr@2.2.0-alpha.1). Please take a look and let me know any feedback or comments you have. If all goes well 2.2.0 stable will be released within the week. |
2.2.0 is now released, try it out, let me know if there are any issues! I'll update the guides shortly with some more examples including term presence. |
Term presence is great, but do we have any approach for this case: |
@olivernn, Is it possible to use '(A AND B) OR C' search case with current version? |
Has an approach like this been added ? |
Any workaround available for this ? |
Also interested if there are any workarounds for this! |
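One possible workaround for a query like '(A AND B) OR C', offered as a sketch rather than a supported feature: run one search per OR branch and merge the result sets yourself. `unionResults` below is a hypothetical helper; it keeps each `ref` once, with the highest score seen. Scores from separate searches are not strictly comparable, so the final ordering is only approximate.

```javascript
// Hypothetical helper: merges several arrays of { ref, score } results,
// keeping the best-scoring entry for each ref.
function unionResults(resultSets) {
  var best = new Map(); // ref -> best result seen so far
  resultSets.forEach(function (results) {
    results.forEach(function (r) {
      var current = best.get(r.ref);
      if (!current || r.score > current.score) best.set(r.ref, r);
    });
  });
  // Highest score first.
  return Array.from(best.values()).sort(function (a, b) { return b.score - a.score; });
}
```

Assuming `index` is a lunr 2.2+ index and that `+` marks a required term in the search string, this could be used as `unionResults([index.search('+A +B'), index.search('+C')])`.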
As was mentioned in #261, lunr.js currently doesn't support joining search terms with AND. It could have an interface similar to what Elasticsearch has: you specify the default operator (OR or AND), and the terms are joined with the specified operator.
Otherwise it could look similar to this:
Or with .search() it could look like this:
Or, more sophisticated, with a new modified query syntax:
What do you think?
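With the term presence support released in 2.2.0, a "default operator" can be approximated by rewriting the search string before it is passed to the index. The helper below is a minimal sketch: `withDefaultOperator` is a hypothetical name, and it assumes that a leading `+` marks a term as required and a leading `-` as prohibited in the search string.

```javascript
// Hypothetical helper: makes every plain term required when the default
// operator is 'AND'; any other operator leaves the terms optional (OR).
function withDefaultOperator(queryString, operator) {
  var tokens = queryString.trim().split(/\s+/).filter(Boolean);
  if (operator !== 'AND') return tokens.join(' ');
  return tokens.map(function (token) {
    var first = token.charAt(0);
    // Leave terms alone if they already carry an explicit presence prefix.
    return (first === '+' || first === '-') ? token : '+' + token;
  }).join(' ');
}
```

Assuming `index` is a built lunr index, `index.search(withDefaultOperator('hello world', 'AND'))` would then only return documents containing both terms.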