Implements the EXISTS filter #2484
Comments
For people following this issue, we have already published a Docker tag to test the feature:

    docker run -it --rm \
      -p 7700:7700 \
      getmeili/meilisearch:v0.29.0-filter.beta.0

Or you can compile it from source.

All the information about the new addition is detailed here.

Any feedback is more than welcome!! ❤️
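To try the filter against the container started above, a search request along the following lines should work. This is only a sketch: the index name `products` and the helper `build_search_request` are hypothetical, the instance is assumed to run without a master key, and the filtered field is assumed to already be listed in the index's `filterableAttributes`. The request is built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request


def build_search_request(base_url: str, index_uid: str, filter_expr: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the search route of an index."""
    body = json.dumps({"filter": filter_expr}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/indexes/{index_uid}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "products" is a hypothetical index name used for illustration only.
request = build_search_request("http://localhost:7700", "products", "price EXISTS")

# To send it to a running instance, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["hits"])
```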
556: Add EXISTS filter r=loiclec a=loiclec

## What does this PR do?

Fixes issue [#2484](meilisearch/meilisearch#2484) in the meilisearch repo. It creates a `field EXISTS` filter which selects all documents containing the `field` key. For example, with the following documents:

```json
[
  { "id": 0, "colour": [] },
  { "id": 1, "colour": ["blue", "green"] },
  { "id": 2, "colour": 145238 },
  { "id": 3, "colour": null },
  { "id": 4, "colour": { "green": [] } },
  { "id": 5, "colour": {} },
  { "id": 6 }
]
```

Then the filter `colour EXISTS` selects the ids `[0, 1, 2, 3, 4, 5]`. The filter `colour NOT EXISTS` selects `[6]`.

## Details

There is a new database named `facet-id-exists-docids`. Its keys are field ids and its values are bitmaps of all the document ids where the corresponding field exists.

To create this database, the indexing part of milli had to be adapted. The implementation there is basically copy/pasted from the code handling the `facet-id-f64-docids` database, with appropriate modifications in place.

There was an issue involving the flattening of documents during (re)indexing. Previously, the following JSON:

```json
{
  "id": 0,
  "colour": [],
  "size": {}
}
```

would be flattened to:

```json
{
  "id": 0
}
```

prior to being given to the extraction pipeline. This transformation would lose the information that is needed to populate the `facet-id-exists-docids` database. Therefore, I have also changed the implementation of the `flatten-serde-json` crate. Now, as it traverses the JSON, it keeps track of which keys were encountered. Then, at the end, if a previously encountered key is not present in the flattened object, it adds that key to the object with an empty array as value. For example:

```json
{
  "id": 0,
  "colour": { "green": [], "blue": 1 },
  "size": {}
}
```

becomes

```json
{
  "id": 0,
  "colour": [],
  "colour.green": [],
  "colour.blue": 1,
  "size": []
}
```

Co-authored-by: Kerollmops <clement@meilisearch.com>
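The flattening rule and the exists database described in this PR can be illustrated with a small Python sketch. It is not the milli or `flatten-serde-json` implementation (those are Rust): the function names are hypothetical, field names stand in for field ids, plain sets stand in for bitmaps, and arrays are treated as leaf values, so arrays of objects are not handled.

```python
from typing import Any


def flatten(document: dict[str, Any]) -> dict[str, Any]:
    """Flatten nested objects into dotted keys, keeping every key that was seen."""
    flat: dict[str, Any] = {}
    seen: list[str] = []

    def walk(obj: dict[str, Any], prefix: str) -> None:
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            seen.append(path)
            if isinstance(value, dict):
                walk(value, path)  # nested object: recurse with the dotted prefix
            else:
                flat[path] = value  # leaf value (including null and arrays)

    walk(document, "")
    # Any key encountered during traversal but absent from the flattened
    # object is re-added with an empty array, so its existence is not lost.
    for path in seen:
        flat.setdefault(path, [])
    return flat


def build_exists_index(documents: list[dict[str, Any]]) -> dict[str, set[int]]:
    """Map each field name to the set of document ids where the field exists."""
    index: dict[str, set[int]] = {}
    for doc in documents:
        for field in flatten(doc):
            index.setdefault(field, set()).add(doc["id"])
    return index


documents = [
    {"id": 0, "colour": []},
    {"id": 1, "colour": ["blue", "green"]},
    {"id": 2, "colour": 145238},
    {"id": 3, "colour": None},
    {"id": 4, "colour": {"green": []}},
    {"id": 5, "colour": {}},
    {"id": 6},
]

index = build_exists_index(documents)
all_ids = {doc["id"] for doc in documents}
colour_exists = index.get("colour", set())   # colour EXISTS
colour_not_exists = all_ids - colour_exists  # colour NOT EXISTS
```

With the documents from the PR description, `colour_exists` contains ids 0 to 5 and `colour_not_exists` contains only id 6, matching the behaviour stated above.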
I am curious, if my field got value Will there be any support in the works or we need to put a value like
Hey @mech,
What you're asking for would be another feature we've already thought about. It's not planned currently, but I'm going to open a discussion on our product repository, and it would be nice if you could answer me over there so we don't lose any information!
After the following discussion and a meeting with @gmourier and @loiclec: https://github.com/meilisearch/product/issues/22

The first version of the filter should be implemented with the following syntax:

With the following set of documents:

`price EXISTS` will select the first document. `price NOT EXISTS` or `NOT price EXISTS` will select only the second document.

If a field contains an empty array or a null value, it's considered as existing:

Here `price EXISTS` matches documents 1, 2 and 3.

This will ease the handling of incomplete documents. For example, if you want to return all `T-shirt` products that cost less than 20€ or that don't have a price specified, you will be able to write `product = "T-shirt" AND (price < 20 OR price NOT EXISTS)`. That was not possible previously.
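The semantics of the combined filter can be checked with a short Python sketch. The documents below are made up for illustration (they are not the ones from the issue), and `exists` and `matches` are hypothetical helpers that mimic the described behaviour in plain Python rather than calling Meilisearch.

```python
# Hypothetical documents, invented for this example.
products = [
    {"id": 1, "product": "T-shirt", "price": 15},
    {"id": 2, "product": "T-shirt", "price": 30},
    {"id": 3, "product": "T-shirt"},                 # no price key at all
    {"id": 4, "product": "T-shirt", "price": None},  # null still counts as existing
    {"id": 5, "product": "Hat"},
]


def exists(doc: dict, field: str) -> bool:
    """EXISTS is true whenever the key is present, even for null or []."""
    return field in doc


def matches(doc: dict) -> bool:
    """Mimics: product = "T-shirt" AND (price < 20 OR price NOT EXISTS)."""
    price = doc.get("price")
    is_cheap = isinstance(price, (int, float)) and price < 20
    return doc.get("product") == "T-shirt" and (is_cheap or not exists(doc, "price"))


selected_ids = [doc["id"] for doc in products if matches(doc)]
```

Here `selected_ids` is `[1, 3]`. Document 4 is excluded because a null `price` is considered as existing, so `price NOT EXISTS` is false for it, and null is not less than 20.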