-
Notifications
You must be signed in to change notification settings - Fork 11
Settings API - Customize the hard limits for pagination
and faceting
#157
Conversation
🤖 API change detected: Added (6)
Modified (2)
|
pagination
and faceting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
550: Add the two new pagination and faceting settings r=ManyTheFish a=Kerollmops This PR adds two new settings in the database, those settings are described [in this spec](meilisearch/specifications#157). Co-authored-by: Kerollmops <clement@meilisearch.com>
What do you think of It goes in line with
Makes me think we can customize this value for every facet. Like If this is not the case, what about something more generic like: ? |
I have doubts about the name Usually when we speak about pagination, we speak about how results are sent to and displayed on the front end—particularly when the total number of results is broken down into smaller pages. This setting on the other hand affects only the total number of results returned by the engine. So it doesn't have anything to do with pagination as far as I can tell. Won't it create confusion? Edit: Just wanted to say I totally understand if the |
Strong agree with
For me, the name |
2494: Introduce the new faceting and pagination settings r=ManyTheFish a=Kerollmops This PR introduces two new settings following the newly created spec meilisearch/specifications#157: - The `faceting.max_values_per_facet` one describes the maximum number of values (each with a count) associated with a value in a facet distribution query. - The `pagination.limited_to` one describes the maximum number of documents that a search query can ever return. Co-authored-by: Kerollmops <clement@meilisearch.com>
Thanks for your comments on the names! Do you have a better name than |
Hi @gmourier , if you know that you will add more pagination-related settings in the future and wish to keep this one within that category, then I completely understand and support the naming decision. On the other hand, I think this setting can be useful and interesting to people besides those who need it for pagination. One example that you've already brought up is security, to prevent scraping of the database. There may be other cases we haven't considered where users want to enforce a base I think this setting would be easier to find as an int at the root level of the settings resource with a name like |
Poll time! Limit the number of hits on which we can paginateLittle context, currently it would be inside a {
pagination: {
limitedTo: 100
}
} This might change depending on another poll.
See previous comments for arguments. |
Maximum values returned by each facetLittle context, the setting is inside a {
faceting: {
maxValuesPerFacet: 100
}
}
See previous comments for arguments. |
Pagination object for nested limit on hitsCurrently, the Example: {
// other settings
pagination: {
limitedTo: 100
}
} The question is, do we want to call it
See previous comments for arguments. |
Co-authored-by: Clément Renault <renault.cle@gmail.com>
Co-authored-by: Clément Renault <renault.cle@gmail.com>
fb8b067
to
a1cac76
Compare
…ng` (#157) * Introduces specification files * Update files name * branch telemetry * Update open-api.yml * Update text/0034-telemetry-policies.md Co-authored-by: Clément Renault <renault.cle@gmail.com> * update open-api.yml * Update text/157-faceting-setting-api.md Co-authored-by: Clément Renault <renault.cle@gmail.com> * Rename limitedTo to maxTotalHits * Specify order of returned facet Co-authored-by: Clément Renault <renault.cle@gmail.com>
* Bump open-api.yml to v0.28 * Telemetry - Add `x-meilisearch-client` query parameter (#145) * Introduce the x-meilisearch-client query parameter * Update text/0034-telemetry-policies.md Co-authored-by: Bruno Casali <brunoocasali@gmail.com> * Update text/0034-telemetry-policies.md Co-authored-by: Bruno Casali <brunoocasali@gmail.com> Co-authored-by: Bruno Casali <brunoocasali@gmail.com> * GeoSearch — Support string type for `_geo` `lat` and `lng` fields (#83) * Update specification to support string type for _geo lat and lng fields * mention types mixing for _geo object * Tasks API - Rename `uid` to `taskUid` in the `202 - Accepted` Summarized Task Response (#144) * Rename 202 uid to taskUid * Update text/0060-tasks-api.md Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com> Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com> * Tasks API - Seek/Keyset based pagination (#115) * Move cursor based pagination spec to tasks API spec * remove pagination as a future capability * Clarify boundaries for limit query parameter * Update OpenApi specification * Remove limit field boundaries * Apply suggestions from code review Co-authored-by: Clément Renault <renault.cle@gmail.com> * Update open-api.yml and removes cursor term mentions * Update text/0060-tasks-api.md Co-authored-by: Tommy <68053732+dichotommy@users.noreply.github.com> * Remove route mention in seek-keyset pagination section Co-authored-by: Clément Renault <renault.cle@gmail.com> Co-authored-by: Tommy <68053732+dichotommy@users.noreply.github.com> * Tasks API - Filter tasks list by `type`/`status`/`indexUid` (#116) * move filtering tasks by status/type parameter to task api spec * Update specification * Add details about case-sensitivy + rework error message * Introducing naming changes plus make the specification a source of truth instead of a changelog * Remove a future possibility being introduced * misc - replace createIndex to the right type and add the missing type field to the 202 Response resource * Dumps API - Make dump creation an asynchronous task (#139) * wip * Make a dump creation a visible asynchronous task * Add precisions * Update open-api.yml * Add ommited type field for summarized task response * Add future possibilities * Apply suggestions from code review Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com> * Precise that indexUid can be null * Precise priorization of dumpCreation task over other task types * Keep taskUid for 202 response * remove dumps.get API key action Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com> * Search API - Remove/Rename confusing fields (#135) * Rename nbHits, remove exhaustive* boolean fields * Rename approximativeNbHits to estimatedTotalHits * Update open-api.yaml * Apply naming changes for facet distribution and showing matches position * Add a telemetry for facet distribution usage * API Guideline - Return list of API resources under a `results` array (#138) * Place list of documents under a results array on /documents * Add results array for indexes object list * Add the future of indexes pagination * Update open-api.yml * Fix typo * Apply suggestions from code review Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com> * Add offset/limit pagination for indexes and API keys * Try to add multipe refs to a response object Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com> * Remove name field (#140) * Documents API - `displayedAttributes` should not impact the documents API / Rename `attributesToRetrieve` to `fields` (#143) * Specifies that displayedAttributes setting does not impact the GET documents endpoint * Rename attributeToRetrieve to fields on /documents * Add a future possibily to rejectt a field from a document in the given response * Precise behavior details about fields query parameter * Add fields query parameter on GET /indexes/{index}/documents/{docId} * API Keys - Determinist API Keys + Security changes (#148) * Add an uid to make API Keys determinists, plus add a non-unique human readable name field to ease reading information * Describe errors for uid and name fields * Apply suggestions from code review Co-authored-by: Bruno Casali <brunoocasali@gmail.com> * misc: add precisions * Reorganize route descriptions * Update error_code when API Key already exists for a given uid * Apply suggestions from code review Co-authored-by: Many the fish <legendre.maxime.isn@gmail.com> * Add new keys actions, remove master-key changes, introduce a new error for immutable field and update tenant token * Update open-api spec * Update immutable_field error message * Apply suggestions from code review Co-authored-by: Many the fish <legendre.maxime.isn@gmail.com> * Mention that the Default Admin API Key can manage keys * Specify that the JWT Tenant Token must be enrypted with the API Key value * Update the spec regarding the description of the Admin API Key to be up-to-date * Add uid_or_key url param to update and delete a key * Update text/0085-api-keys.md Co-authored-by: Many the fish <legendre.maxime.isn@gmail.com> * Update text/0085-api-keys.md Co-authored-by: Many the fish <legendre.maxime.isn@gmail.com> Co-authored-by: Bruno Casali <brunoocasali@gmail.com> Co-authored-by: Many the fish <legendre.maxime.isn@gmail.com> Co-authored-by: Kerollmops <clement@meilisearch.com> * Geosearch - Enhance lat/lng format error messages (#149) * Update the geosearch error message * misc: organiser error message in the right specification Co-authored-by: Guillaume Mourier <guillaume@meilisearch.com> * Introduces HTTP Verbs changesto be compliant regarding a Rest API (#152) * Telemetry - Replace `x-meilisearch-client` query parameter by `X-Meilisearch-Client` header (#150) * Removes x-meilisearch-client, replace it by a header * Remove capslock * fix typo (#151) * Mention telemetry.meilisearch.com (#153) * update ranking rules error message (#154) * Misc — Update dump versions compatibility table (#156) * Update dump table * Update text/0105-dumps-api.md * Settings API - Customize the hard limits for `pagination` and `faceting` (#157) * Introduces specification files * Update files name * branch telemetry * Update open-api.yml * Update text/0034-telemetry-policies.md Co-authored-by: Clément Renault <renault.cle@gmail.com> * update open-api.yml * Update text/157-faceting-setting-api.md Co-authored-by: Clément Renault <renault.cle@gmail.com> * Rename limitedTo to maxTotalHits * Specify order of returned facet Co-authored-by: Clément Renault <renault.cle@gmail.com> * Add dumpCreation task type to OpenAPI.yml * Tasks filtering params to be in query instead of path on OpenAPI spec Co-authored-by: Bruno Casali <brunoocasali@gmail.com> Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com> Co-authored-by: Clément Renault <renault.cle@gmail.com> Co-authored-by: Tommy <68053732+dichotommy@users.noreply.github.com> Co-authored-by: Many the fish <legendre.maxime.isn@gmail.com> Co-authored-by: Kerollmops <clement@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: ad hoc <postma.marin@protonmail.com> Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
🤖 API Diff
Summary
This pull request introduces two new settings API to customize the current hard limits that are blocking users to adjust the search engine behavior to their custom needs.
pagination
settings APIBefore v0.27, we didn't have a limit on the number of reachable documents for a query. A handful of users complained that this made their database scrappable or subject to DDoS attack, so we decided to introduce a limit of
1000
for v0.27, non-customizable.Our hypothesis (Some users might need to customize it at a superior range) proved to be true, some users need to customize this limit.
The main argument is that some users need to make the search results scrappable by search engines for their SEO.
Some users also simply want to go further in the pagination despite the relevancy of the first pages.
No one has yet asked to customize it to a lower value, but we imagine this is possible, especially for those who want to reduce the “scrappability” of their dataset.
faceting
settings APISince v0.21, there is no more limit on the number of values returned per facet. This was the case before v0.21 but some users found the default value too low, so it was decided to remove this limit.
The fact that it is unlimited today poses performance problems for users who have many possible values for a facet.
Rather than blocking one or the other, we make this value customizable according to their needs.
Important
Increasing the default values of these properties can degrade performance and more easily expose an instance to DDoS attacks or scrapping by a malicious user.
The default values should be a good compromise between performance and usage for most users.
Changes
API Endpoints
pagination
settings API resource.GET/PATCH/DELETE
-/indexes/:indexName/settings/pagination
maxTotalHits
property - Type: Integer - Default to1000
faceting
settings API resource.GET/PATCH/DELETE
-/indexes/:indexName/settings/faceting
maxValuesPerFacet
property - Type: Integer - Default to100
pagination
andfaceting
objects are accessible from the global settings endpoints/indexes/:indexName/settings
Telemetry Datapoints
pagination.max_total_hits
andfaceting.max_values_per_facet
are as a property of theSettings Updated
event.Pagination Updated
event is introduced with themax_total_hits
property.Faceting Updated
event is introduced with themax_values_per_facet
propertyOut Of Scope
n/a
Attention To Reviewers
Introduced fields like
maxTotalHits
andmaxValuesPerFacet
are open to suggestion.Misc