From fe51d146f7fdca067162e59f1caf41d5fcd2c518 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9mentine=20Urquizar?=
Date: Mon, 2 May 2022 19:37:42 +0200
Subject: [PATCH 1/7] Improve formatted spec

---
 text/0118-search-api.md | 198 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 183 insertions(+), 15 deletions(-)

diff --git a/text/0118-search-api.md b/text/0118-search-api.md
index 4624d5c2..64f3def9 100644
--- a/text/0118-search-api.md
+++ b/text/0118-search-api.md
@@ -197,11 +197,11 @@ Sets the starting point in the search results, effectively skipping over a given

 - Type: Array of String (POST) | String (GET)
 - Required: False
-- Default: `[]|null`
+- Default: `["*"]`, meaning all the attributes

 Configures which attributes will be retrieved in the returned documents.

-If no value is specified, `attributesToRetrieve` uses the `displayedAttributes` index setting, which by default contains all attributes found in the documents.
+If no value is specified, the default value of `attributesToRetrieve` is used (`["*"]`). This corresponds to the `displayedAttributes` index setting, which by default contains all attributes found in the documents.

 > If an attribute is missing from `displayedAttributes` index setting, `attributesToRetrieve` silently ignore it, and the field doesn't appear in the returned search results.

@@ -215,9 +215,9 @@ If no value is specified, `attributesToRetrieve` uses the `displayedAttributes`

 Configures which fields may have highlighted parts, given that they match the requested query terms (i.e. the terms in the [`q`](#311-q) search parameter). Pre/post highlighting tags are applied around each word corresponding to a query term.

-Search results include a `_formatted` object containing the highlighted parts when this parameter is defined. See [3.2.1.1.2. `_formatted`](#32112-formatted) section.
+If `attributesToHighlight` is present in the search query, the search results will include a `_formatted` object containing the attributes and their highlighted parts. For more detailed regarding the `_formatted` behavior, see the [3.2.1.1.2. `_formatted`](#32112-formatted) section.

-If `"*"` is provided as a value: `attributesToHighlight=["*"]` all the attributes present in `displayedAttributes` setting will be automatically assigned to `_formatted`.
+If `"*"` is provided as a value (`attributesToHighlight=["*"]`), all the attributes present in `displayedAttributes` setting will be highlighted.

 Highlighted parts are surrounded by the [`highlightPreTag`](#319-highlightpretag) and [`highlightPostTag`](#3110-highlightposttag) parameters.

@@ -275,16 +275,16 @@ This parameter is applied to the fields from `attributesToHighlight`. If there a

 Defines document attributes to be cropped. Cropped attributes have their values shortened around query terms.

+If `attributesToCrop` is present in the search query, the search results will include a `_formatted` object containing the attributes and their cropped parts. For more detailed regarding the `_formatted` behavior, see the [3.2.1.1.2. `_formatted`](#32112-formatted) section.
+
+If `"*"` is provided as a value (`attributesToCrop=["*"]`), all the attributes present in `displayedAttributes` setting will be cropped.
+
 The number of words contained in the cropped value is defined by the `cropLength` parameter. See [3.1.1.12. `cropLength`](#3112-croplength) section.

 The value of `cropLength` can be customized per attribute. See [3.1.12.1. Custom `cropLength` Defined Per Cropped Attribute](#31121-custom-croplength-defined-per-attribute) section.

 The engine adds a marker by default in front of and/or behind the part selected by the cropper. This marker is customizable. See [3.1.1.13. `cropMarker`](#31113-cropmarker) section.

-Search results include a `_formatted` object containing the cropped attributes representation when this parameter is defined. See [3.2.1.1.2. `_formatted`](#32112-formatted) section.
-
-If `"*"` is provided as a value: `attributesToCrop=["*"]` all the attributes present in the `displayedAttributes` setting will be automatically assigned to `_formatted`.
-
 - 🔴 Sending a value with a different type than `Array[String]`(POST), `String`(GET) or `null` for `attributesToCrop` returns a [bad_request](0061-error-format-and-definitions.md#bad_request) error.

 ##### 3.1.11.2. searchableAttributes
@@ -460,14 +460,182 @@ Search queries using `_geoPoint` returns a `_geoDistance` field containing the d
 - Type: Object
 - Required: False

-`_formatted` returns highlighted and cropped attributes specified in `attributesToHighlight` and/or `attributesToCrop` of a search result.
+`_formatted` is an object returned in the search response, only if at least one of the following paramaters has been set in the search query:
+- `attributesToHighlight`
+- `attributesToCrop`
+
+If `attributesToHighlight` and `attributesToCrop` are not set, `_formatted` is not returned.
+
+This `_formatted` object will be present in the each returned document in the `hits` field.
+
+Example:
+
+```json
+{
+    "q": "",
+    "attributesToCrop": ["title"]
+}
+```
+
+```json
+{
+    "hits": [
+        {
+            "id": 2,
+            "title": "Pride and Prejudice",
+            "_formatted": {
+                "id": "2",
+                "title": "Pride and Prejudice"
+            }
+        },
+        {
+            "id": 456,
+            "title": "Le Petit Prince",
+            "_formatted": {
+                "id": "456",
+                "title": "Le Petit Prince",
+            }
+        }
+    ],
+    ...
+}
+```
+
+Which attributes are present in `_formatted`?
+
+The `_formatted` object will contain attributes coming from the original document, depending on the parameters the users set during the search query. Indeed, **the attributes present in `_formatted` is the addition of the attributes present in `attributesToRetrieve`, `attributesToHighlight`, and `attributesToCrop`**.
+
+Kmowing the default value of `attributesToRetrieve` is `["*"]` (so all the attributes present in `displayedAttributes`), if no `attributesToRetrieve` are set in the search query, `_formatted` will return all the `displayedAttributes`.
+
+Returning attributes in the `_formatted` object does not mean these attributes will be necessarily highlighted or cropped, see the next point.

-- If `attributesToHighlight` and `attributesToCrop` are not set, `_formatted` is not returned.
+Which attributes are highlighted or cropped in `_formatted`?
+
+No matter how many attributes are retrieved in `_formatted` following the previous rule:
+- Only the attributes present in `attributesToHighlight` are highlighted.
+- Only the attributes present in `attributesToCrop` are cropped.
+- Attributes present in both are cropped and highlighted at the same time.
+
+Some edge cases:
 - If cumulated fields in `attributesToHighlight` and `attributesToCrop` resolve to only having non-existent fields, `_formatted` is not returned.
-- If `attributesToRetrieve` is equal to `*` and `attributesToHighlight` or `attributesToCrop` are equals to `*`, `_formatted` is returned and contains `displayedAttributes` setting fields then compute highlights and crops on each received fields.
-- If `attributesToRetrieve` is equal to `*` and `attributesToHighlight` or `attributesToCrop` contains a set of fields, `_formatted` is returned and contains `displayedAttributes` setting fields but only compute highlights and crops on fields declared in `attributesToHighlight` or `attributesToCrop`.
-- If a list of fields is defined for `attributesToRetrieve` and `attributesToHighlight` / `attributesToCrop` are equals to `*`, `_formatted` is returned and contains `displayedAttributes` setting fields then compute highlights and crops on each received fields.
-- If a list of fields is defined for `attributesToRetrieve` and `attributesToHighlight` / `attributesToCrop` contains a list of fields, `_formatted` is returned and contains `attributesToRetrieve` fields, plus the fields set in `attributesToHighlight` or `attributesToCrop` then compute highlights and crops only for fields defined in `attributesToHighlight` / `attributesToCrop` parameters.
+
+Some examples:
+*The examples work the same with `attributesToCrop`*
+
+Example 1:
+
+```json
+{
+    "q": "t",
+    "attributesToHighlight": ["title"]
+}
+```
+
+```json
+{
+    "hits": [
+        {
+            "id": 1,
+            "title": "The Hobbit",
+            "author": "J. R. R. Tolkien",
+            "_formatted": {
+                "id": "1",
+                "title": "The Hobbit",
+                "author": "J. R. R. Tolkien"
+            }
+        }
+    ],
+    ...
+}
+```
+-> All the attributes (so `id`, `title` and `author`) are returned in `_formatted` because by default `attributesToRetrieve` is set to `["*"]`.
+-> Only `title` is highlighted.
+
+Example 2:
+
+```json
+{
+    "q": "t",
+    "attributesToHighlight": ["*"]
+}
+```
+
+```json
+{
+    "hits": [
+        {
+            "id": 1,
+            "title": "The Hobbit",
+            "author": "J. R. R. Tolkien",
+            "_formatted": {
+                "id": "1",
+                "title": "The Hobbit",
+                "author": "J. R. R. Tolkien"
+            }
+        }
+    ],
+    ...
+}
+```
+-> `id`, `title` and `author` are returned in `_formatted` because`attributesToHighlight` is set to `["*"]` (but also `attributesToRetrieve` by default).
+-> Both `title` and `author` are highlighted because `attributesToHighlight` is set to `["*"]`.
+
+Example 3:
+
+```json
+{
+    "q": "t",
+    "attributesToRetrieve": ["author"],
+    "attributesToHighlight": ["title"]
+}
+```
+
+```json
+{
+    "hits": [
+        {
+            "author": "J. R. R. Tolkien",
+            "_formatted": {
+                "title": "The Hobbit",
+                "author": "J. R. R. Tolkien"
+            }
+        }
+    ],
+    ...
+}
+```
+-> Only `author` is returned at the root of the document because defined in the `attributesToRetrieve`.
+-> Only `author` and `title` are returned in `_formatted` because the addition of `attributesToRetrieve` and `attributesToHighlight`.
+-> Only `title` is highlighted because the only one defined in `attributesToHighlight`.
+
+Example 4:
+
+```json
+{
+    "q": "t",
+    "attributesToRetrieve": [],
+    "attributesToHighlight": ["*"]
+}
+```
+
+```json
+{
+    "hits": [
+        {
+            "_formatted": {
+                "id": "1",
+                "title": "The Hobbit",
+                "author": "J. R. R. Tolkien"
+            }
+        }
+    ],
+    ...
+}
+```
+-> No attributes are returned at the root of the document because `attributesToRetrieve` is set to `[]`.
+-> All the attributes are returned in `_formatted` because `attributesToHighlight` is set to `["*"]`.
+-> All the attributes are highlighted because `attributesToHighlight` is set to `["*"]`.
+

 ###### 3.2.1.1.3. `_matchesInfo`

@@ -566,4 +734,4 @@ n/a
 - Move `attributesToHighlight`, `highlightPreTag`, `highlightPostTag`, `attributesToCrop`, `cropLength` and `cropMarker` into a `formatter` objet.
 - Add an option to only highlight complete query term.
 - Expose the `formatter` resource as an index setting.
-- Highlight a phrase search as a single highlighted section.
\ No newline at end of file
+- Highlight a phrase search as a single highlighted section.

From 3387bfc14ed13c37c856f9887b13f285ec7830d2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9mentine=20Urquizar=20-=20curqui?=
Date: Tue, 3 May 2022 10:21:30 +0200
Subject: [PATCH 2/7] Update text/0118-search-api.md

Co-authored-by: Tamo
---
 text/0118-search-api.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0118-search-api.md b/text/0118-search-api.md
index 64f3def9..9351a6b2 100644
--- a/text/0118-search-api.md
+++ b/text/0118-search-api.md
@@ -466,7 +466,7 @@ Search queries using `_geoPoint` returns a `_geoDistance` field containing the d

 If `attributesToHighlight` and `attributesToCrop` are not set, `_formatted` is not returned.

-This `_formatted` object will be present in the each returned document in the `hits` field.
+This `_formatted` object will be present in each returned document in the `hits` field.

 Example:


From af4ef54891ec651956447b90ce6e827b611740b6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9mentine=20Urquizar=20-=20curqui?=
Date: Tue, 3 May 2022 10:22:34 +0200
Subject: [PATCH 3/7] Update text/0118-search-api.md

Co-authored-by: Tamo
---
 text/0118-search-api.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0118-search-api.md b/text/0118-search-api.md
index 9351a6b2..00886ed6 100644
--- a/text/0118-search-api.md
+++ b/text/0118-search-api.md
@@ -503,7 +503,7 @@ Example:

 Which attributes are present in `_formatted`?

-The `_formatted` object will contain attributes coming from the original document, depending on the parameters the users set during the search query. Indeed, **the attributes present in `_formatted` is the addition of the attributes present in `attributesToRetrieve`, `attributesToHighlight`, and `attributesToCrop`**.
+The `_formatted` object will contain attributes coming from the original document, depending on the parameters the users set during the search query. Indeed, **the attributes present in `_formatted` are the addition of the attributes present in `attributesToRetrieve`, `attributesToHighlight`, and `attributesToCrop`**.

 Kmowing the default value of `attributesToRetrieve` is `["*"]` (so all the attributes present in `displayedAttributes`), if no `attributesToRetrieve` are set in the search query, `_formatted` will return all the `displayedAttributes`.

From e2d0676883f6c93d8666a981659a9c750d9a2062 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9mentine=20Urquizar=20-=20curqui?=
Date: Tue, 3 May 2022 13:58:49 +0200
Subject: [PATCH 4/7] Update text/0118-search-api.md

Co-authored-by: Tamo
---
 text/0118-search-api.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/text/0118-search-api.md b/text/0118-search-api.md
index 00886ed6..510e260f 100644
--- a/text/0118-search-api.md
+++ b/text/0118-search-api.md
@@ -472,7 +472,6 @@ Example:

 ```json
 {
-    "q": "",
     "attributesToCrop": ["title"]
 }
 ```

From 2b42b6a425029e1e31a95d154005f21136423b00 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9mentine=20Urquizar=20-=20curqui?=
Date: Tue, 3 May 2022 14:00:06 +0200
Subject: [PATCH 5/7] Update text/0118-search-api.md

Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com>
---
 text/0118-search-api.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0118-search-api.md b/text/0118-search-api.md
index 510e260f..a9219be8 100644
--- a/text/0118-search-api.md
+++ b/text/0118-search-api.md
@@ -502,7 +502,7 @@ Example:

 Which attributes are present in `_formatted`?

-The `_formatted` object will contain attributes coming from the original document, depending on the parameters the users set during the search query. Indeed, **the attributes present in `_formatted` are the addition of the attributes present in `attributesToRetrieve`, `attributesToHighlight`, and `attributesToCrop`**.
+The `_formatted` object contains attributes coming from the original document, depending on the parameters the users set during the search query. Indeed, **`_formatted` contains all the attributes present in `attributesToRetrieve`, `attributesToHighlight`, and `attributesToCrop` combined**.

 Kmowing the default value of `attributesToRetrieve` is `["*"]` (so all the attributes present in `displayedAttributes`), if no `attributesToRetrieve` are set in the search query, `_formatted` will return all the `displayedAttributes`.

From 3cd78b8dfd3b9343b41de39ce5fa405ff89ffeae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9mentine=20Urquizar=20-=20curqui?=
Date: Tue, 3 May 2022 14:00:12 +0200
Subject: [PATCH 6/7] Update text/0118-search-api.md

Co-authored-by: cvermand <33010418+bidoubiwa@users.noreply.github.com>
---
 text/0118-search-api.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0118-search-api.md b/text/0118-search-api.md
index a9219be8..a76e23fe 100644
--- a/text/0118-search-api.md
+++ b/text/0118-search-api.md
@@ -504,7 +504,7 @@ Which attributes are present in `_formatted`?

 The `_formatted` object contains attributes coming from the original document, depending on the parameters the users set during the search query. Indeed, **`_formatted` contains all the attributes present in `attributesToRetrieve`, `attributesToHighlight`, and `attributesToCrop` combined**.

-Kmowing the default value of `attributesToRetrieve` is `["*"]` (so all the attributes present in `displayedAttributes`), if no `attributesToRetrieve` are set in the search query, `_formatted` will return all the `displayedAttributes`.
+Knowing the default value of `attributesToRetrieve` is `["*"]` (so all the attributes present in `displayedAttributes`), if no `attributesToRetrieve` are set in the search query, `_formatted` will return all the `displayedAttributes`.

 Returning attributes in the `_formatted` object does not mean these attributes will be necessarily highlighted or cropped, see the next point.

From 8cd797b29c05bc5d51445d34e4bc94fdc9434e47 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Cl=C3=A9mentine=20Urquizar?=
Date: Tue, 3 May 2022 17:33:13 +0200
Subject: [PATCH 7/7] Improve according to reviews

---
 text/0118-search-api.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/text/0118-search-api.md b/text/0118-search-api.md
index a76e23fe..ba8f4361 100644
--- a/text/0118-search-api.md
+++ b/text/0118-search-api.md
@@ -502,6 +502,8 @@ Example:

 Which attributes are present in `_formatted`?

+*Remember the main rule: `_formatted` is only present if `attributesToHighlight` or `attributesToCrop` is set.*
+
 The `_formatted` object contains attributes coming from the original document, depending on the parameters the users set during the search query. Indeed, **`_formatted` contains all the attributes present in `attributesToRetrieve`, `attributesToHighlight`, and `attributesToCrop` combined**.

 Knowing the default value of `attributesToRetrieve` is `["*"]` (so all the attributes present in `displayedAttributes`), if no `attributesToRetrieve` are set in the search query, `_formatted` will return all the `displayedAttributes`.
@@ -510,7 +512,7 @@ Returning attributes in the `_formatted` object does not mean these attributes w

 Which attributes are highlighted or cropped in `_formatted`?

-No matter how many attributes are retrieved in `_formatted` following the previous rule:
+No matter which attributes are retrieved in `_formatted` (according to the previous section "Which attributes are present in `_formatted`?"):
 - Only the attributes present in `attributesToHighlight` are highlighted.
 - Only the attributes present in `attributesToCrop` are cropped.
 - Attributes present in both are cropped and highlighted at the same time.
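
Reviewer note: the rule this series converges on (`_formatted` is present only when `attributesToHighlight` or `attributesToCrop` is set, and then contains the combined attributes of `attributesToRetrieve`, `attributesToHighlight`, and `attributesToCrop`) can be sketched as a small model. This is not the engine's implementation. The helper names `expand` and `formatted_attributes` are hypothetical. The sketch also assumes that "non-existent" means "absent from `displayedAttributes`", and that attributes come back in `displayedAttributes` order.

```python
def expand(attrs, displayed):
    """Resolve a parameter value to concrete attribute names (hypothetical helper)."""
    if attrs is None:
        return []
    requested = list(attrs)
    if "*" in requested:
        # "*" stands for every attribute in displayedAttributes.
        return list(displayed)
    # Assumption: attributes missing from displayedAttributes are ignored.
    return [name for name in displayed if name in requested]


def formatted_attributes(displayed, to_retrieve=("*",), to_highlight=None, to_crop=None):
    """Return the attribute names expected in `_formatted`, or None when it is absent."""
    # Main rule: no highlight or crop parameter means no `_formatted` object.
    if to_highlight is None and to_crop is None:
        return None
    highlighted = expand(to_highlight, displayed)
    cropped = expand(to_crop, displayed)
    # Edge case from the spec: only non-existent fields were requested.
    if not highlighted and not cropped:
        return None
    wanted = set(expand(to_retrieve, displayed)) | set(highlighted) | set(cropped)
    # Assumption: keep displayedAttributes order for a stable result.
    return [name for name in displayed if name in wanted]


if __name__ == "__main__":
    displayed = ["id", "title", "author"]
    # Mirrors Example 3 of the spec: retrieve `author`, highlight `title`.
    print(formatted_attributes(displayed, to_retrieve=["author"], to_highlight=["title"]))
```

Under these assumptions, the four examples in the spec and the "non-existent fields" edge case all fall out of the same set union, which is the simplification the patch makes over the four removed bullet points.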