Merge branch 'main' into synthetic-source/single-element-arrays
kkrik-es committed Oct 8, 2024
2 parents e626324 + 10f6f25 commit f95c5dc
Showing 189 changed files with 6,188 additions and 2,656 deletions.
5 changes: 5 additions & 0 deletions docs/changelog/111336.yaml
@@ -0,0 +1,5 @@
pr: 111336
summary: Use the same chunking configurations for models in the Elasticsearch service
area: Machine Learning
type: enhancement
issues: []
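
For context, a minimal sketch of creating an endpoint in the Elasticsearch service with an explicit chunking configuration; the endpoint name is hypothetical and the exact `chunking_settings` fields shown are an assumption for illustration:

[source,console]
----
PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": ".elser_model_2"
  },
  // the same chunking configuration now applies across Elasticsearch-service models
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}
----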
5 changes: 5 additions & 0 deletions docs/changelog/112933.yaml
@@ -0,0 +1,5 @@
pr: 112933
summary: "Allow incubating Panama Vector in simdvec, and add vectorized `ipByteBin`"
area: Search
type: enhancement
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/113812.yaml
@@ -0,0 +1,5 @@
pr: 113812
summary: Add Streaming Inference spec
area: Machine Learning
type: enhancement
issues: []
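
As a hedged sketch of what the spec covers, streaming completions go through a dedicated `_stream` route; the endpoint ID below is hypothetical and the exact path is an assumption based on this entry:

[source,console]
----
// Streams results incrementally rather than as a single response
POST _inference/completion/my-completion-endpoint/_stream
{
  "input": "What is Elastic?"
}
----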
5 changes: 5 additions & 0 deletions docs/changelog/114002.yaml
@@ -0,0 +1,5 @@
pr: 114002
summary: Add a `mustache.max_output_size_bytes` setting to limit the length of results from mustache scripts
area: Infra/Scripting
type: enhancement
issues: []
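
A minimal sketch of the new limit as a node setting in `elasticsearch.yml`; the value is illustrative, and treating it as a static per-node setting is an assumption here:

[source,yaml]
----
# Hypothetical example: cap rendered search-template output at 1 MB
mustache.max_output_size_bytes: 1048576
----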
5 changes: 5 additions & 0 deletions docs/changelog/114080.yaml
@@ -0,0 +1,5 @@
pr: 114080
summary: Stream Cohere Completion
area: Machine Learning
type: enhancement
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/114177.yaml
@@ -0,0 +1,5 @@
pr: 114177
summary: "Make `randomInstantBetween` always return a value in the range [`minInstant`, `maxInstant`]"
area: Infra/Metrics
type: bug
issues: []
17 changes: 17 additions & 0 deletions docs/changelog/114231.yaml
@@ -0,0 +1,17 @@
pr: 114231
summary: Remove cluster state from `/_cluster/reroute` response
area: Allocation
type: breaking
issues:
- 88978
breaking:
title: Remove cluster state from `/_cluster/reroute` response
area: REST API
details: >-
The `POST /_cluster/reroute` API no longer returns the cluster state in its
response. The `?metric` query parameter to this API now has no effect and
its use will be forbidden in a future version.
impact: >-
Cease usage of the `?metric` query parameter when calling the
`POST /_cluster/reroute` API.
notable: false
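
A short example of the new usage; the index and node names are hypothetical. Per this entry, the response no longer includes the cluster state:

[source,console]
----
// No ?metric parameter: the cluster state is no longer returned
POST /_cluster/reroute
{
  "commands": [
    {
      "move": {
        "index": "my-index",
        "shard": 0,
        "from_node": "node-1",
        "to_node": "node-2"
      }
    }
  ]
}
----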
4 changes: 2 additions & 2 deletions docs/reference/cluster/reroute.asciidoc
@@ -10,7 +10,7 @@ Changes the allocation of shards in a cluster.
[[cluster-reroute-api-request]]
==== {api-request-title}

`POST /_cluster/reroute?metric=none`
`POST /_cluster/reroute`

[[cluster-reroute-api-prereqs]]
==== {api-prereq-title}
@@ -193,7 +193,7 @@ This is a short example of a simple reroute API call:

[source,console]
--------------------------------------------------
POST /_cluster/reroute?metric=none
POST /_cluster/reroute
{
"commands": [
{
2 changes: 1 addition & 1 deletion docs/reference/commands/shard-tool.asciidoc
@@ -95,7 +95,7 @@ Changing allocation id V8QXk-QXSZinZMT-NvEq4w to tjm9Ve6uTBewVFAlfUMWjA
You should run the following command to allocate this shard:
POST /_cluster/reroute?metric=none
POST /_cluster/reroute
{
"commands" : [
{
@@ -116,7 +116,7 @@ PUT _connector/my-connector
"name": "My Connector",
"description": "My Connector to sync data to Elastic index from Google Drive",
"service_type": "google_drive",
"language": "english"
"language": "en"
}
----

40 changes: 24 additions & 16 deletions docs/reference/connector/docs/connectors-zoom.asciidoc
@@ -63,18 +63,22 @@ To connect to Zoom you need to https://developers.zoom.us/docs/internal-apps/s2s
6. Click on the "Create" button to create the app registration.
7. After the registration is complete, you will be redirected to the app's overview page. Take note of the "App Credentials" value, as you'll need it later.
8. Navigate to the "Scopes" section and click on the "Add Scopes" button.
9. The following scopes need to be added to the app.
9. The following granular scopes need to be added to the app.
+
[source,bash]
----
user:read:admin
meeting:read:admin
chat_channel:read:admin
recording:read:admin
chat_message:read:admin
report:read:admin
user:read:list_users:admin
meeting:read:list_meetings:admin
meeting:read:list_past_participants:admin
cloud_recording:read:list_user_recordings:admin
team_chat:read:list_user_channels:admin
team_chat:read:list_user_messages:admin
----
[NOTE]
====
The connector requires a minimum scope of `user:read:list_users:admin` to ingest data into Elasticsearch.
====
+
10. Click on the "Done" button to add the selected scopes to your app.
11. Navigate to the "Activation" section and input the necessary information to activate the app.
@@ -220,18 +224,22 @@ To connect to Zoom you need to https://developers.zoom.us/docs/internal-apps/s2s
6. Click on the "Create" button to create the app registration.
7. After the registration is complete, you will be redirected to the app's overview page. Take note of the "App Credentials" value, as you'll need it later.
8. Navigate to the "Scopes" section and click on the "Add Scopes" button.
9. The following scopes need to be added to the app.
9. The following granular scopes need to be added to the app.
+
[source,bash]
----
user:read:admin
meeting:read:admin
chat_channel:read:admin
recording:read:admin
chat_message:read:admin
report:read:admin
user:read:list_users:admin
meeting:read:list_meetings:admin
meeting:read:list_past_participants:admin
cloud_recording:read:list_user_recordings:admin
team_chat:read:list_user_channels:admin
team_chat:read:list_user_messages:admin
----
[NOTE]
====
The connector requires a minimum scope of `user:read:list_users:admin` to ingest data into Elasticsearch.
====
+
10. Click on the "Done" button to add the selected scopes to your app.
11. Navigate to the "Activation" section and input the necessary information to activate the app.
4 changes: 2 additions & 2 deletions docs/reference/intro.asciidoc
@@ -204,7 +204,7 @@ For general content, you have the following options for adding data to {es} indices:
If you're building a website or app, then you can call Elasticsearch APIs using an https://www.elastic.co/guide/en/elasticsearch/client/index.html[{es} client] in the programming language of your choice. If you use the Python client, then check out the `elasticsearch-labs` repo for various https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/search/python-examples[example notebooks].
* {kibana-ref}/connect-to-elasticsearch.html#upload-data-kibana[File upload]: Use the {kib} file uploader to index single files for one-off testing and exploration. The GUI guides you through setting up your index and field mappings.
* https://github.com/elastic/crawler[Web crawler]: Extract and index web page content into {es} documents.
* {enterprise-search-ref}/connectors.html[Connectors]: Sync data from various third-party data sources to create searchable, read-only replicas in {es}.
* <<es-connectors,Connectors>>: Sync data from various third-party data sources to create searchable, read-only replicas in {es}.

[discrete]
[[es-ingestion-overview-timestamped]]
@@ -492,4 +492,4 @@ and restrictions. You can review the following guides to learn how to tune your
* <<use-elasticsearch-for-time-series-data,Tune for time series data>>

Many {es} options come with different performance considerations and trade-offs. The best way to determine the
optimal configuration for your use case is through https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing[testing with your own data and queries].
optimal configuration for your use case is through https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing[testing with your own data and queries].
2 changes: 0 additions & 2 deletions docs/reference/mapping/runtime.asciidoc
@@ -821,8 +821,6 @@ address.
[[lookup-runtime-fields]]
==== Retrieve fields from related indices

experimental[]

The <<search-fields,`fields`>> parameter on the `_search` API can also be used to retrieve fields from
the related indices via runtime fields with a type of `lookup`.

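
A hedged sketch of the lookup pattern this section documents; the index and field names are hypothetical:

[source,console]
----
POST my-logs/_search
{
  "runtime_mappings": {
    "client": {
      // join against another index at search time
      "type": "lookup",
      "target_index": "clients",
      "input_field": "client_ip",
      "target_field": "ip",
      "fetch_fields": ["name", "country"]
    }
  },
  "fields": ["client"]
}
----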
89 changes: 89 additions & 0 deletions docs/reference/ml/trained-models/apis/infer-trained-model.asciidoc
@@ -225,6 +225,17 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenizatio
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]
=======
`deberta_v2`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-deberta-v2]
+
.Properties of deberta_v2
[%collapsible%open]
=======
`truncate`::::
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate-deberta-v2]
=======
`roberta`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
@@ -301,6 +312,17 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenizatio
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]
=======
`deberta_v2`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-deberta-v2]
+
.Properties of deberta_v2
[%collapsible%open]
=======
`truncate`::::
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate-deberta-v2]
=======
`roberta`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
@@ -397,6 +419,21 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenizatio
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]
=======
`deberta_v2`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-deberta-v2]
+
.Properties of deberta_v2
[%collapsible%open]
=======
`span`::::
(Optional, integer)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate-deberta-v2]
=======
`roberta`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
@@ -517,6 +554,21 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenizatio
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]
=======
`deberta_v2`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-deberta-v2]
+
.Properties of deberta_v2
[%collapsible%open]
=======
`span`::::
(Optional, integer)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate-deberta-v2]
=======
`roberta`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
@@ -608,6 +660,17 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenizatio
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]
=======
`deberta_v2`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-deberta-v2]
+
.Properties of deberta_v2
[%collapsible%open]
=======
`truncate`::::
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate-deberta-v2]
=======
`roberta`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
@@ -687,6 +750,21 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenizatio
(Optional, integer)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`with_special_tokens`::::
(Optional, boolean)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`deberta_v2`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-deberta-v2]
+
.Properties of deberta_v2
[%collapsible%open]
=======
`span`::::
(Optional, integer)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`with_special_tokens`::::
(Optional, boolean)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
@@ -790,6 +868,17 @@ include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenizatio
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]
=======
`deberta_v2`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-deberta-v2]
+
.Properties of deberta_v2
[%collapsible%open]
=======
`truncate`::::
(Optional, string)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate-deberta-v2]
=======
`roberta`::::
(Optional, object)
include::{es-ref-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
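
As an illustration of the new options, a hedged inference call that overrides `deberta_v2` truncation at request time; the model ID is hypothetical and `balanced` is assumed to be a valid `truncate` value:

[source,console]
----
POST _ml/trained_models/my-deberta-model/_infer
{
  "docs": [
    { "text_field": "The quick brown fox jumps over the lazy dog" }
  ],
  "inference_config": {
    "text_classification": {
      "tokenization": {
        "deberta_v2": {
          "truncate": "balanced"
        }
      }
    }
  }
}
----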
14 changes: 7 additions & 7 deletions docs/reference/reranking/index.asciidoc
@@ -1,12 +1,12 @@
[[re-ranking-overview]]
= Re-ranking

Many search systems are built on two-stage retrieval pipelines.
Many search systems are built on multi-stage retrieval pipelines.

The first stage uses cheap, fast algorithms to find a broad set of possible matches.
Earlier stages use cheap, fast algorithms to find a broad set of possible matches.

The second stage uses a more powerful model, often machine learning-based, to reorder the documents.
This second step is called re-ranking.
Later stages use more powerful models, often machine learning-based, to reorder the documents.
This step is called re-ranking.
Because the resource-intensive model is only applied to the smaller set of pre-filtered results, this approach returns more relevant results while still optimizing for search performance and computational costs.

{es} supports various ranking and re-ranking techniques to optimize search relevance and performance.
@@ -18,7 +18,7 @@ Because the resource-intensive model is only applied to the smaller set of pre-filtered results, this approach returns more relevant results while still optimizing for search performance and computational costs.

[float]
[[re-ranking-first-stage-pipeline]]
=== First stage: initial retrieval
=== Initial retrieval

[float]
[[re-ranking-ranking-overview-bm25]]
@@ -45,7 +45,7 @@ Hybrid search techniques combine results from full-text and vector search pipelines.

[float]
[[re-ranking-overview-second-stage]]
=== Second stage: Re-ranking
=== Re-ranking

When using the following advanced re-ranking pipelines, first-stage retrieval mechanisms effectively generate a set of candidates.
These candidates are funneled into the re-ranker to perform more computationally expensive re-ranking tasks.
@@ -67,4 +67,4 @@ Learning To Rank involves training a machine learning model to build a ranking function.
LTR is best suited for when you have ample training data and need highly customized relevance tuning.

include::semantic-reranking.asciidoc[]
include::learning-to-rank.asciidoc[]
include::learning-to-rank.asciidoc[]
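
For the re-ranking stage described above, a hedged sketch using a `text_similarity_reranker` retriever; the index and inference endpoint ID are hypothetical:

[source,console]
----
POST my-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      // earlier stage: cheap, fast BM25 retrieval
      "retriever": {
        "standard": {
          "query": { "match": { "text": "how to tune search relevance" } }
        }
      },
      // later stage: model-based re-ranking of the top candidates
      "field": "text",
      "inference_id": "my-rerank-endpoint",
      "inference_text": "how to tune search relevance",
      "rank_window_size": 100
    }
  }
}
----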
9 changes: 5 additions & 4 deletions docs/reference/rest-api/common-parms.asciidoc
@@ -1298,10 +1298,11 @@ tag::wait_for_active_shards[]
`wait_for_active_shards`::
+
--
(Optional, string) The number of shard copies that must be active before
proceeding with the operation. Set to `all` or any positive integer up
to the total number of shards in the index (`number_of_replicas+1`).
Default: 1, the primary shard.
(Optional, string) The number of copies of each shard that must be active
before proceeding with the operation. Set to `all` or any non-negative integer
up to the total number of copies of each shard in the index
(`number_of_replicas+1`). Defaults to `1`, meaning to wait just for each
primary shard to be active.

See <<index-wait-for-active-shards>>.
--
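
A short example of the parameter as clarified above; with one replica configured, `2` waits for the primary and one copy. The index name is hypothetical:

[source,console]
----
// Wait until 2 copies of each affected shard are active before indexing
PUT my-index/_doc/1?wait_for_active_shards=2
{
  "message": "hello"
}
----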