From 48a39774ed61de3b5e22c03907eeb3cecdf32823 Mon Sep 17 00:00:00 2001 From: David Turner Date: Wed, 18 Jul 2018 09:40:55 +0100 Subject: [PATCH] Improve docs for search preferences (#32098) Today it is unclear what guarantees are offered by the search preference feature, and we claim a guarantee that is stronger than what we really offer: > A custom value will be used to guarantee that the same shards will be used > for the same custom value. This commit clarifies this documentation and explains more clearly why `_primary`, `_replica`, etc. are deprecated in `6.x` and removed in `master`. Relates #31929 #26335 #26791. --- .../search/request/preference.asciidoc | 106 ++++++++++++------ 1 file changed, 73 insertions(+), 33 deletions(-) diff --git a/docs/reference/search/request/preference.asciidoc b/docs/reference/search/request/preference.asciidoc index 325730f70a981..9e799f7377d4c 100644 --- a/docs/reference/search/request/preference.asciidoc +++ b/docs/reference/search/request/preference.asciidoc @@ -1,55 +1,76 @@ [[search-request-preference]] === Preference -Controls a `preference` of which shard copies on which to execute the -search. By default, the operation is randomized among the available shard copies. +Controls a `preference` of which shard copies on which to execute the search. +By default, Elasticsearch selects from the available shard copies in an +unspecified order, taking the <> and +<> configuration into +account. However, it may sometimes be desirable to try and route certain +searches to certain sets of shard copies, for instance to make better use of +per-copy caches. The `preference` is a query string parameter which can be set to: [horizontal] -`_primary`:: - The operation will go and be executed only on the primary - shards. deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`] +`_primary`:: + The operation will be executed only on primary shards. deprecated[6.1.0, + will be removed in 7.0. See the warning below for more information.] -`_primary_first`:: - The operation will go and be executed on the primary - shard, and if not available (failover), will execute on other shards. - deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`] +`_primary_first`:: + The operation will be executed on primary shards if possible, but will fall + back to other shards if not. deprecated[6.1.0, will be removed in 7.0. See + the warning below for more information.] `_replica`:: - The operation will go and be executed only on a replica shard. - deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`] + The operation will be executed only on replica shards. If there are multiple + replicas then the order of preference between them is unspecified. + deprecated[6.1.0, will be removed in 7.0. See the warning below for more + information.] `_replica_first`:: - The operation will go and be executed only on a replica shard, and if - not available (failover), will execute on other shards. - deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`] + The operation will be executed on replica shards if possible, but will fall + back to other shards if not. If there are multiple replicas then the order of + preference between them is unspecified. deprecated[6.1.0, will be removed in + 7.0. See the warning below for more information.] -`_local`:: - The operation will prefer to be executed on a local - allocated shard if possible. +`_only_local`:: + The operation will be executed only on shards allocated to the local + node. + +`_local`:: + The operation will be executed on shards allocated to the local node if + possible, and will fall back to other shards if not. `_prefer_nodes:abc,xyz`:: - Prefers execution on the nodes with the provided - node ids (`abc` or `xyz` in this case) if applicable. + The operation will be executed on nodes with one of the provided node + ids (`abc` or `xyz` in this case) if possible. If suitable shard copies + exist on more than one of the selected nodes then the order of + preference between these copies is unspecified. -`_shards:2,3`:: - Restricts the operation to the specified shards. (`2` - and `3` in this case). This preference can be combined with other - preferences but it has to appear first: `_shards:2,3|_local` +`_shards:2,3`:: + Restricts the operation to the specified shards. (`2` and `3` in this + case). This preference can be combined with other preferences but it + has to appear first: `_shards:2,3|_local` -`_only_nodes`:: - Restricts the operation to nodes specified in <> +`_only_nodes:abc*,x*yz,...`:: + Restricts the operation to nodes specified according to the + <>. If suitable shard copies exist on more + than one of the selected nodes then the order of preference between + these copies is unspecified. -Custom (string) value:: - A custom value will be used to guarantee that - the same shards will be used for the same custom value. This can help - with "jumping values" when hitting different shards in different refresh - states. A sample value can be something like the web session id, or the - user name. +Custom (string) value:: + Any value that does not start with `_`. If two searches both give the same + custom string value for their preference and the underlying cluster state + does not change then the same ordering of shards will be used for the + searches. This does not guarantee that the exact same shards will be used + each time: the cluster state, and therefore the selected shards, may change + for a number of reasons including shard relocations and shard failures, and + nodes may sometimes reject searches causing fallbacks to alternative nodes. + However, in practice the ordering of shards tends to remain stable for long + periods of time. A good candidate for a custom preference value is something + like the web session id or the user name. -For instance, use the user's session ID to ensure consistent ordering of results -for the user: +For instance, use the user's session ID `xyzabc123` as follows: [source,js] ------------------------------------------------ @@ -64,3 +85,22 @@ GET /_search?preference=xyzabc123 ------------------------------------------------ // CONSOLE +NOTE: The `_only_local` preference guarantees only to use shard copies on the +local node, which is sometimes useful for troubleshooting. All other options do +not _fully_ guarantee that any particular shard copies are used in a search, +and on a changing index this may mean that repeated searches may yield +different results if they are executed on different shard copies which are in +different refresh states. + +WARNING: The `_primary`, `_primary_first`, `_replica` and `_replica_first` are +deprecated as their use is not recommended. They do not help to avoid +inconsistent results that arise from the use of shards that have different +refresh states, and Elasticsearch uses synchronous replication so the primary +does not in general hold fresher data than its replicas. The `_primary_first` +and `_replica_first` preferences silently fall back to non-preferred copies if +it is not possible to search the preferred copies. The `_primary` and +`_replica` preferences will silently change their preferred shards if a replica +is promoted to primary, which can happen at any time. The `_primary` preference +can also put undesirable extra load on the primary shards. The cache-related +benefits of these options can also be obtained using `_only_nodes`, +`_prefer_nodes`, or a custom string value instead.