Wait for shards to be active after closing indices #38854

tlrx · 2019-02-13T16:37:20Z

Note: this PR will be merged into the replicated-closed-indices feature branch

This pull request adds a wait_for_active_shards parameter to the Close Index API. The parameter makes the request wait for the shards of the closed indices to be active before a response is returned.

Relates #33888
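To make the new parameter concrete, here is a minimal sketch of how a client might build the REST URI for a close request that waits for shards. The `CloseIndexUri` class, the `build` helper, the base URL and the index name are hypothetical illustrations and are not part of this PR. The sketch assumes that the parameter is passed as a query-string value such as `all` or a shard count.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Elasticsearch: it only assembles the URI
// of a Close Index request and performs no network call.
final class CloseIndexUri {

    /**
     * Builds the URI of a Close Index request.
     *
     * @param baseUrl             cluster address, e.g. "http://localhost:9200" (assumed)
     * @param index               name of the index to close
     * @param waitForActiveShards value of wait_for_active_shards, or null to omit
     *                            the parameter and rely on the server default
     */
    static URI build(String baseUrl, String index, String waitForActiveShards) {
        String path = baseUrl + "/" + URLEncoder.encode(index, StandardCharsets.UTF_8) + "/_close";
        if (waitForActiveShards == null) {
            return URI.create(path);
        }
        String value = URLEncoder.encode(waitForActiveShards, StandardCharsets.UTF_8);
        return URI.create(path + "?wait_for_active_shards=" + value);
    }

    public static void main(String[] args) {
        // Prints http://localhost:9200/my-index/_close?wait_for_active_shards=all
        System.out.println(build("http://localhost:9200", "my-index", "all"));
    }
}
```

Passing `null` leaves the parameter out, so whatever default the server applies is used unchanged.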

elasticmachine · 2019-02-13T16:37:22Z

Pinging @elastic/es-distributed

tlrx · 2019-02-13T16:38:17Z

rest-api-spec/src/main/resources/rest-api-spec/test/cat.indices/10_basic.yml

@@ -8,6 +8,9 @@
      $body: |
               /^$/

+---
+"Test cat indices output":
+


This test is split because replicated closed indices can report docs stats while non-replicated closed indices cannot.

server/src/main/java/org/elasticsearch/action/admin/indices/close/CloseIndexRequest.java

server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexStateService.java


tlrx · 2019-02-15T14:54:44Z

Thanks @ywelsch. I picked up my work from another change so that CloseIndexResponse now extends ShardsAcknowledgedResponse in this PR.

I also reverted the cat.indices YAML test and adapted it to work on all versions, because this test also runs on mixed-version clusters and, depending on the shard allocation, it may or may not report some stats.
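The CloseIndexResponse change mentioned above can be pictured with the following simplified sketch. The three classes are stand-ins that reuse the Elasticsearch class names for readability; they are not the actual implementations, and their constructors and the `describe` helper are assumptions made for illustration only.

```java
// Simplified stand-ins, NOT the real Elasticsearch classes.
class AcknowledgedResponse {
    private final boolean acknowledged;

    AcknowledgedResponse(boolean acknowledged) {
        this.acknowledged = acknowledged;
    }

    // True when the cluster state change (here: closing the index) was acknowledged.
    boolean isAcknowledged() {
        return acknowledged;
    }
}

class ShardsAcknowledgedResponse extends AcknowledgedResponse {
    private final boolean shardsAcknowledged;

    ShardsAcknowledgedResponse(boolean acknowledged, boolean shardsAcknowledged) {
        super(acknowledged);
        this.shardsAcknowledged = shardsAcknowledged;
    }

    // True when the requested number of shard copies became active before the timeout.
    boolean isShardsAcknowledged() {
        return shardsAcknowledged;
    }
}

class CloseIndexResponse extends ShardsAcknowledgedResponse {
    CloseIndexResponse(boolean acknowledged, boolean shardsAcknowledged) {
        super(acknowledged, shardsAcknowledged);
    }

    // Hypothetical helper that summarizes the two flags for a caller.
    static String describe(CloseIndexResponse response) {
        if (!response.isAcknowledged()) {
            return "close not acknowledged";
        }
        return response.isShardsAcknowledged()
            ? "closed, shards active"
            : "closed, shards not yet active";
    }
}
```

With this shape a caller can tell apart "the index was closed" from "the index was closed and its shards are active again", which is the distinction the wait_for_active_shards parameter introduces.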

ywelsch

LGTM

tlrx · 2019-02-26T17:35:47Z

Thanks @ywelsch. I merged this with a small adaptation of the CloseFollowerIndexIT test.

Before this change, closed indexes were simply not replicated. It was therefore possible to close an index and then decommission a data node without knowing that this data node contained shards of the closed index, potentially leading to data loss. Shards of closed indices were not completely taken into account when balancing the shards within the cluster, or automatically replicated through shard copies, and they were not easily movable from node A to node B using APIs like Cluster Reroute without being fully reopened and closed again.

This commit changes the logic executed when closing an index, so that its shards are not just removed and forgotten but are instead reinitialized and reallocated on data nodes using an engine implementation which does not allow searching or indexing, which has a low memory overhead (compared with searchable/indexable opened shards) and which allows shards to be recovered from peer or promoted as primaries when needed.

This new closing logic is built on top of the new Close Index API introduced in 6.7.0 (#37359). Some pre-closing sanity checks are executed on the shards before closing them, and closing an index on an 8.0 cluster will reinitialize the index shards and therefore impact the cluster health.

Some APIs have been adapted to make them work with closed indices:
- Cluster Health API
- Cluster Reroute API
- Cluster Allocation Explain API
- Recovery API
- Cat Indices
- Cat Shards
- Cat Health
- Cat Recovery

This commit contains all the following changes (most recent first):
* c6c42a1 Adapt NoOpEngineTests after #39006
* 3f9993d Wait for shards to be active after closing indices (#38854)
* 5e7a428 Adapt the Cluster Health API to closed indices (#39364)
* 3e61939 Adapt CloseFollowerIndexIT for replicated closed indices (#38767)
* 71f5c34 Recover closed indices after a full cluster restart (#39249)
* 4db7fd9 Adapt the Recovery API for closed indices (#38421)
* 4fd1bb2 Adapt more tests suites to closed indices (#39186)
* 0519016 Add replica to primary promotion test for closed indices (#39110)
* b756f6c Test the Cluster Shard Allocation Explain API with closed indices (#38631)
* c484c66 Remove index routing table of closed indices in mixed versions clusters (#38955)
* 00f1828 Mute CloseFollowerIndexIT.testCloseAndReopenFollowerIndex()
* e845b0a Do not schedule Refresh/Translog/GlobalCheckpoint tasks for closed indices (#38329)
* cf9a015 Adapt testIndexCanChangeCustomDataPath for replicated closed indices (#38327)
* b9becdd Adapt testPendingTasks() for replicated closed indices (#38326)
* 02cc730 Allow shards of closed indices to be replicated as regular shards (#38024)
* e53a9be Fix compilation error in IndexShardIT after merge with master
* cae4155 Relax NoOpEngine constraints (#37413)
* 54d110b [RCI] Adapt NoOpEngine to latest FrozenEngine changes
* c63fd69 [RCI] Add NoOpEngine for closed indices (#33903)

Relates to #33888

This commit adds a `wait_for_active_shards` parameter to the Close Index API. The parameter makes the request wait for the shards of the closed indices to be active before a response is returned.

Relates elastic#33888

Backport support for replicating closed indices (#39499)

Wait for active shards when closing indices

4f896d0

tlrx added >enhancement :Distributed Indexing/Distributed labels Feb 13, 2019

tlrx commented Feb 13, 2019

ywelsch suggested changes Feb 15, 2019

server/src/main/java/org/elasticsearch/action/admin/indices/close/CloseIndexRequest.java (outdated, resolved)

server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexStateService.java (outdated, resolved)

tlrx added 3 commits February 15, 2019 13:48

Improve CloseIndexResponse

b029048

Revert change in cat.indices

9dbd7ac

Adapt cat.indices YAML test to pass on replicated/non replicated clusters

75479f6

tlrx requested a review from ywelsch February 15, 2019 15:09

tlrx added 2 commits February 18, 2019 14:37

Adapt CloseFollowerIndexStepTests

109a3bd

Merge branch 'replicated-closed-indices' into rci-wait-for-active-shards

9312f33

ywelsch approved these changes Feb 25, 2019

Merge branch 'replicated-closed-indices' into rci-wait-for-active-shards

4a75e7f

tlrx mentioned this pull request Feb 26, 2019

Replicate closed indices #33888

Closed

50 tasks

Fix CloseFollowerIndexIT

e559468

tlrx merged commit 3f9993d into elastic:replicated-closed-indices Feb 26, 2019

tlrx deleted the rci-wait-for-active-shards branch February 26, 2019 17:34

tlrx mentioned this pull request Feb 28, 2019

Add support for replicating closed indices #39499

Merged

tlrx mentioned this pull request Dec 17, 2020

Undocumented(?) change in default for wait_for_active_shards on close index requests #66419

Closed
