Skip to content

Commit

Permalink
Update resiliency page (#17586)
Browse files Browse the repository at this point in the history
#14252 , #7572 , #15900, #12573, #14671, #15281 and #9126 have all been closed/merged and will be part of 5.0.0.
  • Loading branch information
bleskes committed Apr 7, 2016
1 parent c0ebce0 commit 8eee28e
Showing 1 changed file with 23 additions and 21 deletions.
44 changes: 23 additions & 21 deletions docs/resiliency/index.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -94,16 +94,35 @@ space. The following issues have been identified:

Other safeguards are tracked in the meta-issue {GIT}11511[#11511].


[float]
=== Relocating shards omitted by reporting infrastructure (STATUS: ONGOING)

Indices stats and indices segments requests reach out to all nodes that have shards of that index. Shards that have relocated from a node
while the stats request arrives will make that part of the request fail and are just ignored in the overall stats result. {GIT}13719[#13719]

[float]
=== Jepsen Test Failures (STATUS: ONGOING)

We have increased our test coverage to include scenarios tested by Jepsen. We make heavy use of randomization to expand on the scenarios that can be tested and to introduce new error conditions. You can follow the work on the master branch of the https://github.com/elastic/elasticsearch/blob/master/core/src/test/java/org/elasticsearch/discovery/DiscoveryWithServiceDisruptionsIT.java[`DiscoveryWithServiceDisruptionsIT` class], where we will add more tests as time progresses.

[float]
=== Document guarantees and handling of failure (STATUS: ONGOING)

This status page is a start, but we can do a better job of explicitly documenting the processes at work in Elasticsearch, and what happens in the case of each type of failure. The plan is to have a test case that validates each behavior under simulated conditions. Every test will document the expected results, the associated test code and an explicit PASS or FAIL status for each simulated case.

== Unreleased

[float]
=== Loss of documents during network partition (STATUS: ONGOING)
=== Loss of documents during network partition (STATUS: UNRELEASED, v5.0.0)

If a network partition separates a node from the master, there is some window of time before the node detects it. The length of the window is dependent on the type of the partition. This window is extremely small if a socket is broken. More adversarial partitions, for example, silently dropping requests without breaking the socket can take longer (up to 3x30s using current defaults).

If the node hosts a primary shard at the moment of partition, and ends up being isolated from the cluster (which could have resulted in {GIT}2488[split-brain] before), some documents that are being indexed into the primary may be lost if they fail to reach one of the allocated replicas (due to the partition) and that replica is later promoted to primary by the master ({GIT}7572[#7572]).
To prevent this situation, the primary needs to wait for the master to acknowledge replica shard failures before acknowledging the write to the client. {GIT}14252[#14252]

[float]
=== Safe primary relocations (STATUS: ONGOING)
=== Safe primary relocations (STATUS: UNRELEASED, v5.0.0)

When primary relocation completes, a cluster state is propagated that deactivates the old primary and marks the new primary as active. As
cluster state changes are not applied synchronously on all nodes, there can be a time interval where the relocation target has processed the
Expand All @@ -117,23 +136,7 @@ on the relocation target, each of the nodes believes the other to be the active
chasing the primary being quickly sent back and forth between the nodes, potentially making them both go OOM. {GIT}12573[#12573]

[float]
=== Relocating shards omitted by reporting infrastructure (STATUS: ONGOING)

Indices stats and indices segments requests reach out to all nodes that have shards of that index. Shards that have relocated from a node
while the stats request arrives will make that part of the request fail and are just ignored in the overall stats result. {GIT}13719[#13719]

[float]
=== Jepsen Test Failures (STATUS: ONGOING)

We have increased our test coverage to include scenarios tested by Jepsen. We make heavy use of randomization to expand on the scenarios that can be tested and to introduce new error conditions. You can follow the work on the master branch of the https://github.com/elastic/elasticsearch/blob/master/core/src/test/java/org/elasticsearch/discovery/DiscoveryWithServiceDisruptionsIT.java[`DiscoveryWithServiceDisruptionsIT` class], where we will add more tests as time progresses.

[float]
=== Document guarantees and handling of failure (STATUS: ONGOING)

This status page is a start, but we can do a better job of explicitly documenting the processes at work in Elasticsearch, and what happens in the case of each type of failure. The plan is to have a test case that validates each behavior under simulated conditions. Every test will document the expected results, the associated test code and an explicit PASS or FAIL status for each simulated case.

[float]
=== Do not allow stale shards to automatically be promoted to primary (STATUS: ONGOING, v5.0.0)
=== Do not allow stale shards to automatically be promoted to primary (STATUS: UNRELEASED, v5.0.0)

In some scenarios, after the loss of all valid copies, a stale replica shard can be automatically assigned as a primary, preferring old data
to no data at all ({GIT}14671[#14671]). This can lead to a loss of acknowledged writes if the valid copies are not lost but are rather
Expand All @@ -143,7 +146,7 @@ for one of the good shard copies to reappear. In case where all good copies are
stale shard copy.

[float]
=== Make index creation resilient to index closing and full cluster crashes (STATUS: ONGOING, v5.0.0)
=== Make index creation resilient to index closing and full cluster crashes (STATUS: UNRELEASED, v5.0.0)

Recovering an index requires a quorum (with an exception for 2) of shard copies to be available to allocate a primary. This means that
a primary cannot be assigned if the cluster dies before enough shards have been allocated ({GIT}9126[#9126]). The same happens if an index
Expand All @@ -153,7 +156,6 @@ recover an index in the presence of a single shard copy. Allocation IDs can also
but none of the shards have been started. If such an index was inadvertently closed before at least one shard could be started, a fresh
shard will be allocated upon reopening the index.

== Unreleased

[float]
=== Use two phase commit for Cluster State publishing (STATUS: UNRELEASED, v5.0.0)
Expand Down

0 comments on commit 8eee28e

Please sign in to comment.