
[CI] CCR: FollowerFailOverIT.testFailOverOnFollower failure #35403

Closed · tvernum opened this issue Nov 9, 2018 · 6 comments

Labels: :Distributed Indexing/CCR (Issues around the Cross Cluster State Replication features), >test-failure (Triaged test failures from CI)

Comments

tvernum (Contributor) commented Nov 9, 2018

Doesn't reproduce for me.
Could just be a timeout issue, but I don't know the CCR code well enough to make that call.

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+g1gc/70/consoleFull

./gradlew :x-pack:plugin:ccr:internalClusterTest \
  -Dtests.seed=71D42FEBD6289D1E \
  -Dtests.class=org.elasticsearch.xpack.ccr.FollowerFailOverIT \
  -Dtests.method="testFailOverOnFollower" \
  -Dtests.security.manager=true \
  -Dtests.jvm.argline="-XX:-UseConcMarkSweepGC -XX:+UseG1GC" \
  -Dtests.locale=es-HN \
  -Dtests.timezone=America/Thule \
  -Dcompiler.java=11 \
  -Druntime.java=8
03:14:10   1> -----node_id[EzmV_EamS9CrPbyDEdvQNw][V]
03:14:10   1> --------[follower-index][0], node[EzmV_EamS9CrPbyDEdvQNw], [P], s[STARTED], a[id=Iqbazc8iQ9-6RR12ZZhtkg]
03:14:10   1> -----node_id[vtkxFTkwTReEy-_pIHq1aQ][V]
03:14:10   1> ---- unassigned
03:14:10   1> tasks: (0):
03:14:10   1> [2018-11-08T23:13:49,827][INFO ][o.e.x.c.a.ShardFollowNodeTask] [followerd4] [follower-index][0] shard follow task has been stopped
03:14:10   1> [2018-11-08T23:13:59,856][INFO ][o.e.c.m.MetaDataDeleteIndexService] [leaderm2] [leader-index/LOXtlSQRRIecv-2pJuHHlg] deleting index
03:14:10   1> [2018-11-08T23:13:59,892][INFO ][o.e.c.m.MetaDataDeleteIndexService] [followerm0] [follower-index/vTH2BwSvRTSC4CrI_oFbMA] deleting index
03:14:10   1> [2018-11-08T23:13:59,928][INFO ][o.e.x.c.FollowerFailOverIT] [testFailOverOnFollower] after test
03:14:10 FAILURE 42.3s J2 | FollowerFailOverIT.testFailOverOnFollower <<< FAILURES!
03:14:10    > Throwable #1: java.lang.AssertionError: timed out waiting for green state
03:14:10    > 	at org.elasticsearch.xpack.CcrIntegTestCase.ensureColor(CcrIntegTestCase.java:278)
03:14:10    > 	at org.elasticsearch.xpack.CcrIntegTestCase.ensureFollowerGreen(CcrIntegTestCase.java:253)
03:14:10    > 	at org.elasticsearch.xpack.ccr.FollowerFailOverIT.testFailOverOnFollower(FollowerFailOverIT.java:99)
03:14:10    > 	at java.lang.Thread.run(Thread.java:748)Throwable #2: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 	at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:848)
03:14:10    > 	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:822)
03:14:10    > 	at org.elasticsearch.test.InternalTestCluster.assertSeqNos(InternalTestCluster.java:1278)
03:14:10    > 	at org.elasticsearch.xpack.CcrIntegTestCase.afterTest(CcrIntegTestCase.java:155)
03:14:10    > 	at java.lang.Thread.run(Thread.java:748)
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10    > 	Suppressed: java.lang.AssertionError: [follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]] seq_no_stats mismatch
03:14:10    > Expected: <SeqNoStats{maxSeqNo=257, localCheckpoint=140, globalCheckpoint=140}>
03:14:10    >      but: was <SeqNoStats{maxSeqNo=257, localCheckpoint=-1, globalCheckpoint=140}>
03:14:10    > 		at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
03:14:10    > 		at org.elasticsearch.test.InternalTestCluster.lambda$assertSeqNos$7(InternalTestCluster.java:1311)
03:14:10    > 		at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
03:14:10    > 		... 38 more
03:14:10   1> [2018-11-08T23:14:00,051][INFO ][o.e.x.c.FollowerFailOverIT] [testAddNewReplicasOnFollower] before test
03:14:10   1> [2018-11-08T23:14:00,051][INFO ][o.e.n.Node               ] [testAddNewReplicasOnFollower] stopping ...
tvernum added the >test-failure (Triaged test failures from CI) and :Distributed Indexing/CCR (Issues around the Cross Cluster State Replication features) labels on Nov 9, 2018
elasticmachine (Collaborator) commented:

Pinging @elastic/es-distributed

martijnvg (Member) commented:

This test verifies that CCR keeps following the leader index while a node holding the primary shard of the follower index is restarted. After the restart we expect the follower index to get back into a green state, but that doesn't happen. The replicas (which used to be primaries) are unable to recover from the primary shards:

1> cluster uuid: o2L8nhvHQL-wy93CmcS5Ag
  1> version: 16
  1> state uuid: j9PkPvfbQc68EBzg6VChnA
  1> from_diff: false
  1> meta data version: 10
  1>    [follower-index/vTH2BwSvRTSC4CrI_oFbMA]: v[6], mv[1], sv[1]
  1>       0: p_term [2], isa_ids [Iqbazc8iQ9-6RR12ZZhtkg]
  1> metadata customs:
  1>    persistent_tasks: {"last_allocation_id":1,"tasks":[{"id":"vTH2BwSvRTSC4CrI_oFbMA-0","task":{"xpack/ccr/shard_follow_task":{"params":{"remote_cluster":"leader_cluster","follow_shard_index":"follower-index","follow_shard_index_uuid":"vTH2BwSvRTSC4CrI_oFbMA","follow_shard_shard":0,"leader_shard_index":"leader-index","leader_shard_index_uuid":"LOXtlSQRRIecv-2pJuHHlg","leader_shard_shard":0,"max_read_request_operation_count":1455,"max_read_request_size":"1461kb","max_outstanding_read_requests":4,"max_write_request_operation_count":1908,"max_write_request_size":"1259kb","max_outstanding_write_requests":1,"max_write_buffer_count":2147483647,"max_write_buffer_size":"512mb","max_retry_delay":"10ms","read_poll_timeout":"10ms","headers":{}}}},"allocation_id":1,"assignment":{"executor_node":"vtkxFTkwTReEy-_pIHq1aQ","explanation":""}}]}
  1>    index-graveyard: IndexGraveyard[[]]
  1> nodes: 
  1>    {followerm0}{ceKigf_eS6q1bSD9PjeF9Q}{KLsjQTBSTwmmfO68Xs3SSw}{127.0.0.1}{127.0.0.1:44381}{xpack.installed=true}, master
  1>    {followerd6}{EzmV_EamS9CrPbyDEdvQNw}{VbOh7rGgSWCn2_6QJL0SzQ}{127.0.0.1}{127.0.0.1:34535}{xpack.installed=true}
  1>    {followerm5}{V9A0dTNNQpaja-ayVER8Wg}{VyMKmg2yR8azsg4_aNaLFA}{127.0.0.1}{127.0.0.1:46385}{xpack.installed=true}
  1>    {followerm1}{nD4QckFISemqngHgWPU4NA}{C_BFOjUxTC6USqa0pXO9dA}{127.0.0.1}{127.0.0.1:34775}{xpack.installed=true}
  1>    {followerd3}{PlW87Mz-R7qy3LrIOT4BMg}{HHktN4lsTiS58tQq0oUk-w}{127.0.0.1}{127.0.0.1:37717}{xpack.installed=true}
  1>    {followerm2}{f1SK4fs1RO6x1TrcGbFF9Q}{io86LtElTC-_2bJ-K2mfvw}{127.0.0.1}{127.0.0.1:35559}{xpack.installed=true}
  1>    {followerd4}{vtkxFTkwTReEy-_pIHq1aQ}{48GVEgaPTji2rBhbHbJQIA}{127.0.0.1}{127.0.0.1:40783}{xpack.installed=true}
  1> routing_table (version 7):
  1> -- index [[follower-index/vTH2BwSvRTSC4CrI_oFbMA]]
  1> ----shard_id [follower-index][0]
  1> --------[follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]]
  1> --------[follower-index][0], node[EzmV_EamS9CrPbyDEdvQNw], [P], s[STARTED], a[id=Iqbazc8iQ9-6RR12ZZhtkg]
  1> routing_nodes:
  1> -----node_id[PlW87Mz-R7qy3LrIOT4BMg][V]
  1> --------[follower-index][0], node[PlW87Mz-R7qy3LrIOT4BMg], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=ksgilEsJToy6WtU0qNPM0Q], unassigned_info[[reason=NODE_LEFT], at[2018-11-09T03:13:19.536Z], delayed=true, details[node_left[PlW87Mz-R7qy3LrIOT4BMg]], allocation_status[no_attempt]]
  1> -----node_id[EzmV_EamS9CrPbyDEdvQNw][V]
  1> --------[follower-index][0], node[EzmV_EamS9CrPbyDEdvQNw], [P], s[STARTED], a[id=Iqbazc8iQ9-6RR12ZZhtkg]
  1> -----node_id[vtkxFTkwTReEy-_pIHq1aQ][V]
  1> ---- unassigned
  1> tasks: (0):

I'm not sure why these replica shards are unable to recover.
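For reference, the scenario the test exercises is: follow the leader index, restart the follower data node that holds the primary of follower-index, and wait for the follower index to go green again. A rough sketch of that flow is below; it is illustrative only, not the actual test source, and getFollowerCluster()/restartRandomDataNode() are assumed helper names from the Elasticsearch test framework (only ensureFollowerGreen is confirmed by the stack trace above).

// Illustrative sketch of the failing scenario, not the real FollowerFailOverIT code.
// getFollowerCluster() and restartRandomDataNode() are assumed helper names.
public void testFailOverOnFollower() throws Exception {
    // 1. A leader index exists on the leader cluster, and a shard-follow task
    //    replicates its operations into follower-index (setup omitted).

    // 2. Restart a data node in the follower cluster. If it held the primary
    //    of follower-index, the surviving copy is promoted to primary.
    getFollowerCluster().restartRandomDataNode();

    // 3. The restarted node's shard copy rejoins as a replica and has to
    //    recover from the new primary. The failure above happens here: the
    //    follower cluster never reaches green within the timeout.
    ensureFollowerGreen("follower-index");
}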

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Nov 30, 2018
…ests

Some tests kill nodes and otherwise it would take 60s by default
for replicas to get allocated and that is longer than we wait
for getting in a green state in tests.

Relates to elastic#35403
martijnvg added a commit that referenced this issue Nov 30, 2018
…ests

Some tests kill nodes and otherwise it would take 60s by default
for replicas to get allocated and that is longer than we wait
for getting in a green state in tests.

Relates to #35403
martijnvg (Member) commented:

I pushed a fix for this test failure. It turns out this was a test issue: due to delayed allocation, the replica shards were never allocated in time. I will close this issue once I have confirmed that these tests have stopped failing.
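Concretely, the default index.unassigned.node_left.delayed_timeout is 60 seconds, so after a test kills or restarts a node the replica is not re-allocated until well after the green-state wait has given up. A minimal sketch of the idea, assuming the delay is simply disabled in the test index settings (the actual fix may be wired in differently):

import org.elasticsearch.cluster.routing.UnassignedInfo;
import org.elasticsearch.common.settings.Settings;

// Settings for a test index whose node gets killed or restarted: disable
// delayed allocation so the replica is re-assigned immediately instead of
// after the default 60s node_left delay.
Settings indexSettings = Settings.builder()
        .put("index.number_of_shards", 1)
        .put("index.number_of_replicas", 1)
        .put(UnassignedInfo.INDEX_DELAYED_NODE_LEFT_TIMEOUT_SETTING.getKey(), "0ms")
        .build();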

martijnvg (Member) commented:

Closing: this test hasn't failed since the above change was pushed.
