Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CoordinatorTests.testIncompatibleDiffResendsFullState #39345

Merged
merged 2 commits into from
Mar 11, 2019

Conversation

ywelsch
Copy link
Contributor

@ywelsch ywelsch commented Feb 25, 2019

This test started failing since decreasing the leader and follower check timeouts (#38298). The reason is that the test was relying on the default publication timeout to come into effect before leader / follower check timeouts, which is now not always true anymore.

Closes #38867

@ywelsch ywelsch added >test Issues or PRs that are addressing/adding tests v7.0.0 :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v8.0.0 v7.2.0 labels Feb 25, 2019
@ywelsch ywelsch requested a review from DaveCTurner February 25, 2019 07:41
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

logger.info("--> healing {}", follower);
follower.heal();
logger.info("--> submitting second value to {}", leader);
leader.submitValue(randomLong());
cluster.stabilise(DEFAULT_CLUSTER_STATE_UPDATE_DELAY);
cluster.stabilise();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have an explicit duration here - we do not expect there to be an election so it should not be necessary to wait for a full stabilisation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like to suggest such a minimal bound? The point of this test is to check the cluster state diffing logic, not determine any kind of upper bound on stabilization after a follower has been unresponsive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't had time to do so, and would rather merge this PR than block on that.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

logger.info("--> healing {}", follower);
follower.heal();
logger.info("--> submitting second value to {}", leader);
leader.submitValue(randomLong());
cluster.stabilise(DEFAULT_CLUSTER_STATE_UPDATE_DELAY);
cluster.stabilise();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't had time to do so, and would rather merge this PR than block on that.

@ywelsch ywelsch merged commit 33e2cff into elastic:master Mar 11, 2019
ywelsch added a commit that referenced this pull request Mar 11, 2019
This test started failing since decreasing the leader and follower check timeouts (#38298). The
reason is that the test was relying on the default publication timeout to come into effect before
leader / follower check timeouts, which is now not always true anymore.

Closes #38867
ywelsch added a commit that referenced this pull request Mar 11, 2019
This test started failing since decreasing the leader and follower check timeouts (#38298). The
reason is that the test was relying on the default publication timeout to come into effect before
leader / follower check timeouts, which is now not always true anymore.

Closes #38867
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >test Issues or PRs that are addressing/adding tests v7.0.0-rc1 v7.2.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CoordinatorTests.testIncompatibleDiffResendsFullState failures
5 participants