kvserver: designate follower to campaign on Raft leader removal #104871

Closed
erikgrinaker opened this issue Jun 14, 2023 · 1 comment · Fixed by #104969
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

erikgrinaker commented Jun 14, 2023

Currently, when a leader removes itself from the range, we let the first voter in the range descriptor campaign immediately to elect a new leader, without waiting out the election timeout. However, this is problematic for two reasons: the first voter may be lagging or unavailable (in which case we have to wait out the election timeout anyway), and once we enable PreVote+CheckQuorum (#92088) the PreVote campaign will likely fail, since the other followers have heard from the leader recently.

We can address both of these issues by choosing an up-to-date designated follower when we propose the conf change, including it in ConfChange.Context, and having that follower do a TimeoutNow campaign (bypassing prevote) once it applies the conf change.

// If the leader is no longer in the descriptor but we are the first voter,
// campaign.
_, leaderStillThere := desc.GetReplicaDescriptorByID(roachpb.ReplicaID(st.Lead))
if !leaderStillThere && storeID == desc.Replicas().VoterDescriptors()[0].StoreID {
	log.VEventf(ctx, 3, "leader got removed by conf change")
	return true
}
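
For illustration, the proposed approach could be sketched roughly as follows. This is a minimal, self-contained sketch and not the actual kvserver or raft code: `ReplicaID`, `followerProgress`, `pickDesignatedFollower`, and `shouldCampaignOnApply` are hypothetical names, and the assumption that the leader picks the recently active voter with the highest match index is one plausible reading of "up-to-date", not something the issue specifies.

```go
package main

import "fmt"

// ReplicaID is a hypothetical stand-in for roachpb.ReplicaID.
type ReplicaID uint64

// followerProgress is a simplified, hypothetical view of what a leader
// tracks for each voter.
type followerProgress struct {
	ID           ReplicaID
	Match        uint64 // highest log index known to be replicated
	RecentActive bool   // heard from recently
}

// pickDesignatedFollower returns the recently active voter, other than the
// leader being removed, with the highest Match index. On ties, the first
// such voter in the slice wins. ok is false if no candidate exists.
func pickDesignatedFollower(leader ReplicaID, voters []followerProgress) (id ReplicaID, ok bool) {
	var best followerProgress
	for _, p := range voters {
		if p.ID == leader || !p.RecentActive {
			continue
		}
		if !ok || p.Match > best.Match {
			best, ok = p, true
		}
	}
	return best.ID, ok
}

// shouldCampaignOnApply reports whether this replica should campaign
// immediately after applying the conf change: the leader must be gone and
// this replica must be the one designated in the conf change context.
func shouldCampaignOnApply(self, designated ReplicaID, leaderStillThere bool) bool {
	return !leaderStillThere && self == designated
}

func main() {
	voters := []followerProgress{
		{ID: 1, Match: 100, RecentActive: true},  // the leader being removed
		{ID: 2, Match: 90, RecentActive: true},   // lagging
		{ID: 3, Match: 100, RecentActive: false}, // caught up but unavailable
		{ID: 4, Match: 95, RecentActive: true},
	}
	designated, ok := pickDesignatedFollower(1, voters)
	fmt.Println(designated, ok)
	fmt.Println(shouldCampaignOnApply(4, designated, false))
}
```

Unlike the first-voter rule, the choice here is made by the leader at proposal time, when it knows which followers are caught up and reachable, so followers need no extra information beyond the ID carried with the conf change.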

Jira issue: CRDB-28762

Epic CRDB-25199

@erikgrinaker erikgrinaker added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv-replication labels Jun 14, 2023
@erikgrinaker erikgrinaker self-assigned this Jun 14, 2023
blathers-crl bot commented Jun 14, 2023

cc @cockroachdb/replication
