Currently, when a leader removes itself from the range, we let the first voter in the range descriptor campaign to elect a new leader without waiting out the election timeout. However, this is problematic for a couple of reasons: the first replica may lag or be unavailable (in which case we have to wait out the election timeout), and when we enable PreVote+CheckQuorum (#92088) the current PreVote campaign will likely fail since the other followers have heard from the leader recently.
We can address both of these issues by choosing an up-to-date designated follower when we propose the conf change, including it in ConfChange.Context, and having that follower run a TimeoutNow campaign (bypassing PreVote) once it applies the conf change.
cockroach/pkg/kv/kvserver/replica_raft.go
Lines 2596 to 2602 in 87d6547
Jira issue: CRDB-28762
Epic CRDB-25199