Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: let learners vote #10998

Merged
merged 1 commit into from
Aug 8, 2019
Merged

raft: let learners vote #10998

merged 1 commit into from
Aug 8, 2019

Conversation

tbg
Copy link
Contributor

@tbg tbg commented Aug 7, 2019

It turns out that that learners must be allowed to cast votes.

This seems counter- intuitive but is necessary in the situation in which
a learner has been promoted (i.e. is now a voter) but has not learned
about this yet.

For example, consider a group in which id=1 is a learner and id=2 and
id=3 are voters. A configuration change promoting 1 can be committed on
the quorum {2,3} without the config change being appended to the
learner's log. If the leader (say 2) fails, there are de facto two
voters remaining. Only 3 can win an election (due to its log containing
all committed entries), but to do so it will need 1 to vote. But 1
considers itself a learner and will continue to do so until 3 has
stepped up as leader, replicates the conf change to 1, and 1 applies it.

Ultimately, by receiving a request to vote, the learner realizes that
the candidate believes it to be a voter, and that it should act
accordingly. The candidate's config may be stale, too; but in that case
it won't win the election, at least in the absence of the bug discussed
in:
#7625 (comment).

PS I have a work-in-progress PR that makes these sorts of interactions
much more testable and I'm planning to add additional test coverage
when it lands.

PS2: note that a similar behavior already holds for considering entries committed.
This decision is made by the leader based on its view of the configuration. If a node
that considers itself a learner acks a log entry, and it is in fact a voter in the leader's
view, then that ack will count for the commit quorum for that index (and always has).
This is necessary, and so is this change.

It turns out that that learners must be allowed to cast votes.

This seems counter- intuitive but is necessary in the situation in which
a learner has been promoted (i.e. is now a voter) but has not learned
about this yet.

For example, consider a group in which id=1 is a learner and id=2 and
id=3 are voters. A configuration change promoting 1 can be committed on
the quorum `{2,3}` without the config change being appended to the
learner's log. If the leader (say 2) fails, there are de facto two
voters remaining. Only 3 can win an election (due to its log containing
all committed entries), but to do so it will need 1 to vote. But 1
considers itself a learner and will continue to do so until 3 has
stepped up as leader, replicates the conf change to 1, and 1 applies it.

Ultimately, by receiving a request to vote, the learner realizes that
the candidate believes it to be a voter, and that it should act
accordingly. The candidate's config may be stale, too; but in that case
it won't win the election, at least in the absence of the bug discussed
in:
etcd-io#7625 (comment).
@tbg tbg requested a review from bdarnell August 7, 2019 10:04
@tbg
Copy link
Contributor Author

tbg commented Aug 7, 2019

cc @darinpp and @danhhz

@codecov-io
Copy link

Codecov Report

Merging #10998 into master will increase coverage by 0.22%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #10998      +/-   ##
==========================================
+ Coverage   63.75%   63.97%   +0.22%     
==========================================
  Files         401      401              
  Lines       37551    37554       +3     
==========================================
+ Hits        23939    24026      +87     
+ Misses      12003    11890     -113     
- Partials     1609     1638      +29
Impacted Files Coverage Δ
raft/raft.go 90.69% <ø> (-0.05%) ⬇️
client/keys.go 59.79% <0%> (-31.66%) ⬇️
auth/simple_token.go 63.41% <0%> (-23.58%) ⬇️
clientv3/naming/grpc.go 63.15% <0%> (-12.29%) ⬇️
proxy/grpcproxy/register.go 69.44% <0%> (-11.12%) ⬇️
clientv3/namespace/watch.go 87.87% <0%> (-6.07%) ⬇️
pkg/types/set.go 94.25% <0%> (-5.75%) ⬇️
raft/tracker/inflights.go 91.83% <0%> (-4.09%) ⬇️
clientv3/leasing/util.go 95% <0%> (-3.34%) ⬇️
clientv3/leasing/txn.go 88.09% <0%> (-3.18%) ⬇️
... and 22 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d85aa1...c30c2e3. Read the comment docs.

@tbg
Copy link
Contributor Author

tbg commented Aug 8, 2019

Thanks @nvanbenschoten.

@tbg tbg merged commit a41bd30 into etcd-io:master Aug 8, 2019
@tbg tbg deleted the learners-vote branch August 8, 2019 09:26
@pjhdld
Copy link

pjhdld commented Jan 19, 2020

It turns out that that learners must be allowed to cast votes.

This seems counter- intuitive but is necessary in the situation in which
a learner has been promoted (i.e. is now a voter) but has not learned
about this yet.

For example, consider a group in which id=1 is a learner and id=2 and
id=3 are voters. A configuration change promoting 1 can be committed on
the quorum {2,3} without the config change being appended to the
learner's log. If the leader (say 2) fails, there are de facto two
voters remaining. Only 3 can win an election (due to its log containing
all committed entries), but to do so it will need 1 to vote. But 1
considers itself a learner and will continue to do so until 3 has
stepped up as leader, replicates the conf change to 1, and 1 applies it.

Ultimately, by receiving a request to vote, the learner realizes that
the candidate believes it to be a voter, and that it should act
accordingly. The candidate's config may be stale, too; but in that case
it won't win the election, at least in the absence of the bug discussed
in:
#7625 (comment).

PS I have a work-in-progress PR that makes these sorts of interactions
much more testable and I'm planning to add additional test coverage
when it lands.

PS2: note that a similar behavior already holds for considering entries committed.
This decision is made by the leader based on its view of the configuration. If a node
that considers itself a learner acks a log entry, and it is in fact a voter in the leader's
view, then that ack will count for the commit quorum for that index (and always has).
This is necessary, and so is this change.

@tbg hello,I'm a little confused about learners being able to vote without elevated privileges and hope to help me out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants