-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: support raft learner in etcd - part 2 #10727
Conversation
Added learner field to endpoint status API.
Hardcoded allowed rpc for learner node. Added filtering in grpc interceptor to check if rpc is allowed for learner node.
Adding TestKVForLearner. Also adding test utility functions for clientv3 integration tests.
1. Maintenance API MoveLeader() returns ErrBadLeaderTransferee if transferee does not exist or is raft learner. 2. etcdserver TransferLeadership() only choose voting member as transferee.
Adding integration test TestMoveLeaderToLearnerError, which ensures that leader transfer to learner member will fail.
Adding integration test TestTransferLeadershipWithLearner, which ensures that TransferLeadership does not timeout due to learner is automatically picked by leader as transferee.
Fixes 'gosimple' test.
Fixes go 'unconvert' test.
Fixes TestMemberAddForLearner and TestMemberPromoteForLearner.
Adding delay in the test for the newly started learner member to catch up applying config change entries in raft log.
Use: "promote <memberID>", | ||
Short: "Promotes a non-voting member in the cluster", | ||
Long: `Promotes a non-voting learner member to a voting one in the cluster. | ||
`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why a new line here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not write this part of the code. But it seems consistent with the existing NewMemberListCommand
.
etcdserver/api/membership/errors.go
Outdated
ErrIDNotFound = errors.New("membership: ID not found") | ||
ErrPeerURLexists = errors.New("membership: peerURL exists") | ||
ErrMemberNotLearner = errors.New("membership: can only promote a learner member") | ||
ErrLearnerNotReady = errors.New("membership: can only promote a learner member which catches up with leader") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which is in sync with the leader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good.
@@ -40,6 +40,8 @@ var ( | |||
ErrGRPCMemberNotEnoughStarted = status.New(codes.FailedPrecondition, "etcdserver: re-configuration failed due to not enough started members").Err() | |||
ErrGRPCMemberBadURLs = status.New(codes.InvalidArgument, "etcdserver: given member URLs are invalid").Err() | |||
ErrGRPCMemberNotFound = status.New(codes.NotFound, "etcdserver: member not found").Err() | |||
ErrGRPCMemberNotLearner = status.New(codes.FailedPrecondition, "etcdserver: can only promote a learner member").Err() | |||
ErrGRPCLearnerNotReady = status.New(codes.FailedPrecondition, "etcdserver: can only promote a learner member which catches up with leader").Err() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is in sync with leader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we reuse the previously defined error message here? maybe define the error message as a const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we reuse the previously defined error message here? maybe define the error message as a const
This is just following the exiting pattern. If we want to change it, it should be a separate PR for all existing errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is in sync with leader
sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for this one, a const is better since we repeat the same message twice. but it might also apply to other error messages. i am not sure.
etcdserver/api/v3rpc/util.go
Outdated
@@ -116,3 +120,15 @@ func isClientCtxErr(ctxErr error, err error) bool { | |||
} | |||
return false | |||
} | |||
|
|||
// in v3.4, learner is allowed to serve serializable read and endpoint status | |||
func isRPCEnabledForLearner(req interface{}) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isRPCSupportedForLeaner ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good.
// ConfigChangeContext represents a context for confChange. | ||
type ConfigChangeContext struct { | ||
Member | ||
IsPromote bool `json:"isPromote"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment on this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
if urls[u] { | ||
return ErrPeerURLexists | ||
// A ConfChangeAddNode to a existing learner node promotes it to a voting member. | ||
if confChangeContext.IsPromote { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move the validation to the top of this func?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should do the validation first, then do the context encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to encode confChangeContext first, which will tells us if the config change is for promoting a existing member, or for adding a new member. The validation logic is different for these two scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. sgtm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but then, can you move the decoding thing even before the urls construction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, urls construction should be moved to later part of the logic. It is only needed for adding a new member case, not needed for promoting.
etcdserver/api/membership/cluster.go
Outdated
@@ -693,3 +735,44 @@ func mustDetectDowngrade(lg *zap.Logger, cv *semver.Version) { | |||
} | |||
} | |||
} | |||
|
|||
// IsLearner returns if the local member is raft learner | |||
func (c *RaftCluster) IsLearner() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsLocalMemberLearner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good.
etcdserver/server.go
Outdated
@@ -1435,20 +1440,20 @@ func (s *EtcdServer) TransferLeadership() error { | |||
return nil | |||
} | |||
|
|||
if !s.isMultiNode() { | |||
if s.cluster == nil || len(s.cluster.VotingMemberIDs()) <= 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably still encapsulate this logic into a func.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
etcdserver/server.go
Outdated
// TODO add more checks whether the member can be promoted. | ||
// like learner progress check or if cluster is ready to promote a learner | ||
// this is an example to get progress | ||
fmt.Printf("raftStatus, %#v\n", raftStatus()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should not simply use fmt.printf as logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is unfinished. This line should be removed for now. The actual implementation is in later commit (not included in part 2 PR): dc79587
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but checking this before the apply phase is ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but checking this before the apply phase is ok.
Not sure I understand. The member promote request should be rejected (if the reason is learner not ready) before it is appended to raft log. I think it is too late if we check learner progress in apply stage. In other words, applying a particular conf change raft entry should not depend on the progress of learner node in cluster. Some node may choose to apply this entry, whereas some node may not - depending on when they started applying this entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that is what i mean. checking this at the apply phase will not work since the cluster might not have a consistent view on progress unless the leader keep on propagating this information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is unfinished. This line should be removed for now. The actual implementation is in later commit (not included in part 2 PR):
can you just make this a noop or a TODO panic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it is noop - this checking function always return nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will print something out, can we remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this print should be removed.
i did not give a look at all the tests changes. but the core path is good in general. |
defer cli.Close() | ||
|
||
// waiting for learner member to catch up applying the config change entries in raft log. | ||
time.Sleep(3 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not good to make this kind of timing assumption in general. we tried to get rid of all time.sleep in tests before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is replaced in later commits with server's ready notify channel. 90956f7 (not included in this part 2 PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a todo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
tests look good too. |
I'll create an additional commit to address all the comments. |
@xiang90 Pushed an additional commit to address the comments. PTAL. |
lgtm |
Part 2 of #10645. Continuation of #10725.
This PR incldues the following 16 commits from #10645. The last 6 commits are just formatting and bug fixes for the earlier 10 commits.
This PR is rebased to part 1 (#10725). Since we modified some function signatures in the end of part 1, I have to edit some of the commits in this PR so that they still make sense. - A little bit history rewrite of the learner feature branch. During this process, the following 2 commits (out of the total 16) are no longer needed, so they are dropped.
cc @xiang90