-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Etcd downgrade] Implement downgrade validate, enable and cancel #11801
Conversation
Codecov Report
@@ Coverage Diff @@
## master #11801 +/- ##
==========================================
- Coverage 66.23% 65.82% -0.41%
==========================================
Files 403 403
Lines 36950 37018 +68
==========================================
- Hits 24473 24368 -105
- Misses 10991 11160 +169
- Partials 1486 1490 +4
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Yuchen! I added few comments.
etcdserver/errors.go
Outdated
ErrInvalidDowngradeTargetVersion = errors.New("etcdserver: invalid target version") | ||
ErrIsDowngrading = errors.New("etcdserver: cluster has an ongoing downgrade job") | ||
ErrIsNotDowngrading = errors.New("etcdserver: cluster is not downgrading") | ||
ErrUnknownDowngradeAction = errors.New("etcdserver: unknown downgrade action") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this basically ErrUnknownMethod?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be. I changed to use ErrUnknownMethod.
etcdserver/util.go
Outdated
@@ -171,3 +172,17 @@ func warnOfExpensiveGenericRequest(lg *zap.Logger, now time.Time, reqStringer fm | |||
func isNil(msg proto.Message) bool { | |||
return msg == nil || reflect.ValueOf(msg).IsNil() | |||
} | |||
|
|||
func changeToClusterVersion(v string) (*semver.Version, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
convertToClusterVersion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
etcdserver/v3_server.go
Outdated
return nil, err | ||
} | ||
|
||
// do linearized read to avoid using stale downgrade information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit (on code comment): this is not actually doing linearizable read. It gets leaders commit index and wait for local store to finish applying that index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is clearer. Changed.
etcdserver/v3_server.go
Outdated
} | ||
resp.Version = cv.String() | ||
|
||
allowedTargetVersion := semver.Version{Major: cv.Major, Minor: cv.Minor - 1} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future if cluster version is 4.0, does this work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the cluster version is 4.0, I think the downgrade target should be 3.x which is the latest v3 minor version according to downgrade policy. Since we cannot predict what the 3.x could be, I don't handle this case. I encapsulate this line into a function called allowedDowngradeVersion
and add a todo there. It deserve an update before we bump to v4.0. WDYT?
// validate downgrade capability before starting downgrade | ||
v := r.Version | ||
if resp, err := s.downgradeValidate(ctx, v); err != nil { | ||
return resp, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we log when downgrade request is denied?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added warning level log.
cc31878
to
83f404f
Compare
@YoyinZyc Sorry for delay. Will take a look. |
etcdserver/errors.go
Outdated
ErrKeyNotFound = errors.New("etcdserver: key not found") | ||
ErrCorrupt = errors.New("etcdserver: corrupt cluster") | ||
ErrBadLeaderTransferee = errors.New("etcdserver: bad leader transferee") | ||
ErrClusterVersionUnavailable = errors.New("etcdserver: cluster version is unavailable") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we update all new error message with "downgrade"; "etcdserver: cluster version is unavailable"
is not clear, which cluster version it refers to. It can be something like "etcdserver: target downgrade cluster version is unavailable"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or "etcdserver: cluster version not found during downgrade
etcdserver/errors.go
Outdated
ErrWrongVersionFormat = errors.New("etcdserver: wrong version format") | ||
ErrInvalidDowngradeTargetVersion = errors.New("etcdserver: invalid target version") | ||
ErrIsDowngrading = errors.New("etcdserver: cluster has an ongoing downgrade job") | ||
ErrIsNotDowngrading = errors.New("etcdserver: cluster is not downgrading") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we name this ErrNoInflightDowngrade
?
etcdserver/errors.go
Outdated
ErrClusterVersionUnavailable = errors.New("etcdserver: cluster version is unavailable") | ||
ErrWrongVersionFormat = errors.New("etcdserver: wrong version format") | ||
ErrInvalidDowngradeTargetVersion = errors.New("etcdserver: invalid target version") | ||
ErrIsDowngrading = errors.New("etcdserver: cluster has an ongoing downgrade job") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we name this ErrDowngradeInProgress
?
etcdserver/v3_server.go
Outdated
downgradeInfo := s.cluster.DowngradeInfo() | ||
if downgradeInfo.Enabled { | ||
// Todo: return the downgrade status along with the error msg | ||
return resp, ErrIsDowngrading |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, ...
?
etcdserver/v3_server.go
Outdated
allowedTargetVersion := allowedDowngradeVersion(cv) | ||
if !targetVersion.Equal(*allowedTargetVersion) { | ||
err = ErrInvalidDowngradeTargetVersion | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just return nil, ErrInvalidDowngradeTargetVersion
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks great.
I've requested some minor changes in error messages.
Thanks!
83f404f
to
d230e6b
Compare
Fixed all your comments. PHAL again. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thx!
3rd step of #11716
This PR added the implementation of downgrade validate, enable and cancel in v3_server.go and apply.go.
Server will receive downgrade requests in
v3_server.go
and then send out internal raft requestDowngradeInfoSetRequest
. Finally, the downgrade info update will be applied.The required tests are all passed.