Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Etcd downgrade] Implement downgrade validate, enable and cancel #11801

Merged
merged 2 commits into from
May 18, 2020

Conversation

YoyinZyc
Copy link
Contributor

3rd step of #11716
This PR added the implementation of downgrade validate, enable and cancel in v3_server.go and apply.go.
Server will receive downgrade requests in v3_server.go and then send out internal raft request DowngradeInfoSetRequest. Finally, the downgrade info update will be applied.

The required tests are all passed.
image

@YoyinZyc YoyinZyc mentioned this pull request Apr 21, 2020
13 tasks
@YoyinZyc
Copy link
Contributor Author

@gyuho @jingyih PHAL. Thanks!

@codecov-io
Copy link

Codecov Report

Merging #11801 into master will decrease coverage by 0.40%.
The diff coverage is 11.59%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #11801      +/-   ##
==========================================
- Coverage   66.23%   65.82%   -0.41%     
==========================================
  Files         403      403              
  Lines       36950    37018      +68     
==========================================
- Hits        24473    24368     -105     
- Misses      10991    11160     +169     
- Partials     1486     1490       +4     
Impacted Files Coverage Δ
etcdserver/api/v3rpc/rpctypes/error.go 90.47% <ø> (ø)
etcdserver/api/v3rpc/util.go 67.74% <ø> (ø)
etcdserver/apply.go 79.18% <0.00%> (-0.85%) ⬇️
etcdserver/errors.go 0.00% <ø> (ø)
etcdserver/v3_server.go 67.49% <0.00%> (-7.19%) ⬇️
etcdserver/util.go 98.91% <100.00%> (+0.10%) ⬆️
client/client.go 42.81% <0.00%> (-35.63%) ⬇️
pkg/testutil/recorder.go 77.77% <0.00%> (-3.71%) ⬇️
clientv3/retry_interceptor.go 68.62% <0.00%> (-0.99%) ⬇️
auth/store.go 52.94% <0.00%> (-0.71%) ⬇️
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8d52be5...cc31878. Read the comment docs.

Copy link
Contributor

@jingyih jingyih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Yuchen! I added few comments.

ErrInvalidDowngradeTargetVersion = errors.New("etcdserver: invalid target version")
ErrIsDowngrading = errors.New("etcdserver: cluster has an ongoing downgrade job")
ErrIsNotDowngrading = errors.New("etcdserver: cluster is not downgrading")
ErrUnknownDowngradeAction = errors.New("etcdserver: unknown downgrade action")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this basically ErrUnknownMethod?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be. I changed to use ErrUnknownMethod.

@@ -171,3 +172,17 @@ func warnOfExpensiveGenericRequest(lg *zap.Logger, now time.Time, reqStringer fm
func isNil(msg proto.Message) bool {
return msg == nil || reflect.ValueOf(msg).IsNil()
}

func changeToClusterVersion(v string) (*semver.Version, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convertToClusterVersion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return nil, err
}

// do linearized read to avoid using stale downgrade information
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (on code comment): this is not actually doing linearizable read. It gets leaders commit index and wait for local store to finish applying that index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is clearer. Changed.

}
resp.Version = cv.String()

allowedTargetVersion := semver.Version{Major: cv.Major, Minor: cv.Minor - 1}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future if cluster version is 4.0, does this work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the cluster version is 4.0, I think the downgrade target should be 3.x which is the latest v3 minor version according to downgrade policy. Since we cannot predict what the 3.x could be, I don't handle this case. I encapsulate this line into a function called allowedDowngradeVersion and add a todo there. It deserve an update before we bump to v4.0. WDYT?

// validate downgrade capability before starting downgrade
v := r.Version
if resp, err := s.downgradeValidate(ctx, v); err != nil {
return resp, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we log when downgrade request is denied?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added warning level log.

@YoyinZyc YoyinZyc requested review from gyuho and jingyih May 4, 2020 16:44
@gyuho
Copy link
Contributor

gyuho commented May 12, 2020

@YoyinZyc Sorry for delay. Will take a look.

@gyuho gyuho self-assigned this May 12, 2020
ErrKeyNotFound = errors.New("etcdserver: key not found")
ErrCorrupt = errors.New("etcdserver: corrupt cluster")
ErrBadLeaderTransferee = errors.New("etcdserver: bad leader transferee")
ErrClusterVersionUnavailable = errors.New("etcdserver: cluster version is unavailable")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update all new error message with "downgrade"; "etcdserver: cluster version is unavailable" is not clear, which cluster version it refers to. It can be something like "etcdserver: target downgrade cluster version is unavailable"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or "etcdserver: cluster version not found during downgrade

ErrWrongVersionFormat = errors.New("etcdserver: wrong version format")
ErrInvalidDowngradeTargetVersion = errors.New("etcdserver: invalid target version")
ErrIsDowngrading = errors.New("etcdserver: cluster has an ongoing downgrade job")
ErrIsNotDowngrading = errors.New("etcdserver: cluster is not downgrading")
Copy link
Contributor

@gyuho gyuho May 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we name this ErrNoInflightDowngrade?

ErrClusterVersionUnavailable = errors.New("etcdserver: cluster version is unavailable")
ErrWrongVersionFormat = errors.New("etcdserver: wrong version format")
ErrInvalidDowngradeTargetVersion = errors.New("etcdserver: invalid target version")
ErrIsDowngrading = errors.New("etcdserver: cluster has an ongoing downgrade job")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we name this ErrDowngradeInProgress?

downgradeInfo := s.cluster.DowngradeInfo()
if downgradeInfo.Enabled {
// Todo: return the downgrade status along with the error msg
return resp, ErrIsDowngrading
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return nil, ...?

allowedTargetVersion := allowedDowngradeVersion(cv)
if !targetVersion.Equal(*allowedTargetVersion) {
err = ErrInvalidDowngradeTargetVersion
return nil, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just return nil, ErrInvalidDowngradeTargetVersion?

Copy link
Contributor

@gyuho gyuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great.

I've requested some minor changes in error messages.

Thanks!

@YoyinZyc YoyinZyc force-pushed the downgrade-server branch from 83f404f to d230e6b Compare May 15, 2020 21:46
@YoyinZyc
Copy link
Contributor Author

Overall looks great.

I've requested some minor changes in error messages.

Thanks!

Fixed all your comments. PHAL again. Thanks!

@YoyinZyc YoyinZyc requested a review from gyuho May 18, 2020 05:01
Copy link
Contributor

@gyuho gyuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thx!

@gyuho gyuho merged commit 5e2815e into etcd-io:master May 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants