(no)StoreV2 (Part 2): Prepare to read membership information from backend #12820
Conversation
Force-pushed 4843d3e to d3fa68f
Force-pushed 962d0e3 to e9f6cd6
Codecov Report
```diff
@@            Coverage Diff             @@
##           master   #12820      +/-   ##
==========================================
- Coverage   66.43%   65.57%   -0.86%
==========================================
  Files         410      424      +14
  Lines       32739    33226     +487
==========================================
+ Hits        21749    21787      +38
- Misses       9081     9353     +272
- Partials     1909     2086     +177
```
Force-pushed e9f6cd6 to c61c97d
Let's be careful about this. IIRC, etcdserver does not use the consistent index to guard the apply of config-change entries (etcd/server/etcdserver/server.go, line 2028 at e24e72c).
If we switch to restoring membership from the v3 backend, this means we could potentially re-apply the config changes?
Good catch that we don't have the same shouldApplyV3 guard that we have for regular entries (etcd/server/etcdserver/server.go, line 2060 at e24e72c).
Taking into consideration that the StoreV2 membership information is written in exactly the same place where the Backend is updated (etcd/server/etcdserver/api/membership/cluster.go, lines 376 to 381 at e24e72c): if there were a problem, we would have it already. Maybe membership edit operations are re-entrant and that is why it's working?
The V2 store is not as persistent as the V3 backend: the V2 store is in memory. Upon restart, it is initially recovered from the last snapshot (and then raft will replay the WAL entries since the last snapshot), whereas the V3 backend initially has all the data up to the applied index.
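To make the guard being discussed concrete, here is a minimal sketch (names and shape are assumptions for illustration, not the actual etcd code) of how a consistent-index check prevents re-applying entries to the v3 backend, while the in-memory v2 store is always replayed:

```go
package main

import "fmt"

// shouldApplyV3 reports whether an entry at entryIndex still needs to be
// applied to the v3 backend: the backend persists a consistent index, so
// anything at or below it was already applied before the restart.
func shouldApplyV3(consistentIndex, entryIndex uint64) bool {
	return entryIndex > consistentIndex
}

func main() {
	// The backend already contains everything up to index 100; the WAL
	// replay after a restart hands us entries 99..101.
	for _, idx := range []uint64{99, 100, 101} {
		fmt.Printf("entry %d: apply to v2store=true, apply to backend=%v\n",
			idx, shouldApplyV3(100, idx))
	}
}
```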
Force-pushed 8e3393d to 8343d16
…lied on v2store

The ClusterVersionSet, ClusterMemberAttrSet, and DowngradeInfoSet functions write both to the V2store and the backend. Prior to this CL, they were in a branch that was not executed if shouldApplyV3 was false, e.g. during restore, when the Backend is up to date (has a high consistency-index) while the v2store requires replay from the WAL log. The most serious consequence of this bug was that the v2store after restore could have a different index (revision) than the exact same store before restore, so potentially different content between replicas. This change also suppresses double-applying of Membership (ClusterConfig) changes on the Backend (store v3), which luckily are not part of the MVCC/KeyValue store, so they didn't cause Revisions to be bumped. Inspired by jingyih@ comment: etcd-io#12820 (comment)
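A minimal sketch of the apply pattern this commit message describes (the types below are illustrative assumptions, not the real etcd code): the v2store write happens unconditionally because the in-memory store is rebuilt from the WAL on every restart, while the backend write is skipped when its consistent index is already ahead.

```go
package main

import "fmt"

// applier stands in for the server's apply path; both stores are modeled
// as plain maps for the sake of the sketch.
type applier struct {
	v2store map[string]string // in-memory store, rebuilt from the WAL
	backend map[string]string // bbolt-backed store with a consistent index
}

// setClusterVersion mirrors the fixed branch structure: always update the
// v2store, and update the backend only when shouldApplyV3 says it is behind.
func (a *applier) setClusterVersion(ver string, shouldApplyV3 bool) {
	a.v2store["clusterVersion"] = ver
	if shouldApplyV3 {
		a.backend["clusterVersion"] = ver
	}
}

func main() {
	a := &applier{
		v2store: map[string]string{},
		backend: map[string]string{"clusterVersion": "3.5.0"},
	}
	// During WAL replay after a restore the backend is already up to date,
	// so shouldApplyV3 is false and only the v2store is rewritten.
	a.setClusterVersion("3.5.0", false)
	fmt.Println(a.v2store["clusterVersion"], a.backend["clusterVersion"])
}
```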
Force-pushed 8343d16 to 486a3c1
@jingyih: Thank you for the comments. Based on this I discovered a more serious issue: ClusterVersion information was not saved to the v2store when restoring with a recent consistency_index.
Force-pushed 486a3c1 to 16e6aec
@jingyih @hexfusion: Please take a look.
```go
be := backend.NewDefaultBackend(dbpath)
defer be.Close()

ci := cindex.NewConsistentIndex(be.BatchTx())
```
Why is setting consistent index removed from this function?
Good observation. Trying to develop a good test that would catch such inconsistency...
@jingyih: PTAL. I fixed this problem and added a test that would detect it. See server/verify/verify.go.
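As an aside, the kind of invariant such a verification step might assert can be sketched as follows (a hypothetical check, not the contents of server/verify/verify.go): the consistent index persisted in the backend must not be behind the index the restore claims to have applied, otherwise entries could be double-applied later.

```go
package verify

import "fmt"

// verifyConsistentIndex is a hypothetical post-restore check comparing the
// consistent index read from the backend against the expected applied index.
func verifyConsistentIndex(persisted, expected uint64) error {
	if persisted < expected {
		return fmt.Errorf("backend consistent index %d is behind expected index %d",
			persisted, expected)
	}
	return nil
}
```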
Force-pushed e1ffa9a to a18e995
correct 'backend' (bbolt) context in aspect of membership

Prior to this change, the 'restored' backend used to still contain:

- the old member ID (MVCC deletion was used; this is why the membership is in a bolt bucket, but not in the MVCC part):

```go
mvs := mvcc.NewStore(s.lg, be, lessor, ci, mvcc.StoreConfig{CompactionBatchLimit: math.MaxInt32})
defer mvs.Close()
txn := mvs.Write(traceutil.TODO())
btx := be.BatchTx()
del := func(k, v []byte) error {
	txn.DeleteRange(k, nil)
	return nil
}
// delete stored members from old cluster since using new members
btx.UnsafeForEach([]byte("members"), del)
```

- the new members were not added.
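A hedged sketch of the complementary half of that fix (the helper name, the trimmed-down Member type, the JSON encoding, and the 3.5-era import paths are all assumptions, not the exact etcd helpers): after trimming the stale members, the restore path also writes the new cluster's members into the backend's "members" bucket.

```go
package membership

import (
	"encoding/json"

	"go.etcd.io/etcd/pkg/v3/types"
	"go.etcd.io/etcd/server/v3/mvcc/backend"
)

// Member is trimmed down to the fields needed for the sketch; the real
// membership.Member carries more attributes.
type Member struct {
	ID       types.ID `json:"id"`
	PeerURLs []string `json:"peerURLs"`
}

// saveMembersToBackend is a hypothetical helper: it stores each member of
// the restored cluster under the "members" bucket, keyed by member ID.
// It assumes the bucket already exists in the backend.
func saveMembersToBackend(be backend.Backend, members map[types.ID]*Member) error {
	tx := be.BatchTx()
	tx.Lock()
	defer tx.Unlock()
	for id, m := range members {
		data, err := json.Marshal(m)
		if err != nil {
			return err
		}
		tx.UnsafePut([]byte("members"), []byte(id.String()), data)
	}
	return nil
}
```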
Force-pushed a18e995 to ac295e5
Force-pushed dd73aba to befaf41
```go
// TrimMembershipFromBackend removes all information about members &
// removed_members from the v3 backend.
func TrimMembershipFromBackend(lg *zap.Logger, be backend.Backend) error {
```
Can we log some info and errors here? lg is not used?
Done.
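For context, a sketch of what the logging version of this function might look like (an assumed shape, not the exact code merged in this PR; the bucket names and zap fields are illustrative):

```go
package membership

import (
	"go.etcd.io/etcd/server/v3/mvcc/backend"
	"go.uber.org/zap"
)

// TrimMembershipFromBackend removes all information about members and
// removed_members from the v3 backend, logging every key it deletes.
func TrimMembershipFromBackend(lg *zap.Logger, be backend.Backend) error {
	lg.Info("trimming membership information from the backend")
	tx := be.BatchTx()
	tx.Lock()
	defer tx.Unlock()
	for _, bucket := range [][]byte{[]byte("members"), []byte("members_removed")} {
		bucket := bucket
		if err := tx.UnsafeForEach(bucket, func(k, v []byte) error {
			tx.UnsafeDelete(bucket, k)
			lg.Debug("removed member from the backend",
				zap.String("bucket", string(bucket)),
				zap.String("key", string(k)))
			return nil
		}); err != nil {
			return err
		}
	}
	return nil
}
```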
Force-pushed 9ef142c to 0675219
LGTM
Preparation work to read membership information from the backend instead of v2store.

This change fixes etcdctl snapshot restore, which was not properly updating membership information in the backend. This change does not impact the behavior of the etcd binary itself; the membership information is still read from v2store. A separate PR that changes this will follow.
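For reference, the restore path being fixed can also be exercised programmatically; below is a hedged sketch using the clientv3 snapshot package (the RestoreConfig fields reflect the API around the 3.4/3.5 era and should be treated as assumptions, as should the file names and URLs):

```go
package main

import (
	"go.etcd.io/etcd/clientv3/snapshot"
	"go.uber.org/zap"
)

func main() {
	// Restore snap.db into a fresh data dir for a single-member cluster;
	// with this PR's fix, the membership written into the bbolt backend
	// matches the InitialCluster passed here.
	m := snapshot.NewV3(zap.NewExample())
	if err := m.Restore(snapshot.RestoreConfig{
		SnapshotPath:        "snap.db",
		Name:                "m1",
		OutputDataDir:       "m1.etcd",
		PeerURLs:            []string{"http://127.0.0.1:2380"},
		InitialCluster:      "m1=http://127.0.0.1:2380",
		InitialClusterToken: "etcd-cluster-1",
	}); err != nil {
		panic(err)
	}
}
```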