Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] fix revision loss issue caused by compaction - 17780 #17815

Merged
merged 2 commits into from
Apr 22, 2024

Conversation

fuweid
Copy link
Member

@fuweid fuweid commented Apr 17, 2024

@k8s-ci-robot
Copy link

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@fuweid fuweid marked this pull request as ready for review April 17, 2024 14:18
@fuweid
Copy link
Member Author

fuweid commented Apr 17, 2024

Updated that case to review.

Assume that compacted main revision is 201 and the 199/200/201 are tomestone. Before set compacted finished, the etcd has deleted 199/200/201 revision. And then it panics. After restart, the max revision is 198.

The 199/200/201 requests are using TXN to based on previous revision. Even if we replay WAL, these requests still failed. WAL won't help here because the base, bbolt, isn't align with WAL.

cc @ahrtr @serathius @siyuanfoundation @chaochn47

@ahrtr
Copy link
Member

ahrtr commented Apr 17, 2024

Thanks @fuweid

Confirmed that this is a real issue. My previous comment #17780 (comment) isn't correct. The root cause is that Compaction will remove the last revision if it's a tombstone revision.

One solution I can think of is to add a protection: Never compact the latest revision, no matter it's a tombstone or not.

diff --git a/server/storage/mvcc/kvstore_compaction.go b/server/storage/mvcc/kvstore_compaction.go
index 45409114e..351a4efb1 100644
--- a/server/storage/mvcc/kvstore_compaction.go
+++ b/server/storage/mvcc/kvstore_compaction.go
@@ -54,7 +54,7 @@ func (s *store) scheduleCompaction(compactMainRev, prevCompactRev int64) (KeyVal
                keys, values := tx.UnsafeRange(schema.Key, last, end, int64(batchNum))
                for i := range keys {
                        rev = BytesToRev(keys[i])
-                       if _, ok := keep[rev]; !ok {
+                       if _, ok := keep[rev]; !ok && rev.Main != s.currentRev {
                                tx.UnsafeDelete(schema.Key, keys[i])
                                keyCompactions++
                        }

@serathius
Copy link
Member

Awesome finding @fuweid, can you confirm if the issue affects v3.4 or v3.5?

@serathius serathius added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release/v3.4 labels Apr 17, 2024
@serathius
Copy link
Member

Reproduced it on release-3.4 so I expect it's also on 3.5.

tests/e2e/reproduce_17780_test.go Outdated Show resolved Hide resolved
tests/e2e/reproduce_17780_test.go Show resolved Hide resolved
@fuweid
Copy link
Member Author

fuweid commented Apr 17, 2024

Reproduced it on release-3.4 so I expect it's also on 3.5.

For release-3.4, I add this patch and we can reproduce it in 3.4

Click!
commit 366110808a969e2ace1ee7575868ed916b4f0af6 (HEAD -> with-compactBeforeSetFinishedCompact-34)
Author: Wei Fu <fuweid89@gmail.com>
Date:   Wed Apr 17 23:47:30 2024 +0800

    --wip-- [skip ci]

    Signed-off-by: Wei Fu <fuweid89@gmail.com>

diff --git a/mvcc/kvstore_compaction.go b/mvcc/kvstore_compaction.go
index 963ebe950..10035e897 100644
--- a/mvcc/kvstore_compaction.go
+++ b/mvcc/kvstore_compaction.go
@@ -49,6 +49,7 @@ func (s *store) scheduleCompaction(compactMainRev int64, keep map[revision]struc
                }

                if len(keys) < s.cfg.CompactionBatchLimit {
+                       // gofail: var compactBeforeSetFinishedCompact struct{}
                        rbytes := make([]byte, 8+1+8)
                        revToBytes(revision{main: compactMainRev}, rbytes)
                        tx.UnsafePut(metaBucketName, finishedCompactKeyName, rbytes)
/tmp/17780 -test.v -test.run TestReproduce17780
=== RUN   TestReproduce17780
    before.go:36: Changing working directory to: /tmp/TestReproduce177801122229632/001
    logger.go:146: 2024-04-17T23:56:21.409+0800 INFO    starting server...      {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:56:21.409+0800 INFO    spawning process        {"args": ["/home/fuweid/workspace/etcd/bin/etcd", "--name=TestReproduce17780-test-0", "--listen-client-urls=http://localhost:20000", "--advertise-client-urls=http://localhost:20000", "--listen-peer-urls=http://localhost:20001", "--initial-advertise-peer-urls=http://localhost:20001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestReproduce177801122229632/002", "--snapshot-count=1000", "--experimental-compaction-batch-limit=100", "--experimental-watch-progress-notify-interval=100ms", "--initial-cluster-token=new", "--snapshot-count=1000", "--initial-cluster=TestReproduce17780-test-0=http://localhost:20001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestReproduce177801122229632/001", "name": "TestReproduce17780-test-0", "environment-variables": ["GOFAIL_HTTP=127.0.0.1:12381", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/Program Files/Go/bin:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/weifu/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/weifu/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/weifu/go/bin:/home/fuweid/.fzf/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin", "EXPECT_DEBUG=true", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): WARNING: Package "github.com/golang/protobuf/protoc-gen-go/generator" is deprecated.
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      A future release of golang/protobuf will delete this package,
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      which has long been excluded from the compatibility promise.
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425693 W | pkg/flags: unrecognized environment variable ETCD_UNSUPPORTED_ARCH=amd64
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425754 W | pkg/flags: unrecognized environment variable ETCD_VERIFY=all
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): [WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425776 W | embed: Running http and grpc server on single port. This is not recommended for production.
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425796 I | etcdmain: etcd Version: 3.4.31
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425799 I | etcdmain: Git SHA: 366110808-FAILPOINTS
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425802 I | etcdmain: Go Version: go1.22.1
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425804 I | etcdmain: Go OS/Arch: linux/amd64
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425822 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): [WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.425892 W | embed: Running http and grpc server on single port. This is not recommended for production.
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.426226 I | embed: name = TestReproduce17780-test-0
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.426293 I | embed: data dir = /tmp/TestReproduce177801122229632/002
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.426310 I | embed: member dir = /tmp/TestReproduce177801122229632/002/member
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.426313 I | embed: heartbeat = 100ms
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.426316 I | embed: election = 1000ms
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.426318 I | embed: snapshot count = 1000
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.426324 I | embed: advertise client URLs = http://localhost:20000
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.437774 I | etcdserver: starting member ca50e9357181d758 in cluster 34f27e83b3bc2ff
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:21 INFO: ca50e9357181d758 switched to configuration voters=()
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:21 INFO: ca50e9357181d758 became follower at term 0
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:21 INFO: newRaft ca50e9357181d758 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:21 INFO: ca50e9357181d758 became follower at term 1
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:21 INFO: ca50e9357181d758 switched to configuration voters=(14578408409545168728)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.443107 W | auth: simple token is not cryptographically signed
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.447076 I | etcdserver: starting server... [version: 3.4.31, cluster version: to_be_decided]
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.447299 I | etcdserver: ca50e9357181d758 as single-node; fast-forwarding 9 ticks (election ticks 10)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.447373 I | pkg/fileutil: started to purge file, dir: /tmp/TestReproduce177801122229632/002/member/snap, suffix: snap.db, max: 5, interval: 30s
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.447404 I | pkg/fileutil: started to purge file, dir: /tmp/TestReproduce177801122229632/002/member/snap, suffix: snap, max: 5, interval: 30s
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.447413 I | pkg/fileutil: started to purge file, dir: /tmp/TestReproduce177801122229632/002/member/wal, suffix: wal, max: 5, interval: 30s
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.448894 I | embed: listening for peers on 127.0.0.1:20001
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:21 INFO: ca50e9357181d758 switched to configuration voters=(14578408409545168728)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:21.449677 I | etcdserver/membership: added member ca50e9357181d758 [http://localhost:20001] to cluster 34f27e83b3bc2ff
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:22 INFO: ca50e9357181d758 is starting a new election at term 1
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:22 INFO: ca50e9357181d758 became candidate at term 2
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:22 INFO: ca50e9357181d758 received MsgVoteResp from ca50e9357181d758 at term 2
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:22 INFO: ca50e9357181d758 became leader at term 2
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): raft2024/04/17 23:56:22 INFO: raft.node: ca50e9357181d758 elected leader ca50e9357181d758 at term 2
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:22.141822 I | etcdserver: published {Name:TestReproduce17780-test-0 ClientURLs:[http://localhost:20000]} to cluster 34f27e83b3bc2ff
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:22.141959 I | embed: ready to serve client requests
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:22.142177 I | etcdserver: setting up the initial cluster version to 3.4
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:22.143454 N | embed: serving insecure client requests on 127.0.0.1:20000, this is strongly discouraged!
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:22.143564 N | etcdserver/membership: set the initial cluster version to 3.4
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:22.143635 I | etcdserver/api: enabled capabilities for version 3.4
    logger.go:146: 2024-04-17T23:56:22.143+0800 INFO    started server. {"name": "TestReproduce17780-test-0", "pid": 312607}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): 2024-04-17 23:56:22.322070 I | mvcc: store.index: compact 201
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): panic: failpoint panic: {}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): goroutine 179 [running]:
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): go.etcd.io/gofail/runtime.actPanic(0x1010100052200?)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      /home/fuweid/go/pkg/mod/go.etcd.io/gofail@v0.1.0/runtime/terms.go:318 +0x65
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): go.etcd.io/gofail/runtime.(*term).do(...)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      /home/fuweid/go/pkg/mod/go.etcd.io/gofail@v0.1.0/runtime/terms.go:290
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): go.etcd.io/gofail/runtime.(*terms).eval(0xc0000c9628?)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      /home/fuweid/go/pkg/mod/go.etcd.io/gofail@v0.1.0/runtime/terms.go:105 +0xe3
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): go.etcd.io/gofail/runtime.(*Failpoint).Acquire(0xc0000c9628?)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      /home/fuweid/go/pkg/mod/go.etcd.io/gofail@v0.1.0/runtime/failpoint.go:38 +0x98
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): go.etcd.io/etcd/mvcc.(*store).scheduleCompaction(0xc000292690, 0xc9, 0xc00060c570)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      /home/fuweid/workspace/etcd/mvcc/kvstore_compaction.go:52 +0x4c5
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): go.etcd.io/etcd/mvcc.(*store).compact.func1({0x11a8f38, 0xc000130190})
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      /home/fuweid/workspace/etcd/mvcc/kvstore.go:289 +0xe7
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): go.etcd.io/etcd/pkg/schedule.(*fifo).run(0xc0003324e0)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      /home/fuweid/workspace/etcd/pkg/schedule/schedule.go:157 +0x105
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607): created by go.etcd.io/etcd/pkg/schedule.NewFIFOScheduler in goroutine 1
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312607):      /home/fuweid/workspace/etcd/pkg/schedule/schedule.go:70 +0x156
{"level":"warn","ts":"2024-04-17T23:56:22.350004+0800","logger":"etcd-client","caller":"v3/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0001fa000/localhost:20000","method":"/etcdserverpb.KV/Compact","attempt":0,"error":"rpc error: code = Unavailable desc = error reading from server: EOF"}
    logger.go:146: 2024-04-17T23:56:22.350+0800 INFO    restarting server...    {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:56:22.350+0800 INFO    stopping server...      {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:56:22.350+0800 INFO    stopped server. {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:56:22.350+0800 INFO    starting server...      {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:56:22.350+0800 INFO    spawning process        {"args": ["/home/fuweid/workspace/etcd/bin/etcd", "--name=TestReproduce17780-test-0", "--listen-client-urls=http://localhost:20000", "--advertise-client-urls=http://localhost:20000", "--listen-peer-urls=http://localhost:20001", "--initial-advertise-peer-urls=http://localhost:20001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestReproduce177801122229632/002", "--snapshot-count=1000", "--experimental-compaction-batch-limit=100", "--experimental-watch-progress-notify-interval=100ms", "--initial-cluster-token=new", "--snapshot-count=1000", "--initial-cluster=TestReproduce17780-test-0=http://localhost:20001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestReproduce177801122229632/001", "name": "TestReproduce17780-test-0", "environment-variables": ["GOFAIL_HTTP=127.0.0.1:12381", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/Program Files/Go/bin:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/weifu/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/weifu/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/weifu/go/bin:/home/fuweid/.fzf/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin", "EXPECT_DEBUG=true", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): WARNING: Package "github.com/golang/protobuf/protoc-gen-go/generator" is deprecated.
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622):      A future release of golang/protobuf will delete this package,
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622):      which has long been excluded from the compatibility promise.
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622):
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363408 W | pkg/flags: unrecognized environment variable ETCD_UNSUPPORTED_ARCH=amd64
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363457 W | pkg/flags: unrecognized environment variable ETCD_VERIFY=all
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): [WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363472 W | embed: Running http and grpc server on single port. This is not recommended for production.
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363475 I | etcdmain: etcd Version: 3.4.31
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363477 I | etcdmain: Git SHA: 366110808-FAILPOINTS
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363478 I | etcdmain: Go Version: go1.22.1
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363480 I | etcdmain: Go OS/Arch: linux/amd64
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363481 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363516 N | etcdmain: the server is already initialized as member before, starting as etcd member...
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): [WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363541 W | embed: Running http and grpc server on single port. This is not recommended for production.
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363770 I | embed: name = TestReproduce17780-test-0
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363776 I | embed: data dir = /tmp/TestReproduce177801122229632/002
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363778 I | embed: member dir = /tmp/TestReproduce177801122229632/002/member
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363780 I | embed: heartbeat = 100ms
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363782 I | embed: election = 1000ms
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363785 I | embed: snapshot count = 1000
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363797 I | embed: advertise client URLs = http://localhost:20000
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363816 I | embed: initial advertise peer URLs = http://localhost:20001
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.363821 I | embed: initial cluster =
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.369763 I | etcdserver: restarting member ca50e9357181d758 in cluster 34f27e83b3bc2ff at commit index 205
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:22 INFO: ca50e9357181d758 switched to configuration voters=()
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:22 INFO: ca50e9357181d758 became follower at term 2
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:22 INFO: newRaft ca50e9357181d758 [peers: [], term: 2, commit: 205, applied: 0, lastindex: 205, lastterm: 2]
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.370883 W | auth: simple token is not cryptographically signed
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.374195 I | mvcc: resume scheduled compaction at 201
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.374969 I | etcdserver: starting server... [version: 3.4.31, cluster version: to_be_decided]
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.375050 I | pkg/fileutil: started to purge file, dir: /tmp/TestReproduce177801122229632/002/member/snap, suffix: snap.db, max: 5, interval: 30s
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.375073 I | pkg/fileutil: started to purge file, dir: /tmp/TestReproduce177801122229632/002/member/snap, suffix: snap, max: 5, interval: 30s
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.375091 I | pkg/fileutil: started to purge file, dir: /tmp/TestReproduce177801122229632/002/member/wal, suffix: wal, max: 5, interval: 30s
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:22 INFO: ca50e9357181d758 switched to configuration voters=(14578408409545168728)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.375360 I | etcdserver/membership: added member ca50e9357181d758 [http://localhost:20001] to cluster 34f27e83b3bc2ff
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.375429 N | etcdserver/membership: set the initial cluster version to 3.4
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.375457 I | etcdserver/api: enabled capabilities for version 3.4
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.376267 W | etcdserver: failed to apply request "compaction:<revision:201 physical:true > header:<ID:15517309663689981645 > " with response "" took (3.312µs) to execute, err is mvcc: required revision is a future revision
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:22.376345 I | embed: listening for peers on 127.0.0.1:20001
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:23 INFO: ca50e9357181d758 is starting a new election at term 2
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:23 INFO: ca50e9357181d758 became candidate at term 3
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:23 INFO: ca50e9357181d758 received MsgVoteResp from ca50e9357181d758 at term 3
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:23 INFO: ca50e9357181d758 became leader at term 3
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): raft2024/04/17 23:56:23 INFO: raft.node: ca50e9357181d758 elected leader ca50e9357181d758 at term 3
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:23.376415 I | etcdserver: published {Name:TestReproduce17780-test-0 ClientURLs:[http://localhost:20000]} to cluster 34f27e83b3bc2ff
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:23.376505 I | embed: ready to serve client requests
    logger.go:146: 2024-04-17T23:56:23.376+0800 INFO    started server. {"name": "TestReproduce17780-test-0", "pid": 312622}
    logger.go:146: 2024-04-17T23:56:23.376+0800 INFO    restarted server        {"name": "TestReproduce17780-test-0"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:23.377538 N | embed: serving insecure client requests on 127.0.0.1:20000, this is strongly discouraged!
    reproduce_17780_test.go:84:
                Error Trace:    /home/fuweid/workspace/etcd/tests/e2e/reproduce_17780_test.go:84
                Error:          "198" is not greater than or equal to "201"
                Test:           TestReproduce17780
    logger.go:146: 2024-04-17T23:56:23.379+0800 INFO    stopping server...      {"name": "TestReproduce17780-test-0"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:23.380180 N | pkg/osutil: received terminated signal, shutting down...
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:23.380339 W | embed: stopping insecure grpc server due to error: accept tcp 127.0.0.1:20000: use of closed network connection
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:23.380398 W | embed: stopped insecure grpc server due to error: accept tcp 127.0.0.1:20000: use of closed network connection
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (312622): 2024-04-17 23:56:23.380425 I | etcdserver: skipped leadership transfer for single voting member cluster
    logger.go:146: 2024-04-17T23:56:23.387+0800 INFO    stopped server. {"name": "TestReproduce17780-test-0"}
--- FAIL: TestReproduce17780 (1.98s)
FAIL

For release-3.5, I can reproduce it as well.

Click! ```diff commit 9d571ff1a76597f515da12dad17b9d69cf2ddb04 (HEAD -> with-compactBeforeSetFinishedCompact-35) Author: Wei Fu Date: Wed Apr 17 23:53:24 2024 +0800
--wip-- [skip ci]

diff --git a/server/mvcc/kvstore_compaction.go b/server/mvcc/kvstore_compaction.go
index c7d343d5c..89defbd9e 100644
--- a/server/mvcc/kvstore_compaction.go
+++ b/server/mvcc/kvstore_compaction.go
@@ -59,6 +59,7 @@ func (s *store) scheduleCompaction(compactMainRev, prevCompactRev int64) (KeyVal
}

            if len(keys) < s.cfg.CompactionBatchLimit {
  •                   // gofail: var compactBeforeSetFinishedCompact struct{}
                      rbytes := make([]byte, 8+1+8)
                      revToBytes(revision{main: compactMainRev}, rbytes)
                      tx.UnsafePut(buckets.Meta, finishedCompactKeyName, rbytes)
    

```bash
➜  e2e git:(with-compactBeforeSetFinishedCompact-35) ✗ /tmp/17780 -test.v -test.run TestReproduce17780
=== RUN   TestReproduce17780
    before.go:36: Changing working directory to: /tmp/TestReproduce17780628008084/001
    logger.go:146: 2024-04-17T23:55:05.075+0800 INFO    starting server...      {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:55:05.075+0800 INFO    spawning process        {"args": ["/home/fuweid/workspace/etcd/bin/etcd", "--name=TestReproduce17780-test-0", "--listen-client-urls=http://localhost:20000", "--advertise-client-urls=http://localhost:20000", "--listen-peer-urls=http://localhost:20001", "--initial-advertise-peer-urls=http://localhost:20001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestReproduce17780628008084/002", "--snapshot-count=1000", "--experimental-compaction-batch-limit=100", "--snapshot-count=1000", "--initial-cluster-token=new", "--experimental-watch-progress-notify-interval=100ms", "--initial-cluster=TestReproduce17780-test-0=http://localhost:20001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestReproduce17780628008084/001", "name": "TestReproduce17780-test-0", "environment-variables": ["GOFAIL_HTTP=127.0.0.1:12381", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/Program Files/Go/bin:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/weifu/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/weifu/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/weifu/go/bin:/home/fuweid/.fzf/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin", "EXPECT_DEBUG=true", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"warn","ts":"2024-04-17T23:55:05.097899+0800","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_UNSUPPORTED_ARCH=amd64"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"warn","ts":"2024-04-17T23:55:05.097972+0800","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_VERIFY=all"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"warn","ts":"2024-04-17T23:55:05.098001+0800","caller":"embed/config.go:679","msg":"Running http and grpc server on single port. This is not recommended for production."}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.098026+0800","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["/home/fuweid/workspace/etcd/bin/etcd","--name=TestReproduce17780-test-0","--listen-client-urls=http://localhost:20000","--advertise-client-urls=http://localhost:20000","--listen-peer-urls=http://localhost:20001","--initial-advertise-peer-urls=http://localhost:20001","--initial-cluster-token=new","--data-dir","/tmp/TestReproduce17780628008084/002","--snapshot-count=1000","--experimental-compaction-batch-limit=100","--snapshot-count=1000","--initial-cluster-token=new","--experimental-watch-progress-notify-interval=100ms","--initial-cluster=TestReproduce17780-test-0=http://localhost:20001","--initial-cluster-state=new"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"warn","ts":"2024-04-17T23:55:05.098085+0800","caller":"embed/config.go:679","msg":"Running http and grpc server on single port. This is not recommended for production."}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.098104+0800","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:20001"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.098421+0800","caller":"embed/etcd.go:135","msg":"configuring client listeners","listen-client-urls":["http://localhost:20000"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.098648+0800","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.13","git-sha":"9d571ff1a-FAILPOINTS","go-version":"go1.22.1","go-os":"linux","go-arch":"amd64","max-cpu-set":8,"max-cpu-available":8,"member-initialized":false,"name":"TestReproduce17780-test-0","data-dir":"/tmp/TestReproduce17780628008084/002","wal-dir":"","wal-dir-dedicated":"","member-dir":"/tmp/TestReproduce17780628008084/002/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":1000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:20001"],"listen-peer-urls":["http://localhost:20001"],"advertise-client-urls":["http://localhost:20000"],"listen-client-urls":["http://localhost:20000"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"TestReproduce17780-test-0=http://localhost:20001","initial-cluster-state":"new","initial-cluster-token":"new","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.101111+0800","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/tmp/TestReproduce17780628008084/002/member/snap/db","took":"1.858641ms"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.104436+0800","caller":"etcdserver/raft.go:495","msg":"starting local member","local-member-id":"ca50e9357181d758","cluster-id":"34f27e83b3bc2ff"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.104501+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 switched to configuration voters=()"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.104533+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became follower at term 0"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.104564+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft ca50e9357181d758 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.10457+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became follower at term 1"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.104591+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 switched to configuration voters=(14578408409545168728)"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"warn","ts":"2024-04-17T23:55:05.108249+0800","caller":"auth/store.go:1241","msg":"simple token is not cryptographically signed"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.110604+0800","caller":"mvcc/kvstore.go:407","msg":"kvstore restored","current-rev":1}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.112006+0800","caller":"etcdserver/quota.go:94","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.113313+0800","caller":"etcdserver/server.go:867","msg":"starting etcd server","local-member-id":"ca50e9357181d758","local-server-version":"3.5.13","cluster-version":"to_be_decided"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.113431+0800","caller":"etcdserver/server.go:751","msg":"started as single-node; fast-forwarding election ticks","local-member-id":"ca50e9357181d758","forward-ticks":9,"forward-duration":"900ms","election-ticks":10,"election-timeout":"1s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.11352+0800","caller":"fileutil/purge.go:50","msg":"started to purge file","dir":"/tmp/TestReproduce17780628008084/002/member/snap","suffix":"snap.db","max":5,"interval":"30s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.113652+0800","caller":"fileutil/purge.go:50","msg":"started to purge file","dir":"/tmp/TestReproduce17780628008084/002/member/snap","suffix":"snap","max":5,"interval":"30s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.113662+0800","caller":"fileutil/purge.go:50","msg":"started to purge file","dir":"/tmp/TestReproduce17780628008084/002/member/wal","suffix":"wal","max":5,"interval":"30s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.114298+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 switched to configuration voters=(14578408409545168728)"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.11439+0800","caller":"embed/etcd.go:277","msg":"now serving peer/client/metrics","local-member-id":"ca50e9357181d758","initial-advertise-peer-urls":["http://localhost:20001"],"listen-peer-urls":["http://localhost:20001"],"advertise-client-urls":["http://localhost:20000"],"listen-client-urls":["http://localhost:20000"],"listen-metrics-urls":[]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.11442+0800","caller":"embed/etcd.go:597","msg":"serving peer traffic","address":"127.0.0.1:20001"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.114432+0800","caller":"membership/cluster.go:421","msg":"added member","cluster-id":"34f27e83b3bc2ff","local-member-id":"ca50e9357181d758","added-peer-id":"ca50e9357181d758","added-peer-peer-urls":["http://localhost:20001"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.114448+0800","caller":"embed/etcd.go:569","msg":"cmux::serve","address":"127.0.0.1:20001"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.405359+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 is starting a new election at term 1"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.405484+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became pre-candidate at term 1"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.405531+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 received MsgPreVoteResp from ca50e9357181d758 at term 1"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.405562+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became candidate at term 2"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.405575+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 received MsgVoteResp from ca50e9357181d758 at term 2"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.405591+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became leader at term 2"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.405647+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: ca50e9357181d758 elected leader ca50e9357181d758 at term 2"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.408244+0800","caller":"etcdserver/server.go:2620","msg":"setting up initial cluster version using v2 API","cluster-version":"3.5"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.409613+0800","caller":"etcdserver/server.go:2110","msg":"published local member to cluster through raft","local-member-id":"ca50e9357181d758","local-member-attributes":"{Name:TestReproduce17780-test-0 ClientURLs:[http://localhost:20000]}","request-path":"/0/members/ca50e9357181d758/attributes","cluster-id":"34f27e83b3bc2ff","publish-timeout":"7s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.40964+0800","caller":"embed/serve.go:103","msg":"ready to serve client requests"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.410041+0800","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.410106+0800","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.410204+0800","caller":"membership/cluster.go:584","msg":"set initial cluster version","cluster-id":"34f27e83b3bc2ff","local-member-id":"ca50e9357181d758","cluster-version":"3.5"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.410499+0800","caller":"api/capability.go:75","msg":"enabled capabilities for version","cluster-version":"3.5"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.410527+0800","caller":"etcdserver/server.go:2644","msg":"cluster version is updated","cluster-version":"3.5"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.411014+0800","caller":"embed/serve.go:187","msg":"serving client traffic insecurely; this is strongly discouraged!","traffic":"grpc+http","address":"127.0.0.1:20000"}
    logger.go:146: 2024-04-17T23:55:05.412+0800 INFO    started server. {"name": "TestReproduce17780-test-0", "pid": 311480}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): {"level":"info","ts":"2024-04-17T23:55:05.60707+0800","caller":"mvcc/index.go:214","msg":"compact tree index","revision":201}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): panic: failpoint panic: {}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): goroutine 149 [running]:
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): go.etcd.io/gofail/runtime.actPanic(0x10100000001?)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):      go.etcd.io/gofail@v0.1.0/runtime/terms.go:318 +0x65
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): go.etcd.io/gofail/runtime.(*term).do(...)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):      go.etcd.io/gofail@v0.1.0/runtime/terms.go:290
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): go.etcd.io/gofail/runtime.(*terms).eval(0xb25d65?)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):      go.etcd.io/gofail@v0.1.0/runtime/terms.go:105 +0xe3
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): go.etcd.io/gofail/runtime.(*Failpoint).Acquire(0x64?)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):      go.etcd.io/gofail@v0.1.0/runtime/failpoint.go:38 +0x98
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): go.etcd.io/etcd/server/v3/mvcc.(*store).scheduleCompaction(0xc00011bba0, 0xc9, 0xffffffffffffffff)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):      go.etcd.io/etcd/server/v3/mvcc/kvstore_compaction.go:62 +0x6ed
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): go.etcd.io/etcd/server/v3/mvcc.(*store).compact.func1({0x12b99c8, 0xc000142640})
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):      go.etcd.io/etcd/server/v3/mvcc/kvstore.go:247 +0x85
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): go.etcd.io/etcd/pkg/v3/schedule.(*fifo).run(0xc000114c00)
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):      go.etcd.io/etcd/pkg/v3@v3.5.13/schedule/schedule.go:157 +0x105
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480): created by go.etcd.io/etcd/pkg/v3/schedule.NewFIFOScheduler in goroutine 1
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311480):      go.etcd.io/etcd/pkg/v3@v3.5.13/schedule/schedule.go:70 +0x156
{"level":"warn","ts":"2024-04-17T23:55:05.638818+0800","logger":"etcd-client","caller":"v3/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0001fe000/localhost:20000","method":"/etcdserverpb.KV/Compact","attempt":0,"error":"rpc error: code = Unavailable desc = error reading from server: EOF"}
    logger.go:146: 2024-04-17T23:55:05.638+0800 INFO    restarting server...    {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:55:05.639+0800 INFO    stopping server...      {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:55:05.639+0800 INFO    stopped server. {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:55:05.639+0800 INFO    starting server...      {"name": "TestReproduce17780-test-0"}
    logger.go:146: 2024-04-17T23:55:05.639+0800 INFO    spawning process        {"args": ["/home/fuweid/workspace/etcd/bin/etcd", "--name=TestReproduce17780-test-0", "--listen-client-urls=http://localhost:20000", "--advertise-client-urls=http://localhost:20000", "--listen-peer-urls=http://localhost:20001", "--initial-advertise-peer-urls=http://localhost:20001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestReproduce17780628008084/002", "--snapshot-count=1000", "--experimental-compaction-batch-limit=100", "--snapshot-count=1000", "--initial-cluster-token=new", "--experimental-watch-progress-notify-interval=100ms", "--initial-cluster=TestReproduce17780-test-0=http://localhost:20001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestReproduce17780628008084/001", "name": "TestReproduce17780-test-0", "environment-variables": ["GOFAIL_HTTP=127.0.0.1:12381", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/Program Files/Go/bin:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/weifu/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/weifu/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/weifu/go/bin:/home/fuweid/.fzf/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin", "EXPECT_DEBUG=true", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"warn","ts":"2024-04-17T23:55:05.653732+0800","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_UNSUPPORTED_ARCH=amd64"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"warn","ts":"2024-04-17T23:55:05.653798+0800","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_VERIFY=all"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"warn","ts":"2024-04-17T23:55:05.653827+0800","caller":"embed/config.go:679","msg":"Running http and grpc server on single port. This is not recommended for production."}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.653856+0800","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["/home/fuweid/workspace/etcd/bin/etcd","--name=TestReproduce17780-test-0","--listen-client-urls=http://localhost:20000","--advertise-client-urls=http://localhost:20000","--listen-peer-urls=http://localhost:20001","--initial-advertise-peer-urls=http://localhost:20001","--initial-cluster-token=new","--data-dir","/tmp/TestReproduce17780628008084/002","--snapshot-count=1000","--experimental-compaction-batch-limit=100","--snapshot-count=1000","--initial-cluster-token=new","--experimental-watch-progress-notify-interval=100ms","--initial-cluster=TestReproduce17780-test-0=http://localhost:20001","--initial-cluster-state=new"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.653916+0800","caller":"etcdmain/etcd.go:116","msg":"server has been already initialized","data-dir":"/tmp/TestReproduce17780628008084/002","dir-type":"member"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"warn","ts":"2024-04-17T23:55:05.653944+0800","caller":"embed/config.go:679","msg":"Running http and grpc server on single port. This is not recommended for production."}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.653959+0800","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:20001"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.654237+0800","caller":"embed/etcd.go:135","msg":"configuring client listeners","listen-client-urls":["http://localhost:20000"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.654351+0800","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.13","git-sha":"9d571ff1a-FAILPOINTS","go-version":"go1.22.1","go-os":"linux","go-arch":"amd64","max-cpu-set":8,"max-cpu-available":8,"member-initialized":true,"name":"TestReproduce17780-test-0","data-dir":"/tmp/TestReproduce17780628008084/002","wal-dir":"","wal-dir-dedicated":"","member-dir":"/tmp/TestReproduce17780628008084/002/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":1000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:20001"],"listen-peer-urls":["http://localhost:20001"],"advertise-client-urls":["http://localhost:20000"],"listen-client-urls":["http://localhost:20000"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.659835+0800","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/tmp/TestReproduce17780628008084/002/member/snap/db","took":"5.013688ms"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.660246+0800","caller":"etcdserver/server.go:532","msg":"No snapshot found. Recovering WAL from scratch!"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.660961+0800","caller":"etcdserver/raft.go:530","msg":"restarting local member","cluster-id":"34f27e83b3bc2ff","local-member-id":"ca50e9357181d758","commit-index":205}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.661034+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 switched to configuration voters=()"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.661116+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became follower at term 2"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.661128+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft ca50e9357181d758 [peers: [], term: 2, commit: 205, applied: 0, lastindex: 205, lastterm: 2]"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"warn","ts":"2024-04-17T23:55:05.662463+0800","caller":"auth/store.go:1241","msg":"simple token is not cryptographically signed"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.665303+0800","caller":"mvcc/kvstore.go:407","msg":"kvstore restored","current-rev":198}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"warn","ts":"2024-04-17T23:55:05.66535+0800","caller":"mvcc/kvstore.go:411","msg":"compaction encountered error","error":"mvcc: required revision is a future revision"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.665387+0800","caller":"mvcc/kvstore.go:414","msg":"resume scheduled compaction","meta-bucket-name":"meta","meta-bucket-name-key":"scheduledCompactRev","scheduled-compact-revision":201}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.666759+0800","caller":"etcdserver/quota.go:94","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.668+0800","caller":"etcdserver/server.go:867","msg":"starting etcd server","local-member-id":"ca50e9357181d758","local-server-version":"3.5.13","cluster-version":"to_be_decided"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.668115+0800","caller":"etcdserver/server.go:767","msg":"starting initial election tick advance","election-ticks":10}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.668178+0800","caller":"fileutil/purge.go:50","msg":"started to purge file","dir":"/tmp/TestReproduce17780628008084/002/member/snap","suffix":"snap.db","max":5,"interval":"30s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.668275+0800","caller":"fileutil/purge.go:50","msg":"started to purge file","dir":"/tmp/TestReproduce17780628008084/002/member/snap","suffix":"snap","max":5,"interval":"30s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.66829+0800","caller":"fileutil/purge.go:50","msg":"started to purge file","dir":"/tmp/TestReproduce17780628008084/002/member/wal","suffix":"wal","max":5,"interval":"30s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.668339+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 switched to configuration voters=(14578408409545168728)"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.668407+0800","caller":"membership/cluster.go:421","msg":"added member","cluster-id":"34f27e83b3bc2ff","local-member-id":"ca50e9357181d758","added-peer-id":"ca50e9357181d758","added-peer-peer-urls":["http://localhost:20001"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.668478+0800","caller":"membership/cluster.go:584","msg":"set initial cluster version","cluster-id":"34f27e83b3bc2ff","local-member-id":"ca50e9357181d758","cluster-version":"3.5"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.6685+0800","caller":"api/capability.go:75","msg":"enabled capabilities for version","cluster-version":"3.5"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.669161+0800","caller":"embed/etcd.go:277","msg":"now serving peer/client/metrics","local-member-id":"ca50e9357181d758","initial-advertise-peer-urls":["http://localhost:20001"],"listen-peer-urls":["http://localhost:20001"],"advertise-client-urls":["http://localhost:20000"],"listen-client-urls":["http://localhost:20000"],"listen-metrics-urls":[]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.669203+0800","caller":"embed/etcd.go:597","msg":"serving peer traffic","address":"127.0.0.1:20001"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:05.669225+0800","caller":"embed/etcd.go:569","msg":"cmux::serve","address":"127.0.0.1:20001"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.86158+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 is starting a new election at term 2"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.861736+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became pre-candidate at term 2"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.861779+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 received MsgPreVoteResp from ca50e9357181d758 at term 2"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.861832+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became candidate at term 3"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.861845+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 received MsgVoteResp from ca50e9357181d758 at term 3"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.861874+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"ca50e9357181d758 became leader at term 3"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.861888+0800","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: ca50e9357181d758 elected leader ca50e9357181d758 at term 3"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.868599+0800","caller":"etcdserver/server.go:2110","msg":"published local member to cluster through raft","local-member-id":"ca50e9357181d758","local-member-attributes":"{Name:TestReproduce17780-test-0 ClientURLs:[http://localhost:20000]}","request-path":"/0/members/ca50e9357181d758/attributes","cluster-id":"34f27e83b3bc2ff","publish-timeout":"7s"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.868629+0800","caller":"embed/serve.go:103","msg":"ready to serve client requests"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.868881+0800","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.868941+0800","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.869666+0800","caller":"embed/serve.go:187","msg":"serving client traffic insecurely; this is strongly discouraged!","traffic":"grpc+http","address":"127.0.0.1:20000"}
    logger.go:146: 2024-04-17T23:55:06.875+0800 INFO    started server. {"name": "TestReproduce17780-test-0", "pid": 311493}
    logger.go:146: 2024-04-17T23:55:06.875+0800 INFO    restarted server        {"name": "TestReproduce17780-test-0"}
    reproduce_17780_test.go:84:
                Error Trace:    /home/fuweid/workspace/etcd/tests/e2e/reproduce_17780_test.go:84
                Error:          "198" is not greater than or equal to "201"
                Test:           TestReproduce17780
    logger.go:146: 2024-04-17T23:55:06.877+0800 INFO    stopping server...      {"name": "TestReproduce17780-test-0"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.878169+0800","caller":"osutil/interrupt_unix.go:64","msg":"received signal; shutting down","signal":"terminated"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.878223+0800","caller":"embed/etcd.go:375","msg":"closing etcd server","name":"TestReproduce17780-test-0","data-dir":"/tmp/TestReproduce17780628008084/002","advertise-peer-urls":["http://localhost:20001"],"advertise-client-urls":["http://localhost:20000"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"warn","ts":"2024-04-17T23:55:06.878331+0800","caller":"embed/serve.go:160","msg":"stopping insecure grpc server due to error","error":"accept tcp 127.0.0.1:20000: use of closed network connection"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"warn","ts":"2024-04-17T23:55:06.878405+0800","caller":"embed/serve.go:162","msg":"stopped insecure grpc server due to error","error":"accept tcp 127.0.0.1:20000: use of closed network connection"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.878422+0800","caller":"etcdserver/server.go:1513","msg":"skipped leadership transfer for single voting member cluster","local-member-id":"ca50e9357181d758","current-leader-member-id":"ca50e9357181d758"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.882328+0800","caller":"embed/etcd.go:579","msg":"stopping serving peer traffic","address":"127.0.0.1:20001"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.882414+0800","caller":"embed/etcd.go:584","msg":"stopped serving peer traffic","address":"127.0.0.1:20001"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.882435+0800","caller":"embed/etcd.go:377","msg":"closed etcd server","name":"TestReproduce17780-test-0","data-dir":"/tmp/TestReproduce17780628008084/002","advertise-peer-urls":["http://localhost:20001"],"advertise-client-urls":["http://localhost:20000"]}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.882455+0800","caller":"verify/verify.go:57","msg":"verification of persisted state","data-dir":"/tmp/TestReproduce17780628008084/002"}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.883183+0800","caller":"verify/verify.go:129","msg":"verification: consistentIndex OK","backend-consistent-index":207,"hardstate-commit":207}
/home/fuweid/workspace/etcd/bin/etcd (TestReproduce17780-test-0) (311493): {"level":"info","ts":"2024-04-17T23:55:06.884253+0800","caller":"verify/verify.go:68","msg":"verification of persisted state successful","data-dir":"/tmp/TestReproduce17780628008084/002"}
    logger.go:146: 2024-04-17T23:55:06.885+0800 INFO    stopped server. {"name": "TestReproduce17780-test-0"}
--- FAIL: TestReproduce17780 (1.81s)
FAIL

resp, err := cli.Get(ctx, fmt.Sprintf("%d", 3))
require.NoError(t, err)
require.True(t, resp.Count == 1)
// The 199/200/201 revision has been deleted. The max rev is 198.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add to show that compaction has actually been persisted.

	resp, err = cli.Get(ctx, fmt.Sprintf("%d", 99))
	require.NoError(t, err)
	require.True(t, resp.Count == 0)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I added it!

@siyuanfoundation
Copy link
Contributor

Great find fuwed!
So the reason this is happening is because

// keys in the range [compacted revision -N, compaction] might all be deleted due to compaction.
// the correct revision should be set to compaction revision in the case, not the largest revision
// we have seen.
if s.currentRev < s.compactMainRev {
s.currentRev = s.compactMainRev
}
, so the compactRev should be persisted in disruption. Something like
https://github.com/etcd-io/etcd/compare/main...siyuanfoundation:etcd:issue/17780?expand=1

@chaochn47
Copy link
Member

chaochn47 commented Apr 17, 2024

One solution I can think of is to add a protection: Never compact the latest revision, no matter it's a tombstone or not.

Just my 2 cents I feel we should point the s.currentRev to the scheduledCompact if s.currentRev is smaller in s.restore() after getting the physical signal that last compaction has resumed successfully.

        if scheduledCompact != 0 {
-               if _, err := s.compactLockfree(scheduledCompact); err != nil {
+               ch, err := s.compactLockfree(scheduledCompact)
+
+               if err != nil {
                        s.lg.Warn("compaction encountered error", zap.Error(err))
+                       return err
+               }
+               <-ch
+               {
+                       s.revMu.Lock()
+                       if s.currentRev < scheduledCompact {
+                               s.currentRev = scheduledCompact
+                       }
+                       s.revMu.Unlock()
                }

Edit: resuming the previous scheduledCompact would fail with mvcc: required revision is a future revision but the idea is trying to restore the right s.currentRev instead of changing the semantics of Compact

@ahrtr
Copy link
Member

ahrtr commented Apr 17, 2024

Just my 2 cents I feel we should point the s.currentRev to the scheduledCompact

Seems like another valid solution.

  • The scheduledCompactionRevision is always persisted (committed) before performing the compaction.
    s.b.ForceCommit()
  • The persisted scheduledCompactRevision should be always >= finishedCompactRevision.

@fuweid fuweid force-pushed the repro-17780 branch 3 times, most recently from 7822a88 to a59d051 Compare April 18, 2024 14:52
@fuweid fuweid changed the title tests/e2e: add new case to reproduce 17780 [RFC] fix revision loss issue caused by compaction - 17780 Apr 18, 2024
@fuweid
Copy link
Member Author

fuweid commented Apr 18, 2024

Sorry for late reply. Try to update the currentRev in lock if currentRev < scheduledCompact, based on @chaochn47 comment. And updated test case with 3 nodes. Without the fix, the last revision is not consistent in the clusters.

server/storage/mvcc/kvstore.go Outdated Show resolved Hide resolved
tests/e2e/reproduce_17780_test.go Outdated Show resolved Hide resolved
tests/e2e/reproduce_17780_test.go Outdated Show resolved Hide resolved
Signed-off-by: Wei Fu <fuweid89@gmail.com>
Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Great work!

Comment on lines +54 to +107
// Revision: 2 -> 8 for new keys
n := compactionBatchLimit - 2
valueSize := 16
for i := 2; i <= n; i++ {
_, err := cli.Put(ctx, fmt.Sprintf("%d", i), stringutil.RandString(uint(valueSize)))
require.NoError(t, err)
}

// Revision: 9 -> 11 for delete keys with compared revision
//
// We need last compaction batch is no-op and all the tombstones should
// be deleted in previous compaction batch. So that we just lost the
// finishedCompactRev after panic.
for i := 9; i <= compactionBatchLimit+1; i++ {
rev := i - 5
key := fmt.Sprintf("%d", rev)

_, err := cli.Delete(ctx, key)
require.NoError(t, err)
}

require.NoError(t, clus.Procs[targetIdx].Failpoints().SetupHTTP(ctx, "compactBeforeSetFinishedCompact", `panic`))

_, err := cli.Compact(ctx, 11, clientv3.WithCompactPhysical())
require.Error(t, err)

require.NoError(t, clus.Procs[targetIdx].Restart(ctx))

// NOTE: We should not decrease the revision if there is no record
// about finished compact operation.
resp, err := cli.Get(ctx, fmt.Sprintf("%d", n))
require.NoError(t, err)
assert.GreaterOrEqual(t, resp.Header.Revision, int64(11))

// Revision 4 should be deleted by compaction.
resp, err = cli.Get(ctx, fmt.Sprintf("%d", 4))
require.NoError(t, err)
require.True(t, resp.Count == 0)

next := 20
for i := 12; i <= next; i++ {
_, err := cli.Put(ctx, fmt.Sprintf("%d", i), stringutil.RandString(uint(valueSize)))
require.NoError(t, err)
}

expectedRevision := next
for procIdx, proc := range clus.Procs {
cli = newClient(t, proc.EndpointsGRPC(), e2e.ClientConfig{})
resp, err := cli.Get(ctx, fmt.Sprintf("%d", next))
require.NoError(t, err)

assert.GreaterOrEqual(t, resp.Header.Revision, int64(expectedRevision),
fmt.Sprintf("LeaderIdx: %d, Current: %d", leaderIdx, procIdx))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checking of revisions is a little convoluted, would be good to clean it up to extract the most important properties. Not blocking for merging.

@ahrtr ahrtr merged commit dd4e35a into etcd-io:main Apr 22, 2024
44 checks passed
@ahrtr
Copy link
Member

ahrtr commented Apr 22, 2024

@fuweid can you please backport the PR to 3.5 and 3.4? Thanks

@fuweid fuweid deleted the repro-17780 branch April 23, 2024 03:10
@fuweid
Copy link
Member Author

fuweid commented Apr 23, 2024

@ahrtr will do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release/v3.4 release/v3.5
Development

Successfully merging this pull request may close these issues.

7 participants