
inconsistent data in etcd 3.5.4 #14139

Closed
chaohan1 opened this issue Jun 21, 2022 · 16 comments
Labels
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. stage/tracked type/bug

Comments

@chaohan1

What happened?

I encountered a serious data inconsistency problem when using etcd 3.5.4. I have a three-node etcd cluster, and the problem occasionally happens when all three nodes reboot at the same time. In CHANGELOG-3.5 I see that a data inconsistency problem was fixed in pull/13908, but I still run into inconsistent data.

What did you expect to happen?

The data should stay consistent when all three nodes reboot at the same time.

How can we reproduce it (as minimally and precisely as possible)?

The problem may happen when the three nodes of the etcd cluster reboot at the same time.

Anything else we need to know?

No response

Etcd version (please run commands below)

$ etcd --version
# paste output here

$ etcdctl version
# paste output here

Etcd configuration (command line flags or environment variables)

paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here

Relevant log output

No response

@ahrtr
Member

ahrtr commented Jun 21, 2022

Would you mind providing the following info?

  1. The complete log of all three members;
  2. etcdctl endpoint status -w json --cluster;
  3. It would be helpful if you could provide a zip package of the data directories of all members, or check the data of all three members yourself using etcd-dump-logs (see the sketch after this list);
  4. How often do you see this issue? Can you easily reproduce this problem?
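For reference, a rough sketch of how this information could be collected on each member; the data-directory path below is a placeholder, so adjust it to your deployment:

# cluster-wide status in JSON (run from any host that can reach the cluster)
$ etcdctl --endpoints=<member list> endpoint status -w json --cluster

# dump the WAL entries of one member with etcd-dump-logs (built from etcd's tools/etcd-dump-logs);
# the argument is that member's data directory
$ etcd-dump-logs /var/lib/etcd

# package one member's data directory for sharing (stop the member first)
$ tar czf etcd1-data.tar.gz /var/lib/etcd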

@ahrtr ahrtr added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. stage/tracked labels Jun 21, 2022
@fanzetian

+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://10.37.20.20:2379 | 24c5cf6cc2f1d404 |  3.4.18 |  195 MB |     false |      false |         3 |    1688742 |            1688742 |        |
| http://10.37.20.18:2379 | 51b11170a6554c23 |  3.4.18 |  273 MB |     false |      false |         3 |    1688742 |            1688742 |        |
| http://10.37.20.19:2379 | 666fa0252211b439 |  3.4.18 |  195 MB |      true |      false |         3 |    1688742 |            1688742 |        |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

@fanzetian

The k8s resourceVersion of the same Pod flips between two values on repeated reads:
[root@master1 data]# kubectl get po -n monitor magiccube-prometheus-node-exporter-fkpdd -oyaml|grep -i resourcev
resourceVersion: "25520"
[root@master1 data]# kubectl get po -n monitor magiccube-prometheus-node-exporter-fkpdd -oyaml|grep -i resourcev
resourceVersion: "25520"
[root@master1 data]# kubectl get po -n monitor magiccube-prometheus-node-exporter-fkpdd -oyaml|grep -i resourcev
resourceVersion: "25067"
[root@master1 data]# kubectl get po -n monitor magiccube-prometheus-node-exporter-fkpdd -oyaml|grep -i resourcev
resourceVersion: "25067"
[root@master1 data]# kubectl get po -n monitor magiccube-prometheus-node-exporter-fkpdd -oyaml|grep -i resourcev
resourceVersion: "25520"

@chaohan1
Author

Would you mind providing the following info?

  1. The complete log of all three members;
  2. etcdctl endpoint status -w json --cluster;
  3. It would be helpful if you could provide a zip package of the data directories of all members, or check the data of all three members yourself using etcd-dump-logs;
  4. How often do you see this issue? Can you easily reproduce this problem?

Please see the information provided by @fanzetian.

  1. How often do you see this issue? Can you easily reproduce this problem?
    The problem happens in our k8s cluster when the three etcd nodes reboot at the same time; afterwards one etcd node's data is inconsistent with the other two. If you want to reproduce the problem, you can try the following (a rough shell sketch follows this list):
    1. Increase the load on kube-apiserver, e.g. by adding or deleting lots of Pods;
    2. Reboot the three etcd nodes at the same time;
    3. The problem may then happen;
    4. My k8s version is 1.19.3.
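A rough shell sketch of the steps above; the deployment name, image, and node hostnames are placeholders, not from our actual environment:

# 1. generate write load on kube-apiserver/etcd, e.g. by scaling a throwaway deployment up and down
$ kubectl create deployment load-test --image=k8s.gcr.io/pause:3.2
$ kubectl scale deployment load-test --replicas=200
$ kubectl delete deployment load-test

# 2. reboot all three etcd nodes at roughly the same time
$ for node in etcd-node1 etcd-node2 etcd-node3; do ssh "$node" reboot & done; wait

# 3. after the cluster comes back, compare revisions/indexes across members
$ etcdctl --endpoints=<member list> endpoint status -w table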

@ahrtr
Member

ahrtr commented Jun 22, 2022

The title mentions the etcd version is 3.5.4, but actually it's 3.4.18?

@chaohan1
Author

chaohan1 commented Jun 22, 2022

The title mentions the etcd version is 3.5.4, but actually it's 3.4.18?

I'm sorry.

We first encountered the data inconsistency problem on etcd 3.5.0. When I saw #13908 I thought the problem had been fixed, so we upgraded etcd to 3.5.4, but the problem still happens.

We also tested etcd 3.4.18, and the data inconsistency happened there too, so I don't know which version of etcd I can use, or whether something is wrong with our testing method.

We are reproducing the problem on etcd 3.5.4, and I will upload the information once we reproduce it. Please wait a moment.

@ahrtr
Member

ahrtr commented Jun 22, 2022

We are reproducing the problem on etcd 3.5.4, and I will upload the information once we reproduce it. Please wait a moment.

Thanks.

Please also provide the configurations (command-line parameters) for the etcd members.

@fanzetian

fanzetian commented Jun 22, 2022

We are reproducing the problem on etcd 3.5.4, and I will upload the information once we reproduce it. Please wait a moment.

Thanks.

Please also provide the configurations (command-line parameters) for the etcd members.

@ahrtr

endpoint status info; sometimes the follower's raftIndex is bigger than the leader's:

(screenshot: etcd_cluster)

kubectl get pod: the Pod's resourceVersion is not monotonically increasing

[root@master3 home]# kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml |grep -i resourcev
resourceVersion: "81888"
[root@master3 home]# kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml |grep -i resourcev
resourceVersion: "81672"
[root@master3 home]# kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml |grep -i resourcev
resourceVersion: "81888"
[root@master3 home]# kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml |grep -i resourcev
resourceVersion: "81888"
[root@master3 home]# kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml |grep -i resourcev
resourceVersion: "81888"
[root@master3 home]# kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml |grep -i resourcev
resourceVersion: "81888"
[root@master3 home]# kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml |grep -i resourcev
resourceVersion: "81672"

etcd config: (screenshot)

etcd data and log:
https://wx.mail.qq.com/ftn/download?func=3&key=cfc86c37a8371776fdbb1d3761323763a793d63c63323763151855560104070607031e040601544e040252554e0b0e50561a0206550451570002065400043963554350534e0119561e031d4d0a4223c6cb10ca04587e24fa1bfd633fbf51f18f3c0c25&code=0737c27c&k=cfc86c37a8371776fdbb1d3761323763a793d63c63323763151855560104070607031e040601544e040252554e0b0e50561a0206550451570002065400043963554350534e0119561e031d4d0a4223c6cb10ca04587e24fa1bfd633fbf51f18f3c0c25&fweb=1&cl=1

@serathius
Member

Can you give us some information about the registry.dahuatech.com/cube/cube-etcd:3.5.4 image? Is it the packaged etcd binary, a custom build, or a fork?

Can you run docker run registry.dahuatech.com/cube/cube-etcd:3.5.4 /usr/local/bin/etcd --version and give us the result?

On the corruption, could you enable the corruption detection mechanisms? Just add the flags --experimental-initial-corrupt-check=true and --experimental-corrupt-check-time=5m.
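For example, a minimal sketch of adding the two flags to an existing etcd invocation (only the new flags are shown; keep your current ones unchanged):

$ etcd --experimental-initial-corrupt-check=true \
       --experimental-corrupt-check-time=5m \
       <existing flags...>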

@fanzetian

Can you give us some information about the registry.dahuatech.com/cube/cube-etcd:3.5.4 image? Is it the packaged etcd binary, a custom build, or a fork?

Can you run docker run registry.dahuatech.com/cube/cube-etcd:3.5.4 /usr/local/bin/etcd --version and give us the result?

On the corruption, could you enable the corruption detection mechanisms? Just add the flags --experimental-initial-corrupt-check=true and --experimental-corrupt-check-time=5m.

/usr/local/bin/etcd --version
etcd Version: 3.5.4
Git SHA: 08407ff
Go Version: go1.16.15
Go OS/Arch: linux/amd64

@chaohan1
Author

Can you give us some information about the registry.dahuatech.com/cube/cube-etcd:3.5.4 image? Is it the packaged etcd binary, a custom build, or a fork?

Can you run docker run registry.dahuatech.com/cube/cube-etcd:3.5.4 /usr/local/bin/etcd --version and give us the result?

On the corruption, could you enable the corruption detection mechanisms? Just add the flags --experimental-initial-corrupt-check=true and --experimental-corrupt-check-time=5m.


/usr/local/bin/etcd --version
etcd Version: 3.5.4
Git SHA: 08407ff
Go Version: go1.16.15
Go OS/Arch: linux/amd64

Our etcd binary was downloaded from the etcd GitHub releases; the link is: https://github.com/etcd-io/etcd/releases/download/v3.5.4/etcd-v3.5.4-linux-amd64.tar.gz
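In case it helps rule out a corrupted download, a hedged sketch of verifying the tarball against the checksums published with the release (assuming the SHA256SUMS asset is present, as it is for recent etcd releases):

$ curl -LO https://github.com/etcd-io/etcd/releases/download/v3.5.4/etcd-v3.5.4-linux-amd64.tar.gz
$ curl -LO https://github.com/etcd-io/etcd/releases/download/v3.5.4/SHA256SUMS
$ sha256sum --check --ignore-missing SHA256SUMS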

@ahrtr
Member

ahrtr commented Jun 22, 2022

{"level":"info","ts":"2022-06-22T16:01:42.030+0800","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.4","git-sha":"08407ff76","go-version":"go1.16.15" ...}

Based on the log, it's the official 3.5.4 build. @serathius

Thanks @fanzetian for the feedback. The db file is somehow corrupted. See below:

$ ./etcd-dump-db iterate-bucket ~/Downloads/etcd-3.5.4/etcd1/data/member/snap/db  meta --decode
panic: freepages: failed to get all reachable pages (page 3906931167209480566: out of bounds: 34529)

goroutine 41 [running]:
go.etcd.io/bbolt.(*DB).freepages.func2()
	/Users/wachao/go/gopath/pkg/mod/go.etcd.io/bbolt@v1.3.6/db.go:1056 +0x99
created by go.etcd.io/bbolt.(*DB).freepages
	/Users/wachao/go/gopath/pkg/mod/go.etcd.io/bbolt@v1.3.6/db.go:1054 +0x1f6

What filesystem are you using? I suspect it's a filesystem issue. Note that we have received similar filesystem-related reports before; refer to issuecomment-1162774092 and issues/13406.
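For anyone hitting the same panic, one way to double-check the boltdb file's integrity is the bbolt CLI; the db path below is a placeholder:

# install the bbolt command-line tool and run a consistency check on the db file
$ go install go.etcd.io/bbolt/cmd/bbolt@latest
$ bbolt check /path/to/data/member/snap/db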

The three members do have inconsistent data,

# etcd1
nodeID=629e9613e9db0cf2 clusterID=f7fb4893fb1b1d72 term=3 commitIndex=119019 vote=638c542f0bcf841

# etcd2
nodeID=2abe9d5813f1b070 clusterID=f7fb4893fb1b1d72 term=3 commitIndex=120703 vote=0

# etcd3
nodeID=638c542f0bcf841 clusterID=f7fb4893fb1b1d72 term=3 commitIndex=121972 vote=638c542f0bcf841

Could you double-confirm whether the revision, raftIndex and raftAppliedIndex returned by etcdctl endpoint status -w json --cluster converge to the same values on all members after stopping the test traffic?
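A small sketch of extracting just those fields with jq (the field names match the JSON printed by etcd 3.5; adjust if your output differs):

$ etcdctl --endpoints=<member list> endpoint status -w json --cluster \
    | jq -r '.[] | [.Endpoint, .Status.header.revision, .Status.raftIndex, .Status.raftAppliedIndex] | @tsv'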

@ahrtr
Member

ahrtr commented Jun 22, 2022

One more question, can you easily reproduce this issue when rebooting all 3 members at the same time?

I tried a couple of times, but couldn't reproduce this issue. But the traffic in my environment may not be as heavy as yours.

@chaohan1
Author

One more question, can you easily reproduce this issue when rebooting all 3 members at the same time?

I tried a couple of times, but couldn't reproduce this issue. But the traffic in my environment may not be as heavy as yours.

@ahrtr @serathius Thank you very much; we have found the root cause of the problem. As you suggested, it is indeed due to our filesystem (we use xfs). Our filesystem was damaged when the nodes rebooted, and the problem no longer happens after we repaired the xfs filesystem. So it's not an etcd problem.
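For anyone else who lands here, a rough sketch of checking and repairing an xfs volume; the device and mount point are placeholders, and the filesystem must be unmounted first:

# dry-run check first (reports problems without modifying anything)
$ umount /var/lib/etcd
$ xfs_repair -n /dev/sdb1

# actual repair if the dry run reports problems, then remount
$ xfs_repair /dev/sdb1
$ mount /dev/sdb1 /var/lib/etcd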

Thanks a lot again; with your help we solved a problem that had troubled me for many days. @ahrtr @serathius

Since it's not an etcd problem, can I close the issue?

@ahrtr
Member

ahrtr commented Jun 22, 2022

Great. Feel free to raise a new issue if you run into this issue again.

@ahrtr ahrtr closed this as completed Jun 22, 2022