inconsistent data in etcd 3.5.4 #14139
Would you mind providing the following info?
(endpoint status table garbled in the copy; only a border row survived)

The k8s resourceVersion is:
Per the information provided by @fanzetian.
The title mentions the etcd version is 3.5.4, but actually it's 3.4.18?
I'm sorry. We first encountered the data inconsistency problem on etcd 3.5.0. When I saw #13908, I thought the problem had been fixed, so we changed the etcd version to 3.5.4, but the problem still happens. We also tested etcd 3.4.18, and the data inconsistency happens there too, so I don't know which version of etcd I can use, or whether our testing method is wrong. We are reproducing the problem on etcd 3.5.4, and I will upload the information once we reproduce it; please wait a moment.
Thanks. Please also provide the configurations (command-line parameters) for the etcd members.
Endpoint status info: sometimes a follower's raftIndex is bigger than the leader's. With kubectl get pod, the pod's resourceVersion is not monotonically increasing:

[root@master3 home]# kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml | grep -i resourcev

etcd config:
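The checks described above can be sketched as follows. This is a hypothetical example: the endpoint addresses and certificate paths are assumptions, not taken from the report.

```shell
#!/bin/sh
# Compare raftIndex across the three members; a follower's raft index
# persistently ahead of the leader's is a sign of divergence.
# Endpoints and cert paths below are placeholders.
ETCDCTL_API=3 etcdctl \
  --endpoints=https://10.0.0.1:2379,https://10.0.0.2:2379,https://10.0.0.3:2379 \
  --cacert=/etc/etcd/ca.crt --cert=/etc/etcd/etcd.crt --key=/etc/etcd/etcd.key \
  endpoint status -w table

# Re-read the same pod a few times; its resourceVersion should never
# move backwards between successive reads.
kubectl get po -n monitor magiccube-prometheus-node-exporter-zt25k -oyaml | grep -i resourcev
```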
Can you give us some more information? Regarding the corruption, could you enable the corruption detection mechanisms? Just add the flags
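The flag names did not survive the copy, but etcd 3.5 ships experimental corruption-check flags that match this request; the sketch below assumes those are the ones meant.

```shell
# Assumed flags (available in etcd 3.5): check data integrity at startup
# and periodically while running. Append to each member's usual invocation.
etcd \
  --experimental-initial-corrupt-check=true \
  --experimental-corrupt-check-time=10m \
  # ...the member's existing flags (name, data-dir, peer/client URLs, etc.)...
```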
/usr/local/bin/etcd --version
Our etcd binary was downloaded from the etcd GitHub releases; the link is: https://github.com/etcd-io/etcd/releases/download/v3.5.4/etcd-v3.5.4-linux-amd64.tar.gz
Based on the log, it's the official 3.5.4 build. @serathius

Thanks @fanzetian for the feedback. The db file is somehow corrupted. See below,
What's the filesystem you are using? I suspect it's a filesystem issue. Note that we have received similar filesystem-related issues; refer to issuecomment-1162774092 and issues/13406. The three members do have inconsistent data,
Could you double-confirm whether the
One more question: can you easily reproduce this issue when rebooting all 3 members at the same time? I tried a couple of times but couldn't reproduce it. The traffic in my environment may not be as big as yours, though.
@ahrtr @serathius Thank you very much, we have found the root cause of the problem. As you suggested, it really is due to our filesystem (we use the xfs filesystem). Our filesystem was damaged when the node rebooted. The problem no longer happens after we repaired the xfs filesystem. So it's not an etcd problem. Thanks a lot again; with your help we solved a problem that troubled me for many days. @ahrtr @serathius Since it's not an etcd problem, can I close the issue?
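For reference, a repair of a damaged XFS volume like the one described above might look like this. The device and mount point are assumptions; the filesystem must be unmounted before running xfs_repair.

```shell
#!/bin/sh
# Hypothetical device/mount point holding the etcd data dir.
DEV=/dev/sdb1
MNT=/var/lib/etcd

umount "$MNT"
xfs_repair -n "$DEV"   # -n: dry run, report inconsistencies only
xfs_repair "$DEV"      # actually repair the filesystem
mount "$DEV" "$MNT"
```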
Great. Feel free to raise a new issue if you run into this issue again. |
What happened?
I encountered a serious data inconsistency problem when using etcd 3.5.4. I have a three-node etcd cluster, and the problem happens occasionally when the three nodes reboot at the same time. In CHANGELOG-3.5 I saw that the data inconsistency problem was fixed in pull/13908, but I still encounter it.
What did you expect to happen?
There should be no data inconsistency when the three nodes reboot at the same time.
How can we reproduce it (as minimally and precisely as possible)?
The problem may happen when the three nodes of the etcd cluster reboot at the same time.
Anything else we need to know?
No response
Etcd version (please run commands below)
Etcd configuration (command line flags or environment variables)
paste your configuration here
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
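The issue template's debug commands did not survive the copy; the standard etcdctl diagnostics for this template look roughly like this (endpoint discovery via `--cluster` assumes client URLs are reachable from where you run it):

```shell
#!/bin/sh
# Cluster membership, per-member status (version, db size, raft index/term),
# and health, each rendered as a table where supported.
etcdctl member list -w table
etcdctl endpoint status --cluster -w table
etcdctl endpoint health --cluster
```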
Relevant log output
No response