This repository has been archived by the owner on Feb 9, 2024. It is now read-only.
etcd panics with corrupted raft log errors #1645
Labels
kind/bug: Something isn't working
port/5.5: Requires port to version/5.5.x
port/6.1: Requires port to version/6.1.x
port/7.0: Requires port to version/7.0.x
priority/1: Medium priority
support-load: Mark issues that increase support load
Describe the bug
We've seen etcd fail to start on a master node several times with the following panic:
The only way out of this situation is to rebuild the faulty etcd member as described in Red Hat's KB article: https://access.redhat.com/solutions/4145881.
The issue was observed on etcd v3.3.11 (Gravity 5.5.40) and v3.3.20.
There are also related issues on the etcd GitHub issue tracker, but none of them seems to provide a resolution or root cause analysis, e.g. etcd-io/etcd#10951 or etcd-io/etcd#10817 (the latter seems to indicate the issue can be triggered by a reboot).
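For reference, rebuilding a faulty member generally means removing it from the cluster, discarding its data directory, and adding it back as a new member. The sketch below only prints the command sequence; it does not run anything. The function name, member ID, node name, peer URL, and data directory are hypothetical placeholders, and the endpoint and TLS flags a real cluster needs are omitted. It is not taken from the KB article and does not account for how Gravity runs etcd.

```python
import shlex


def rebuild_member_plan(member_id, name, peer_url, data_dir):
    """Return the ordered commands for replacing one etcd member.

    Every value is supplied by the caller; nothing here is specific
    to a Gravity cluster layout (assumption: plain etcdctl v3 usage).
    """
    etcdctl = ["env", "ETCDCTL_API=3", "etcdctl"]
    return [
        # On a healthy member: look up the ID of the faulty member.
        etcdctl + ["member", "list"],
        # On a healthy member: remove the faulty member from the cluster.
        etcdctl + ["member", "remove", member_id],
        # On the faulty node, with etcd stopped: move the corrupted data aside.
        ["mv", data_dir, data_dir + ".corrupt"],
        # On a healthy member: register the node again as a new member.
        etcdctl + ["member", "add", name, "--peer-urls=" + peer_url],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    plan = rebuild_member_plan(
        "8e9e05c52164694d", "node-2", "https://10.0.0.2:2380", "/var/lib/etcd"
    )
    for cmd in plan:
        print(shlex.join(cmd))
```

After the last step, etcd on the rebuilt node is started with `--initial-cluster-state=existing` so that it joins the running cluster instead of bootstrapping a new one.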
To Reproduce
Unclear; it is intermittent and seems to happen under a couple of circumstances:
We should try to reproduce it on 5.5.40.
Expected behavior
Etcd doesn't panic.
Logs
Environment (please complete the following information):
Additional context