3.3.21 crashing on every node with "etcdmain: walpb: crc mismatch" #11918
Comments
@Roguelazer Once the WAL file has been purged, whether in an existing cluster or a new cluster, restarting etcd (v3.4.8/v3.3.21) will hit this issue. Please wait for the new release, which will fix this problem. Thanks.
Upgrading from 3.3.19 to 3.3.21, I solved this issue with the following steps, but I don't know why:
@wcollin The WAL file may not have been purged yet; if you restart etcd now, it will also crash. v3.3.22/v3.4.9 are not available yet. @gyuho
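For reference, one way to check whether a member's WAL still decodes cleanly, without restarting etcd, is to replay it with etcd's own wal package. This is a minimal sketch, not something from this thread: it assumes the etcd 3.4 import paths (go.etcd.io/etcd/wal) and a hypothetical data directory of /var/lib/etcd.

```go
// readwal.go: replay a member's WAL read-only; a corrupt WAL fails
// in ReadAll with an error such as "walpb: crc mismatch".
package main

import (
	"fmt"
	"log"
	"path/filepath"

	"go.etcd.io/etcd/wal"
	"go.etcd.io/etcd/wal/walpb"
	"go.uber.org/zap"
)

func main() {
	// Hypothetical data directory; adjust to the node's --data-dir.
	walDir := filepath.Join("/var/lib/etcd", "member", "wal")

	// Open from the zero snapshot so the whole WAL is replayed. On a
	// long-running member whose early WAL files have already been purged,
	// pass the Index/Term of the latest snapshot under member/snap instead.
	w, err := wal.OpenForRead(zap.NewExample(), walDir, walpb.Snapshot{})
	if err != nil {
		log.Fatalf("open wal: %v", err)
	}
	defer w.Close()

	// ReadAll decodes every record and checks each record's CRC.
	meta, state, ents, err := w.ReadAll()
	if err != nil {
		log.Fatalf("read wal: %v", err)
	}
	fmt.Printf("metadata: %d bytes, hard state: %+v, entries: %d\n",
		len(meta), state, len(ents))
}
```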
@tangcong Thanks a lot, I will fall back to 3.3.19 and wait for the new version.
I have verified that 3.3.22 appears to fix this issue. It does now print a bunch of other unexpected errors when restarting a node:
The node comes up successfully, though, and appears to function normally.
@Roguelazer That is normal behavior: etcd prints a warning log when the server fails to apply a request, and it plays a key role in troubleshooting.
I'm getting the same issue. Any news?
Hi @karolsteve - etcd 3.3.21 is a three-and-a-half-year-old release which is no longer supported by the etcd project. Countless bugs and security concerns have been addressed in later releases. Please upgrade to a modern etcd release as soon as possible.
Okay, thanks @jmhbnz
I am attempting a rolling upgrade from 3.3.20 to 3.3.21, and every node has failed with "etcdmain: walpb: crc mismatch". I removed the first node and re-bootstrapped it successfully, and it can stop and start fine on 3.3.21 after re-bootstrapping. But now that the second node in the cluster is crashing on startup with the same error, I'm a little suspicious that this is actually a bug in 3.3.21.
None of these machines have ever suffered any hardware failure or unexpected shutdown, and I have been upgrading them to every 3.3.x patch release without ever seeing this issue before.
These nodes are mostly used with V2-protocol-speaking applications, so almost all the data is in the V2 store. Not sure if that makes a difference.
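For context, re-bootstrapping a member as described above is typically done by removing it from the cluster and re-adding it before restarting etcd on that node with a clean data directory. A minimal sketch using the clientv3 runtime reconfiguration API, again assuming etcd 3.4 import paths; the endpoints, member name, and peer URL below are hypothetical placeholders.

```go
// remove_readd.go: remove a crashed member and re-add it so the node can be
// re-bootstrapped with a clean data directory.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"http://10.0.0.1:2379"}, // a healthy member
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatalf("connect: %v", err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Look up the crashed member's ID by name.
	members, err := cli.MemberList(ctx)
	if err != nil {
		log.Fatalf("member list: %v", err)
	}
	var crashedID uint64
	for _, m := range members.Members {
		if m.Name == "etcd-node-2" { // hypothetical member name
			crashedID = m.ID
		}
	}
	if crashedID == 0 {
		log.Fatal("crashed member not found in member list")
	}

	// Remove it, then re-add it with its peer URL. After this, wipe the
	// node's data directory and start etcd with --initial-cluster-state=existing.
	if _, err := cli.MemberRemove(ctx, crashedID); err != nil {
		log.Fatalf("member remove: %v", err)
	}
	added, err := cli.MemberAdd(ctx, []string{"http://10.0.0.2:2380"})
	if err != nil {
		log.Fatalf("member add: %v", err)
	}
	fmt.Printf("re-added member %x\n", added.Member.ID)
}
```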