raft: lost raft log after restart #10951

Closed
nolouch opened this issue Jul 29, 2019 · 4 comments

@nolouch
Contributor

nolouch commented Jul 29, 2019

The server was down for a while and the machine was then restarted. After the restart, etcd panicked on startup (see the logs below). The data disk is mounted with nodelalloc.
fstab:

UUID=34e9f3c4-bf7e-4285-b308-e32d44a04547   /   xfs defaults    0   0
/dev/vdc1   /opt    ext4    nodelalloc,noatime  2   2

etcd version:
commit hash : 2b3aa7e

logs:

[2019/07/26 17:04:13.836 +08:00] [INFO] [server.go:145] ["start embed etcd"]
[2019/07/26 17:04:13.836 +08:00] [INFO] [systime_mon.go:25] ["start system time monitor"]
2019/07/26 17:04:13.836 log.go:90: [info] embed: [pprof is enabled under /debug/pprof]
2019/07/26 17:04:13.839 log.go:90: [info] etcdserver: [recovered store from snapshot at index 2800028]
2019/07/26 17:04:13.839 log.go:90: [info] mvcc: [restore compact to 2833936]
2019/07/26 17:04:13.843 log.go:90: [info] etcdserver: [name = pd_gh-blade-bike-mbkmola-root01]
2019/07/26 17:04:13.843 log.go:90: [info] etcdserver: [data dir = /opt/tidb/deploy/data.pd]
2019/07/26 17:04:13.843 log.go:90: [info] etcdserver: [member dir = /opt/tidb/deploy/data.pd/member]
2019/07/26 17:04:13.843 log.go:90: [info] etcdserver: [heartbeat = 500ms]
2019/07/26 17:04:13.843 log.go:90: [info] etcdserver: [election = 3000ms]
2019/07/26 17:04:13.843 log.go:90: [info] etcdserver: [snapshot count = 100000]
2019/07/26 17:04:13.843 log.go:90: [info] etcdserver: [advertise client URLs = http://10.22.84.55:2379]
2019/07/26 17:04:14.200 log.go:90: [info] etcdserver: [restarting member c1f3e6aabe11d1a5 in cluster 55a5b307da3bd315 at commit index 2838365]
2019/07/26 17:04:14.202 raft.go:656: [info] c1f3e6aabe11d1a5 became follower at term 2
2019/07/26 17:04:14.202 raft.go:364: [info] newRaft c1f3e6aabe11d1a5 [peers: [1b4993502b2818d1,236037faf02fe450,c1f3e6aabe11d1a5], term: 2, commit: 2838365, applied: 2800028, lastindex: 2838366, lastterm: 2]
2019/07/26 17:04:14.202 log.go:90: [info] etcdserver/api: [enabled capabilities for version 3.3]
2019/07/26 17:04:14.202 log.go:90: [info] etcdserver/membership: [added member 1b4993502b2818d1 [http://10.4.136.54:2380] to cluster 55a5b307da3bd315 from store]
2019/07/26 17:04:14.202 log.go:90: [info] etcdserver/membership: [added member 236037faf02fe450 [http://10.46.72.2:2380] to cluster 55a5b307da3bd315 from store]
2019/07/26 17:04:14.202 log.go:90: [info] etcdserver/membership: [added member c1f3e6aabe11d1a5 [http://10.22.84.55:2380] to cluster 55a5b307da3bd315 from store]
2019/07/26 17:04:14.202 log.go:90: [info] etcdserver/membership: [set the cluster version to 3.3 from store]
2019/07/26 17:04:14.204 log.go:90: [info] mvcc: [restore compact to 2833936]
2019/07/26 17:04:14.206 log.go:86: [warning] auth: [simple token is not cryptographically signed]
2019/07/26 17:04:14.207 log.go:90: [info] rafthttp: [starting peer 1b4993502b2818d1...]
2019/07/26 17:04:14.207 log.go:90: [info] rafthttp: [started HTTP pipelining with peer 1b4993502b2818d1]
2019/07/26 17:04:14.209 log.go:90: [info] rafthttp: [started streaming with peer 1b4993502b2818d1 (writer)]
2019/07/26 17:04:14.210 log.go:90: [info] rafthttp: [started streaming with peer 1b4993502b2818d1 (writer)]
2019/07/26 17:04:14.212 log.go:90: [info] rafthttp: [started peer 1b4993502b2818d1]
2019/07/26 17:04:14.212 log.go:90: [info] rafthttp: [started streaming with peer 1b4993502b2818d1 (stream MsgApp v2 reader)]
2019/07/26 17:04:14.212 log.go:90: [info] rafthttp: [started streaming with peer 1b4993502b2818d1 (stream Message reader)]
2019/07/26 17:04:14.212 log.go:90: [info] rafthttp: [added peer 1b4993502b2818d1]
2019/07/26 17:04:14.212 log.go:90: [info] rafthttp: [starting peer 236037faf02fe450...]
2019/07/26 17:04:14.212 log.go:90: [info] rafthttp: [started HTTP pipelining with peer 236037faf02fe450]
2019/07/26 17:04:14.213 log.go:90: [info] rafthttp: [started streaming with peer 236037faf02fe450 (writer)]
2019/07/26 17:04:14.213 log.go:90: [info] rafthttp: [started streaming with peer 236037faf02fe450 (writer)]
2019/07/26 17:04:14.215 log.go:90: [info] rafthttp: [started peer 236037faf02fe450]
2019/07/26 17:04:14.215 log.go:90: [info] rafthttp: [started streaming with peer 236037faf02fe450 (stream MsgApp v2 reader)]
2019/07/26 17:04:14.215 log.go:90: [info] rafthttp: [started streaming with peer 236037faf02fe450 (stream Message reader)]
2019/07/26 17:04:14.215 log.go:90: [info] rafthttp: [peer 1b4993502b2818d1 became active]
2019/07/26 17:04:14.215 log.go:90: [info] rafthttp: [established a TCP streaming connection with peer 1b4993502b2818d1 (stream Message reader)]
2019/07/26 17:04:14.215 log.go:90: [info] rafthttp: [added peer 236037faf02fe450]
2019/07/26 17:04:14.215 log.go:90: [info] embed: [c1f3e6aabe11d1a5 starting with cors ["*"]]
2019/07/26 17:04:14.215 log.go:90: [info] embed: [c1f3e6aabe11d1a5 starting with host whitelist ["*"]]
2019/07/26 17:04:14.215 log.go:90: [info] etcdserver: [starting server... [version: 3.3.0+git, cluster version: 3.3]]
2019/07/26 17:04:14.215 log.go:90: [info] rafthttp: [established a TCP streaming connection with peer 1b4993502b2818d1 (stream MsgApp v2 reader)]
2019/07/26 17:04:14.217 log.go:90: [info] embed: [listening for peers on  10.22.84.55:2380]
2019/07/26 17:04:14.217 log.go:90: [info] rafthttp: [established a TCP streaming connection with peer 1b4993502b2818d1 (stream Message writer)]
2019/07/26 17:04:14.217 log.go:90: [info] rafthttp: [peer 236037faf02fe450 became active]
2019/07/26 17:04:14.218 log.go:90: [info] rafthttp: [established a TCP streaming connection with peer 236037faf02fe450 (stream Message writer)]
2019/07/26 17:04:14.218 log.go:90: [info] rafthttp: [established a TCP streaming connection with peer 236037faf02fe450 (stream MsgApp v2 writer)]
2019/07/26 17:04:14.218 log.go:90: [info] rafthttp: [established a TCP streaming connection with peer 1b4993502b2818d1 (stream MsgApp v2 writer)]
2019/07/26 17:04:14.219 log.go:90: [info] rafthttp: [established a TCP streaming connection with peer 236037faf02fe450 (stream MsgApp v2 reader)]
2019/07/26 17:04:14.219 log.go:90: [info] rafthttp: [established a TCP streaming connection with peer 236037faf02fe450 (stream Message reader)]
2019/07/26 17:04:14.219 log.go:191: [panic] tocommit(2838367) is out of range [lastIndex(2838366)]. Was the raft log corrupted, truncated, or lost?
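
For readers following the log: the newRaft line reports lastindex 2838366, while the panic reports a requested commit index of 2838367, most likely carried in a message from the leader right after the peer connections came up. The range check behind that panic can be sketched in Go as below. This is a minimal model under stated assumptions: the `raftLog` type here is a toy stand-in, not etcd's actual struct, and it returns an error where the real code panics.

```go
package main

import "fmt"

// raftLog is a hypothetical, stripped-down model of a raft log.
// It keeps only the two indexes that matter for this panic.
type raftLog struct {
	committed uint64 // highest index known to be committed
	lastIndex uint64 // highest index actually present in the local log
}

// commitTo advances the commit index. It returns an error when the
// requested index is beyond the local log, which is the condition the
// panic message in the log above describes.
func (l *raftLog) commitTo(tocommit uint64) error {
	if tocommit <= l.committed {
		return nil // the commit index never moves backwards
	}
	if tocommit > l.lastIndex {
		return fmt.Errorf("tocommit(%d) is out of range [lastIndex(%d)]", tocommit, l.lastIndex)
	}
	l.committed = tocommit
	return nil
}

func main() {
	// Values taken from the newRaft line and the panic line above.
	l := &raftLog{committed: 2838365, lastIndex: 2838366}
	fmt.Println(l.commitTo(2838366)) // within the local log
	fmt.Println(l.commitTo(2838367)) // one entry past the local log
}
```

In this model the second call fails because the node is asked to commit an entry it does not have on disk, which matches the "lost raft log" symptom in the title.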
@jingyih
Contributor

jingyih commented Jul 29, 2019

Is there a way to reproduce this?

To mitigate, remove this member from the cluster, then add a new member with an empty data dir.

@nolouch
Contributor Author

nolouch commented Jul 31, 2019

Unfortunately, I have not found a way to reproduce it.

@stale

stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 6, 2020
@stale stale bot closed this as completed Apr 28, 2020
@ZzzJing

ZzzJing commented May 22, 2020

We hit this problem on v3.4.3. Does anybody know why this happened?
