Describe the bug
On testing-large-01.aws-eu-central-1a.nimbus.prater, the Nimbus beacon node entered a restart loop, crashing on every start:
{"lvl":"INF","ts":"2022-08-11 10:14:13.534+00:00","msg":"Loading block DAG from database","topics":"beacnde","path":"/data/beacon-node-prater-testing/data/db"}
/data/beacon-node-prater-testing/repo/vendor/nim-libp2p/libp2p/stream/bufferstream.nim(456) main
/data/beacon-node-prater-testing/repo/vendor/nim-libp2p/libp2p/stream/bufferstream.nim(449) NimMain
/data/beacon-node-prater-testing/repo/beacon_chain/nimbus_beacon_node.nim(2177) main
/data/beacon-node-prater-testing/repo/beacon_chain/nimbus_beacon_node.nim(2045) handleStartUpCmd
/data/beacon-node-prater-testing/repo/beacon_chain/nimbus_beacon_node.nim(1863) doRunBeaconNode
/data/beacon-node-prater-testing/repo/beacon_chain/nimbus_beacon_node.nim(646) init
/data/beacon-node-prater-testing/repo/beacon_chain/nimbus_beacon_node.nim(191) loadChainDag
/data/beacon-node-prater-testing/repo/beacon_chain/consensus_object_pools/blockchain_dag.nim(836) init
/data/beacon-node-prater-testing/repo/vendor/nim-stew/stew/results.nim(756) expect
/data/beacon-node-prater-testing/repo/vendor/nim-stew/stew/results.nim(348) raiseResultDefect
Error: unhandled exception: not nil [ResultDefect]
The crash occurs because one block is missing from the DB when the blocks from head back to finalizedHead are loaded.
Examining the logs reveals that the missing block was judged INVALID by the EL. Whether that verdict was caused by the configuration issue described under Additional context is a separate problem.
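To illustrate the failure mode, here is a minimal sketch in Python of walking parent links from head back to finalizedHead. It is a toy model, not the Nimbus code: the dict-based `db`, `load_path`, and `MissingBlockError` are hypothetical names introduced for this example only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy DB: block root -> (slot, parent root). Not the Nimbus schema.
BlockDb = Dict[str, Tuple[int, Optional[str]]]


class MissingBlockError(Exception):
    """A block on the head -> finalizedHead path is absent from the DB."""


def load_path(db: BlockDb, head_root: str, finalized_root: str) -> List[str]:
    """Return the roots from head_root back to finalized_root, inclusive."""
    path: List[str] = []
    cur: Optional[str] = head_root
    while True:
        entry = db.get(cur)
        if entry is None:
            # This is the situation the crash corresponds to: the chain of
            # parent links points at a block that is no longer stored.
            raise MissingBlockError(cur)
        _slot, parent = entry
        path.append(cur)
        if cur == finalized_root:
            return path
        cur = parent
```

With an intact DB the walk returns the full path. If any block between head and finalizedHead has been removed while its descendants remain, the walk cannot continue and fails at that block.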
{"lvl":"DBG","ts":"2022-08-11 09:42:07.883+00:00","msg":"newPayload: succeeded","parentHash":"893a5a27","blockHash":"1642d9bc","blockNumber":7384505,"payloadStatus":2}
{"lvl":"DBG","ts":"2022-08-11 09:42:07.883+00:00","msg":"runQueueProcessingLoop: execution payload invalid","executionPayloadStatus":2,"blck":{"blck":{"slot":3641909,"proposer_index":330973,"parent_root":"c8752b72","state_root":"692dbc94","eth1data":{"deposit_root":"a9256931ff65af92d47c829fced0582b07c055019c468be5c4b20190b3745ee0","deposit_count":169791,"block_hash":"4aecd111f7df76364d8c8cf46917ee938543ebd6f7c3026e1d3c5fecbe412a19"},"graffiti":"teku/v22.7.0","proposer_slashings_len":0,"attester_slashings_len":0,"attestations_len":128,"deposits_len":0,"voluntary_exits_len":0,"sync_committee_participants":391},"signature":"8efaf57a"}}
{"lvl":"DBG","ts":"2022-08-11 09:42:07.883+00:00","msg":"markBlockInvalid","topics":"chaindag","blck":"7b63b0d2:3641909"}
{"lvl":"NOT","ts":"2022-08-11 09:42:07.884+00:00","msg":"Received invalid block","topics":"requman","peer":"16U*Sr8Grn","blocks":"[7b63b0d2, 7b63b0d2, 7b63b0d2, c11444f7, c11444f7, c11444f7]","peer_score":600}
{"lvl":"DBG","ts":"2022-08-11 09:42:07.884+00:00","msg":"Peer was removed from PeerPool due to low score","topics":"beacnde","peer":"16U*Sr8Grn","peer_score":-400,"score_low_limit":0,"score_high_limit":1000}
markBlockInvalid needs to handle this case better: when it happens, it should also delete all descendants of the invalid block from the DB and revert the DAG head back to the parent.
Note that it is probably safe to ignore markBlockInvalid for any slot <= dag.finalizedHead.slot, as DAG updateHead won't allow reverting to anything before that.
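The proposed handling can be sketched as follows. This is a toy model in Python under stated assumptions, not the actual Nimbus implementation: `ToyDag`, `Block`, and `mark_block_invalid` are hypothetical names, and moving the head straight to the invalid block's parent is a simplification (a real client would presumably choose the new head via fork choice).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Block:
    root: str
    slot: int
    parent: Optional[str]


@dataclass
class ToyDag:
    blocks: Dict[str, Block]   # hypothetical stand-in for the block DB
    head: Optional[str]
    finalized_slot: int

    def _descends_from(self, ancestor: str, root: str) -> bool:
        """True if root is ancestor itself or one of its descendants."""
        cur: Optional[str] = root
        while cur is not None and cur in self.blocks:
            if cur == ancestor:
                return True
            cur = self.blocks[cur].parent
        return False

    def mark_block_invalid(self, root: str) -> bool:
        """Drop an invalid block and its descendants; return True if pruned."""
        blk = self.blocks.get(root)
        # Ignore unknown blocks and anything at or before the finalized slot,
        # since the head cannot be reverted past finality anyway.
        if blk is None or blk.slot <= self.finalized_slot:
            return False
        # Collect the whole subtree first so no descendant is left orphaned.
        doomed = [r for r in self.blocks if self._descends_from(root, r)]
        head_affected = self.head in doomed
        for r in doomed:
            del self.blocks[r]
        if head_affected:
            # Simplification: fall back to the invalid block's parent.
            self.head = blk.parent
        return True
```

Collecting the subtree before deleting anything keeps the walk over parent links valid, and sibling branches that do not descend from the invalid block are left untouched.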
Additional context
This was on a server where multiple CLs were configured against the same EL. It is unlikely for a block so deep to be reported as INVALID during normal operation. However, if it happens, it is still a bug that should not corrupt the CL database.
To Reproduce
Steps to reproduce the behavior:
testing-large-01.aws-eu-central-1a.nimbus.prater
v22.7.0-7cac6f-stateofus