This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Corruption: block checksum mismatch / during sync. #7766

Closed
thebalaa opened this issue Jan 31, 2018 · 17 comments
Labels
M4-core ⛓ Core client code / Rust. Z7-duplicate 🖨 Issue is a duplicate. Closer should comment with a link to the duplicate.

Comments

@thebalaa

I'm running:

  • Which Parity version?: 1.8.7 / 1.9.0
  • Which operating system?: Linux
  • How installed?: Installer
  • Are you fully synchronized?: no
  • Which network are you connected to?: ethereum
  • Did you try to restart the node?: yes

Starting from a clean slate on both the latest stable and unstable versions (1.8.7 / 1.9.0), the following error keeps occurring, at a different block height each time. Could this be faulty hardware?

2018-01-31 00:37:09  DB corrupted: Corruption: block checksum mismatch: expected 253734433, got 2018439782  in /home/balaa/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d/overlayrecent/db/282633.sst offset 11034697 size 596098. Repair will be triggered on next restart

====================

stack backtrace:
   0:     0x5617bafb2e0c - <no info>

Thread 'IO Worker #0' panicked at 'DB flush failed.: "Corruption: block checksum mismatch: expected 253734433, got 2018439782  in /home/balaa/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d/overlayrecent/db/282633.sst offset 11034697 size 596098"', /checkout/src/libcore/result.rs:906

This is a bug. Please report it at:

    https://github.com/paritytech/parity/issues/new

Aborted (core dumped)
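For context on what this error means: RocksDB stores a checksum next to every block it writes into an .sst file and re-verifies it on read, so a mismatch means the bytes read back from disk no longer hash to the value that was recorded when they were written. A rough sketch of that check (illustrative only: it uses plain CRC32 via the `crc32fast` crate, whereas RocksDB actually uses CRC32C with an extra masking step):

```rust
use crc32fast::Hasher;

fn checksum(bytes: &[u8]) -> u32 {
    let mut hasher = Hasher::new();
    hasher.update(bytes);
    hasher.finalize()
}

// The shape of the check behind "block checksum mismatch: expected X, got Y".
fn verify_block(block: &[u8], stored: u32) -> Result<(), String> {
    let actual = checksum(block);
    if actual != stored {
        return Err(format!(
            "Corruption: block checksum mismatch: expected {}, got {}",
            stored, actual
        ));
    }
    Ok(())
}

fn main() {
    let block = b"sst block payload".to_vec();
    let stored = checksum(&block);
    assert!(verify_block(&block, stored).is_ok());

    // Flip one bit to simulate corruption (bad sector, torn write, faulty RAM):
    // verification then fails exactly like the log above.
    let mut corrupted = block.clone();
    corrupted[0] ^= 0x01;
    assert!(verify_block(&corrupted, stored).is_err());
}
```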
@5chdn 5chdn added F2-bug 🐞 The client fails to follow expected behavior. P2-asap 🌊 No need to stop dead in your tracks, however issue should be addressed as soon as possible. M4-core ⛓ Core client code / Rust. labels Jan 31, 2018
@5chdn 5chdn added this to the 1.10 milestone Jan 31, 2018
@5chdn
Contributor

5chdn commented Jan 31, 2018

cc @andresilva this happens during sync

follow-up on #7334 cc @DeviateFish-2

also #7748

@DeviateFish-2

DeviateFish-2 commented Feb 1, 2018

Another sample for the pile (running v1.9.0):

...
2018-01-29 22:38:23  Syncing #1464212 2180…d05a   319 blk/s 1816 tx/s  57 Mgas/s    142+ 4947 Qed  #1469311   22/25 peers     77 MiB chain   54 MiB db   42 MiB queue    8 MiB sync  RPC:  0 conn,  0 req/s,   0 µs
2018-01-29 22:38:33  Syncing #1465618 ad8b…9110   140 blk/s 1268 tx/s  83 Mgas/s   1126+ 5452 Qed  #1472200   22/25 peers     74 MiB chain   54 MiB db   43 MiB queue    9 MiB sync  RPC:  0 conn,  0 req/s,   0 µs
2018-01-29 22:38:43  Syncing #1468641 5d43…260c   302 blk/s 1736 tx/s  59 Mgas/s      0+ 3749 Qed  #1472391   22/25 peers     53 MiB chain   54 MiB db   28 MiB queue   13 MiB sync  RPC:  0 conn,  0 req/s,   0 µs
2018-01-29 22:38:53  Syncing #1470915 640a…7b17   228 blk/s 1617 tx/s  45 Mgas/s    780+ 5759 Qed  #1477472   21/25 peers     73 MiB chain   54 MiB db   41 MiB queue    8 MiB sync  RPC:  0 conn,  0 req/s,   0 µs

====================

stack backtrace:
   0:     0x55b5b84ff95c - backtrace::backtrace::trace::h88dff4dc401d81d6
   1:     0x55b5b84ff992 - backtrace::capture::Backtrace::new::hc1bdbce336b16eca
   2:     0x55b5b799fb49 - panic_hook::panic_hook::ha4f6f84d07d9cbbd

Thread 'IO Worker #2' panicked at 'DB flush failed.: Error(Msg("Corruption: block checksum mismatch: expected 3482696050, got 3888739091  in parity/chains/ethereum/db/906a34e69aec8c0d/archive/db/011705.sst offset 35210665 size 16261"), State { next_error: None, backtrace: None })', /checkout/src/libcore/result.rs:906

This is a bug. Please report it at:

    https://github.com/paritytech/parity/issues/new


====================

stack backtrace:

$ parity
Loading config file from /etc/parity/config.toml
2018-01-31 20:32:10  Starting Parity/v1.9.0-unstable-53ec114-20180125/x86_64-linux-gnu/rustc1.23.0
2018-01-31 20:32:10  Keys path parity/keys/Foundation
2018-01-31 20:32:10  DB path parity/chains/ethereum/db/906a34e69aec8c0d
2018-01-31 20:32:10  Path to dapps parity/dapps
2018-01-31 20:32:10  State DB configuration: archive +Fat +Trace
2018-01-31 20:32:10  Operating mode: active
2018-01-31 20:32:10  Configured for Foundation using Ethash engine
2018-01-31 20:32:10  Updated conversion rate to Ξ1 = US$1131.33 (105228024 wei/gas)

====================

stack backtrace:
   0:     0x5594fc78295c - backtrace::backtrace::trace::h88dff4dc401d81d6
   1:     0x5594fc782992 - backtrace::capture::Backtrace::new::hc1bdbce336b16eca
   2:     0x5594fbc22b49 - panic_hook::panic_hook::ha4f6f84d07d9cbbd

Thread 'main' panicked at 'failed to update version: Error(Msg("Corruption: block checksum mismatch: expected 3482696050, got 3888739091  in parity/chains/ethereum/db/906a34e69aec8c0d/archive/db/011705.sst offset 35210665 size 16261"), State { next_error: None, backtrace: None })', /checkout/src/libcore/result.rs:906

This is a bug. Please report it at:

    https://github.com/paritytech/parity/issues/new

$

(For the record, the missing second stack trace for the initial crash is not a mistake; no stack trace was produced.)

Again, to reiterate: this is happening during a sync (a full archive sync in my case, as seen in the output when restarting), and without any input at all. This is not the result of shutting down parity while it is syncing (inadvertently or otherwise). The corruption is happening during the sync process, and it is causing parity to exit.

I'm capturing a full log right now, and will update this comment with it when it crashes.

As an aside... why does parity log to stderr?

[Edit] Here's a full log:
sync005.log

@andresilva
Contributor

It is possible that not closing RocksDB properly on shutdown could lead to some silent corruption: if there's no crash at shutdown, you'll only see that corruption whenever RocksDB has to write to that block in the future (which might be the case here). This is my best explanation so far. We have a fix for RocksDB not being closed properly on shutdown, which will be out in the next release, and I'd like to see whether these corruption issues disappear or become less frequent.
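A minimal sketch of what "closed properly" means on the Rust side, using the plain `rocksdb` crate rather than Parity's kvdb-rocksdb wrapper (the path and calls are illustrative, not Parity's actual shutdown code):

```rust
use rocksdb::{Options, DB};

fn main() {
    // Hypothetical path, for illustration only.
    let path = "/tmp/example-db";
    let mut opts = Options::default();
    opts.create_if_missing(true);

    {
        let db = DB::open(&opts, path).expect("open failed");
        db.put(b"key", b"value").expect("write failed");
        // When `db` goes out of scope its destructor runs and the database is
        // closed cleanly: background work is stopped and file handles released.
    }

    // By contrast, anything that ends the process while a DB handle is still
    // live -- std::process::exit(), an abort, a hard kill -- skips destructors,
    // which is the "unclean shutdown" being described above.
}
```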

@DeviateFish-2

Why would that be the case here? The corruption is what's causing the shutdown of parity in these cases, not the other way around. Look at the logs: I'm not stopping parity and then encountering corruption on restart. Parity is crashing due to corruption.

I've done what I can to rule out hardware issues, but of course cannot completely rule them out. However, this seems to be a relatively frequent occurrence: #7334 was originally opened as a report of this behavior, and many of the issues closed as duplicates of it are also instances of crashes during initial sync.

These aren't cases where someone or something is forcibly terminating parity, and thus causing corruption due to an unclean shutdown. These are cases where parity itself is crashing, presumably due to corruption.

@andresilva
Contributor

Maybe I didn't explain myself properly. Every single shutdown of parity until #7695 was an unclean shutdown, regardless of whether you saw a crash or not; RocksDB would not be closed properly. The error you're seeing doesn't mean the database is being corrupted during sync, it means you're finding corrupted data during the sync; the corruption could have happened at any other time.

I'm not saying that there isn't any other cause for the corruption, but this is currently my best explanation, since not closing the database properly was a violation of the RocksDB API, and used properly the RocksDB API shouldn't lead to data corruption (short of hardware faults or RocksDB bugs). If you're willing to help, please do a db kill, update to 1.9.1, and report back if you see this issue again.
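In concrete terms, that reset looks roughly like the following (illustrative; paths assume the default Linux data directory shown in the logs above, and removing the cache and network folders is the extra step @DeviateFish-2 describes below):

```
$ parity db kill
$ rm -rf ~/.local/share/io.parity.ethereum/cache
$ rm -rf ~/.local/share/io.parity.ethereum/network
$ parity --version    # confirm 1.9.1 or later before resyncing
```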

@5chdn 5chdn closed this as completed Feb 2, 2018
@DeviateFish-2

DeviateFish-2 commented Feb 3, 2018

How would it have happened at "another time" if this is a fresh sync (e.g. empty parity data directory)?

You can look at the logs I've provided. Literally every one of these samples has followed a parity db kill plus removing the cache and network folders.

I've said in literally every report that this is a clean sync, from scratch, with no pre-existing data.

Please fucking read a little better.

After running a parity db kill (and removing the cache and network folders), I tried to sync again this morning:

I'm attempting to run a full archive sync from scratch, with transaction tracing enabled. Relevant section of config.toml that reflects the current setup:
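(The actual config.toml excerpt did not survive with the issue text. For readers, a representative `[footprint]` section for a full archive sync with tracing and the fat database enabled, matching the `archive +Fat +Trace` line in the startup log above, would look roughly like this; the key names follow Parity's configuration docs of that era and are illustrative, not a copy of the original:)

```toml
[footprint]
# Keep all historical state instead of pruning it (full archive sync).
pruning = "archive"
# Record transaction traces ("+Trace" in the startup log).
tracing = "on"
# Maintain the fat database ("+Fat" in the startup log).
fat_db = "on"
```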

@andresilva
Contributor

@DeviateFish-2 Sorry, I wasn't aware of that; disregard what I said in that case.
Inside the db folder there should be a LOG file for RocksDB (chains/ethereum/db/906a34e69aec8c0d/overlayrecent/db/LOG or chains/ethereum/db/906a34e69aec8c0d/archive/db/LOG). This file is rewritten every time parity is started, so could you share that LOG file right after you see a corruption crash? I'll try to raise the issue with the RocksDB developers to see if they can point us to something. I haven't been able to reproduce this locally, so it's hard for me to debug.
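Since that file is overwritten on every start, it needs to be copied aside immediately after a crash, before parity is launched again, e.g. (path as in the archive-mode logs above):

```
$ cp ~/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d/archive/db/LOG ./rocksdb-crash.LOG
```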

@DeviateFish-2

Here's the LOG file (renamed so GitHub will accept it) associated with the above parity log (sync005):

rocksdb005.log

@5chdn Could you re-open this issue?

@5chdn 5chdn reopened this Feb 5, 2018
@5chdn 5chdn changed the title Corruption: block checksum mismatch Corruption: block checksum mismatch / during sync. Feb 5, 2018
@5chdn
Contributor

5chdn commented Feb 5, 2018

Yep. Thanks for the logs.

@Emperornero

Emperornero commented Feb 6, 2018

Is this issue being resolved anytime soon? I haven't been able to sync for MONTHS because of this and can confirm it's not a hardware issue. I've had the same crash happen with 2 different SSDs and 6 different HDDs; same problem no matter where the Parity database is stored.

Can provide more logs if needed.

@5chdn
Contributor

5chdn commented Feb 7, 2018

@Emperornero which version? On startup or during sync?

@Emperornero

This has been happening since 1.7.6; no DB clears seem to fix the problem. I'm currently on 1.9.2.

@5chdn 5chdn mentioned this issue Feb 12, 2018
@DWAK-ATTK

Ditto.

version Parity/v1.9.2-beta-0feb0bb-20180201/x86_64-linux-gnu/rustc1.23.0

Tried a full sync of mainnet/foundation. It stalled out about 12 hours in (2.4M blocks). Issued a clean shutdown (Ctrl-C). Shut the VM down until this morning.

Attempted restarting Parity this morning and received the same database corruption messages as everyone else.

parallels@ubuntu:~$ parity
2018-02-14 11:00:32  Starting Parity/v1.9.2-beta-0feb0bb-20180201/x86_64-linux-gnu/rustc1.23.0
2018-02-14 11:00:32  Keys path /home/parallels/.local/share/io.parity.ethereum/keys/Foundation
2018-02-14 11:00:32  DB path /home/parallels/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d
2018-02-14 11:00:32  Path to dapps /home/parallels/.local/share/io.parity.ethereum/dapps
2018-02-14 11:00:32  State DB configuration: fast
2018-02-14 11:00:32  Operating mode: active
2018-02-14 11:00:32  Configured for Foundation using Ethash engine
2018-02-14 11:00:32  DB corrupted: Invalid argument: You have to open all column families. Column families not opened: col4, col5, col6, col1, col3, col0, col2, attempting repair
2018-02-14 11:00:32  Updated conversion rate to Ξ1 = US$905.79 (131429600 wei/gas)
Client service error: Client(Database(Error(Msg("Received null column family handle from DB."), State { next_error: None, backtrace: None })))
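For context, this failure is about column families rather than checksums: RocksDB refuses to open a database unless every column family it contains is listed at open time, which is what "You have to open all column families. Column families not opened: col4, col5, ..." is complaining about. A minimal sketch of that rule, using the plain `rocksdb` crate (Parity's kvdb-rocksdb wrapper does the equivalent internally; the path is hypothetical):

```rust
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    // Hypothetical path, for illustration only.
    let path = "/tmp/example-db";
    let opts = Options::default();

    // Ask RocksDB which column families the database on disk contains
    // (Parity's chain DB uses col0..col6, as in the error message above)...
    let cfs = DB::list_cf(&opts, path)?;
    println!("column families on disk: {:?}", cfs);

    // ...and open it with all of them listed. Opening with only a subset (or
    // with a plain DB::open) is rejected with the "open all column families"
    // error.
    let _db = DB::open_cf(&opts, path, &cfs)?;
    Ok(())
}
```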

@andresilva
Contributor

andresilva commented Feb 15, 2018

I have created an issue in RocksDB with the logs that @DeviateFish-2 provided (facebook/rocksdb#3509).

@DeviateFish-2 I understand that you've tried to rule out hardware issues by switching hard drives and memory.

@Emperornero did you try to rule out faulty memory? Could you run a memtest?
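For reference, a quick in-OS memory check can be done with the standard Linux `memtester` utility (a full MemTest86+ boot test is more thorough); the size and pass count below are just an example:

```
$ sudo memtester 2048M 3
```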

@DWAK-ATTK

DWAK-ATTK commented Feb 15, 2018

I don't know if it matters, but I'm running Parity in a Parallels 12 VM (Ubuntu 16.04) on a MacBook Pro running macOS 10.13.1.

I've allocated 8 GB of RAM to the VM (Parity appears to be a memory hog), with a 60 GB VHD on the laptop's internal SSD.

@5chdn 5chdn added Z7-duplicate 🖨 Issue is a duplicate. Closer should comment with a link to the duplicate. and removed F2-bug 🐞 The client fails to follow expected behavior. P2-asap 🌊 No need to stop dead in your tracks, however issue should be addressed as soon as possible. labels Mar 23, 2018
@5chdn
Contributor

5chdn commented Mar 23, 2018

Duplicate of #7748

@ghost

ghost commented Nov 6, 2019

I had this same issue for days. I fixed it by removing one stick of my RAM. Now my laptop has only a single 8 GB DDR3 stick installed, and Parity syncs without an issue. Hope this helps!
