Skip to content

Latest commit

 

History

History
46 lines (33 loc) · 2.53 KB

README.md

File metadata and controls

46 lines (33 loc) · 2.53 KB

ClickHouse Bit Flip

The tool to fix single bit flip error in binary data files of ClickHouse.

The tool makes the copy of origin file with .bak extension before to recovery data.
However, don't forget to make backup by himself

Checked on ClickHouse 19.16.2.2.

Installation

Download binary for you platform

or from sources (Go 1.13+):

git clone https://github.com/marliotto/clickhouse-bitflip.git
cd clickhouse-bitflip
go build

Usage

./clickhouse-bitflip filename

For example

Code: 40, e.displayText() = DB::Exception: Checksum doesn't match: corrupted data. Reference: 32eaca6117ab50a10b91e82b65de78e1. Actual: c44af50ae79ae7639ae96b59d8032bbc. Size of compressed block: 36031. The mismatch is caused by single bit flip in data block at byte 8732, bit 2. This is most likely due to hardware failure. If you receive broken data over network and the error does not repeat every time, this can be caused by bad RAM on network interface controller or bad controller itself or bad RAM on network switches or bad CPU on network switches (look at the logs on related network switches; note that TCP checksums don't help) or bad RAM on host (look at dmesg or kern.log for enormous amount of EDAC errors, ECC-related reports, Machine Check Exceptions, mcelog; note that ECC memory can fail if the number of errors is huge) or bad CPU on host. If you read data from disk, this can be caused by disk bit rott. This exception protects ClickHouse from data corruption due to hardware failures.: (while reading column Name): (while reading from part /var/lib/clickhouse/data/default/tbl/20200420_946605_946651_20/ from mark 0 with max_rows_to_read = 8192) (version 19.16.2.2 (official build)) (from 172.18.0.1:51644) (in query: SELECT Name FROM tbl)

Corrupted partition: /var/lib/clickhouse/data/default/tbl/20200420_946605_946651_20/
Corrupted column: Name

./clickhouse-bitflip /var/lib/clickhouse/data/default/tbl/20200420_946605_946651_20/Name.bin

To fix many files

find /var/lib/clickhouse/data/default/tbl/202004* -maxdepth 2 -type f -name '*.bin' -exec ./clickhouse-bitflip {} \; > result.log

Also, you can remove .bak files.

Links