
Bitmarkd’s fast mode for initial data synchronisation #82

Merged: 14 commits from features/fast-sync merged into master on Sep 25, 2019

Conversation

anhnguyenbitmark
Member

Resolves #70

Current situation

When a bitmarkd node starts up for the first time, it reaches out to a number of nodes on the network and synchronises up to the current block height. The current implementation makes no distinction between initial data syncing and normal block syncing, which puts a lot of overhead on the node: it replays the entire blockchain's transactions from the beginning.

Analysis

Environment: MacBook Pro 2017 (Intel i7 3.1 GHz, 16 GB RAM).
Version: latest bitmarkd code on master, commit hash 040e271.

Syncing 500 blocks (10 to 510) of the bitmarkd testnet blockchain took 272.545116 seconds (approximately 4.5 minutes), i.e. about 0.54 seconds per block on average. At the current testnet block height (27,294), that works out to roughly 27,294 × 0.54 s ≈ 14,700 s, or about 4 hours of syncing before the node is ready to work.
Let's break the processing of one block down into its steps:

[DEBUG] block: stored block: 597 - extracting header - time elapsed: 0.366004
[DEBUG] block: stored block: 597 - validate header version & difficity - time elapsed: 0.000021
[DEBUG] block: stored block: 597 - transaction validation - time elapsed: 0.000884
[DEBUG] block: stored block: 597 - save to leveldb - time elapsed: 0.000162
[DEBUG] block: stored block: 597 - total time elapsed: 0.367110

Extracting the header accounts for about 99% of a block's processing time.
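For reference, per-step timings like the ones above can be gathered with simple wall-clock instrumentation. A minimal sketch follows; the step names mirror the log, but the helper and call sites are illustrative, not bitmarkd's actual logging code:

```go
package main

import (
	"log"
	"time"
)

// timeStep runs fn and logs its wall-clock duration, mirroring the per-step
// breakdown in the debug output above (step names are illustrative).
func timeStep(height uint64, name string, fn func()) {
	start := time.Now()
	fn()
	log.Printf("block: stored block: %d - %s - time elapsed: %f",
		height, name, time.Since(start).Seconds())
}

func main() {
	const height = 597
	total := time.Now()
	timeStep(height, "extracting header", func() { /* blockrecord.ExtractHeader(...) */ })
	timeStep(height, "transaction validation", func() { /* verify transactions */ })
	timeStep(height, "save to leveldb", func() { /* write block to storage */ })
	log.Printf("block: stored block: %d - total time elapsed: %f",
		height, time.Since(total).Seconds())
}
```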

Identify bottlenecks

blockrecord.ExtractHeader does the following:

  • Extract header metadata
  • Calculate digest
  • Extract block body

Digest calculation uses argon2 to hash the entire packed block data. Argon2 is a CPU-bound hash and is much slower than other hash functions, so it is currently the main bottleneck in processing a block.
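To illustrate why this step is expensive, here is a minimal sketch of an Argon2-based digest over the packed block bytes using golang.org/x/crypto/argon2; the salt handling and cost parameters are placeholders, not bitmarkd's actual configuration:

```go
package main

import (
	"fmt"

	"golang.org/x/crypto/argon2"
)

// blockDigest hashes the packed block bytes with Argon2id. Argon2 is
// deliberately CPU- and memory-hard, which is why this single call dominates
// block processing time. The salt and cost parameters below are placeholders.
func blockDigest(packed []byte) []byte {
	salt := []byte("illustrative-salt")
	return argon2.IDKey(packed, salt, 1, 64*1024, 4, 32)
}

func main() {
	packed := []byte("packed block bytes would go here")
	fmt.Printf("digest: %x\n", blockDigest(packed))
}
```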

Can we eliminate digest calculation from block data processing?
Without knowing the block digest, we can neither save the block data to the database nor verify the linkage of the chain. For normal syncing, we cannot verify block data without calculating the digest. With fast syncing, since the initial blocks have essentially settled, we don't need to verify the packed data of every block. The main idea of fast sync is to drop some block data checks to speed up synchronisation, the most expensive of which is the argon2 hashing used to calculate the digest. Instead, we can rely on the unpacked data of the next header: its PreviousBlock field is the digest of the current block, by the block linkage rule.
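A minimal sketch of that idea, using simplified header types rather than bitmarkd's actual blockrecord structures: the digest recorded for block n is taken from the PreviousBlock field of block n+1, so no Argon2 hashing is needed for the bulk of the chain.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// Header is a simplified stand-in for an unpacked block header.
type Header struct {
	Number        uint64
	PreviousBlock []byte // digest of the block before this one
}

// digestsFromLinkage returns the digest to store for each block in a batch,
// taken from the following header's PreviousBlock field instead of being
// recomputed with Argon2. The newest block has no successor yet, so its
// digest still has to be computed (or verified) the normal way.
func digestsFromLinkage(headers []Header) (map[uint64][]byte, error) {
	digests := make(map[uint64][]byte)
	for i := 0; i+1 < len(headers); i++ {
		if headers[i+1].Number != headers[i].Number+1 {
			return nil, errors.New("headers are not contiguous")
		}
		digests[headers[i].Number] = headers[i+1].PreviousBlock
	}
	return digests, nil
}

func main() {
	headers := []Header{
		{Number: 10, PreviousBlock: []byte("digest-of-block-9")},
		{Number: 11, PreviousBlock: []byte("digest-of-block-10")},
		{Number: 12, PreviousBlock: []byte("digest-of-block-11")},
	}
	digests, err := digestsFromLinkage(headers)
	if err != nil {
		panic(err)
	}
	// block 11's stored digest is whatever block 12 claims as its predecessor
	fmt.Println(bytes.Equal(digests[11], headers[2].PreviousBlock)) // true
}
```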

Detect block forgery in fast syncing

Since fast sync relies on block data linkage, it means we have to trust the received block data. What happens if a forged block is injected during syncing?
I borrowed this idea from Ethereum's fast sync mode:

We can notice an interesting phenomenon during header verification. With a negligible probability of error, we can still guarantee the validity of the chain, only by verifying every K-th header, instead of each and every one. By selecting a single header at random out of every K headers to verify, we guarantee the validity of an N-length chain with the probability of (1/K)^(N/K) (i.e. we have 1/K chance to spot a forgery in K blocks, a verification that's repeated N/K times).
Let's define the negligible probability Pn as the probability of obtaining a 256 bit SHA3 collision (i.e. the hash Ethereum is built upon): 1/2^128. To honor the Ethereum security requirements, we need to choose the minimum chain length N (below which we verify every header) and maximum K verification batch size such as (1/K)^(N/K) <= Pn holds. Calculating this for various {N, K} pairs is pretty straightforward, a simple and lenient solution being http://play.golang.org/p/B-8sX_6Dq0.

   N    K      N    K      N    K      N    K
1024   43   1792   91   2560  143   3328  198
1152   51   1920   99   2688  152   3456  207
1280   58   2048  108   2816  161   3584  217
1408   66   2176  116   2944  170   3712  226
1536   74   2304  124   3072  179   3840  236
1664   82   2432  134   3200  189   3968  246

The above table should be interpreted in such a way, that if we verify every K-th header, after N headers the probability of a forgery is smaller than the probability of an attacker producing a SHA3 collision. It also means that if a forgery is indeed detected, the last N headers should be discarded as not safe enough. Any {N, K} pair may be chosen from the above table, and to keep the numbers reasonably looking, we chose N=2048, K=100. This will be fine tuned later after being able to observe network bandwidth/latency effects and possibly behavior on more CPU limited devices.
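The table can be reproduced directly from that inequality: for each chain length N, take the largest K with (1/K)^(N/K) <= 1/2^128, which rearranges to (N/K)·ln(K) >= 128·ln(2). Below is my own short reconstruction of that calculation (not the linked playground snippet):

```go
package main

import (
	"fmt"
	"math"
)

// maxK returns the largest verification batch size K such that checking one
// random header out of every K keeps the forgery probability of an N-length
// chain at or below 1/2^128, i.e. (1/K)^(N/K) <= 1/2^128, which rearranges
// to (N/K)*ln(K) >= 128*ln(2).
func maxK(n int) int {
	target := 128 * math.Log(2)
	k := 1
	for float64(n)/float64(k+1)*math.Log(float64(k+1)) >= target {
		k++
	}
	return k
}

func main() {
	// reproduce the {N, K} table quoted above
	for n := 1024; n <= 3968; n += 128 {
		fmt.Printf("N=%d K=%d\n", n, maxK(n))
	}
}
```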

Mechanism

The goal of the fast sync algorithm is to exchange processing power for bandwidth usage. Instead of processing the entire blockchain one link at a time and replaying every transaction that ever happened, fast syncing downloads the transaction receipts along with the blocks and pulls in an entire recent state database. This allows a fast-synced node to still retain its status as an archive node containing all historical data for user queries (and thus not influence the network's health in general), while reassembling a recent network state in a fraction of the time full block processing would take.

An outline of the fast sync algorithm would be:

  • Similarly to classical sync, download the block headers and bodies that make up the blockchain.
  • Instead of fully verifying each block header, verify only the header metadata (difficulty, version, etc.) and rely on block linkage: take a block's digest from the next block's PreviousBlock field instead of calculating it from the block's own packed data.
  • Similarly to classical sync, verify all transaction receipts in the block bodies.
  • Store the downloaded blockchain, along with the receipt chain, enabling all historical queries.
  • When the chain reaches a recent enough state (head - 1024 blocks), mark that pivot point as the current head and stop fast sync mode forever (a minimal sketch of this pivot check follows the list).
  • Import the remaining 1024 blocks by fully processing them, as in classical sync.
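A minimal sketch of the pivot check referenced above, using a hypothetical helper rather than bitmarkd's actual API: blocks at or below head - 1024 go through fast sync, everything newer goes through full processing.

```go
package main

import "fmt"

// pivotDepth is the number of most recent blocks that always go through full
// (classical) verification; anything older can be fast synced.
const pivotDepth = 1024

// useFastSync reports whether the block at the given height should be handled
// by fast sync, given the best known remote chain height. Hypothetical helper,
// not bitmarkd's actual API.
func useFastSync(height, remoteHeight uint64) bool {
	if remoteHeight <= pivotDepth {
		return false // chain too short: verify everything fully
	}
	pivot := remoteHeight - pivotDepth // the pivot point (head - 1024)
	return height <= pivot
}

func main() {
	remote := uint64(27294) // current testnet height from the analysis above
	for _, h := range []uint64{500, 26270, 26271, 27000} {
		fmt.Printf("block %d: fast sync = %v\n", h, useFastSync(h, remote))
	}
}
```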

@araujobsd
Contributor

Please, double check the conflicts!

@anhnguyenbitmark
Member Author

Thanks @araujobsd for the reminder and @hxw for helping me resolve the conflicts.

@hxw hxw merged commit d2037da into master Sep 25, 2019
@hxw hxw deleted the features/fast-sync branch September 25, 2019 02:05