Bitmarkd’s fast mode for initial data synchronisation #82
Merged
…digest of current block without hasing using argon2
…ater than pivot point
anhnguyenbitmark requested review from hxw, jamieabc, jollyjoker992 and pieceofr, August 7, 2019 03:03
Please double check the conflicts!
Thanks @araujobsd for the reminder and @hxw for helping me resolve the conflicts.
hxw approved these changes, Sep 25, 2019
Resolves #70
Current situation
When a bitmarkd node starts up for the first time, it contacts a number of nodes on the network and starts synchronising up to the current block height. The current implementation makes no distinction between initial data syncing and normal block syncing, which puts a lot of overhead on the bitmarkd node because it replays every transaction in the blockchain from the beginning.
Analysis
Environment: MacBook Pro 2017 (Intel i7 3.1 GHz, 16 GB RAM).
Version: Latest code of bitmarkd on master with commit hash: 040e271
Time for syncing 500 blocks (10 - 510) on the bitmarkd testnet blockchain: 272.545116 seconds (approximately 4.5 minutes). On average, each block took about 0.545 seconds to process. At the current testnet block height (27294), a node would need approximately 4 hours to finish syncing before it is ready to work.
Let's break this down into detailed steps:
Extracting header takes up 99% of the processing time of a block.
Identify bottlenecks
blockrecord.ExtractHeader does the following things:
Digest calculation uses argon2 to hash the whole packed block data. Argon2 runs on the CPU and is much slower than other hash functions, so it is currently the main bottleneck in processing a block.
Can we eliminate digest calculation from block data processing?
Without knowing the block digest, we can neither save block data to the database nor verify the linkage of the chain. In normal syncing, we cannot verify the block data without calculating it. In fast syncing, since the initial blocks are almost settled, we don't need to verify the packed data of every block. The main idea of fast sync is to skip some block data checks to speed up synchronisation. The most critical one is the argon2 hashing used to calculate the digest. Instead, we can rely on the unpacked header of the next block: by the block linkage rule, its PreviousBlock field is the digest of the current block.
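A minimal sketch of this linkage idea, under stated assumptions: the Header, Block and Digest types and the functions fastDigests, slowDigest and buildChain are hypothetical simplifications, not the actual bitmarkd types, and SHA-256 stands in for the argon2 digest only so that the example runs with the Go standard library. Only the PreviousBlock field name comes from the description above.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// Digest, Header and Block are hypothetical, simplified types.
type Digest [32]byte

type Header struct {
	Number        uint64
	PreviousBlock Digest // digest of the preceding block
}

type Block struct {
	Header Header
	Packed []byte
}

// slowDigest stands in for the expensive argon2 digest.
// ASSUMPTION: SHA-256 is used here only to keep the sketch runnable.
func slowDigest(packed []byte) Digest { return sha256.Sum256(packed) }

// fastDigests takes each block's digest from the next header's
// PreviousBlock field. Only the last block of the batch has no
// successor, so only it is hashed.
func fastDigests(blocks []Block) ([]Digest, error) {
	if len(blocks) == 0 {
		return nil, errors.New("empty batch")
	}
	digests := make([]Digest, len(blocks))
	last := len(blocks) - 1
	for i := 0; i < last; i++ {
		next := blocks[i+1].Header
		if next.Number != blocks[i].Header.Number+1 {
			return nil, fmt.Errorf("gap after block %d", blocks[i].Header.Number)
		}
		digests[i] = next.PreviousBlock
	}
	digests[last] = slowDigest(blocks[last].Packed)
	return digests, nil
}

// buildChain creates n honestly linked blocks for the demonstration.
func buildChain(n int) []Block {
	var prev Digest
	blocks := make([]Block, 0, n)
	for i := 1; i <= n; i++ {
		packed := append([]byte(fmt.Sprintf("block-%d:", i)), prev[:]...)
		blocks = append(blocks, Block{Header{uint64(i), prev}, packed})
		prev = slowDigest(packed)
	}
	return blocks
}

func main() {
	blocks := buildChain(5)
	digests, err := fastDigests(blocks)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, d := range digests {
		ok := d == slowDigest(blocks[i].Packed)
		fmt.Printf("block %d: digest %x matches recomputed hash: %v\n", blocks[i].Header.Number, d[:4], ok)
	}
}
```

In this sketch a batch of n blocks needs one expensive hash instead of n. The digests taken from the next header are trusted rather than verified, which is why forgery detection has to be handled separately.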
Detect block forgery in fast syncing
Because fast sync relies on block data linkage, we need to trust the received block data. What happens if a forged block is received during syncing?
I borrowed this idea from Ethereum's fast sync mode:
Mechanism
The goal of the fast sync algorithm is to exchange processing power for bandwidth usage. Instead of processing the entire block-chain one link at a time, and replay all transactions that ever happened in history, fast syncing downloads the transaction receipts along the blocks, and pulls an entire recent state database. This allows a fast synced node to still retain its status as an archive node containing all historical data for user queries (and thus not influence the network's health in general), but at the same time to reassemble a recent network state at a fraction of the time it would take full block processing.
An outline of the fast sync algorithm would be: