Conversation
@rphmeier Kind of, but it's really a first step that shows what performance improvements you can get vs. the size increase drawback. (It doesn't introduce any new snapshot format yet.)
Does it produce bit-for-bit the same snapshot regardless of the number of threads? That's a property of snapshots which we want to keep.
Yes it does! It won't if you change the
@ngotchac I'm going to take a look after paritytech/parity-common#13 (looking at it now) is merged, since there are too many distracting changes associated with the
LGTM, I've left some code style comments.
ethcore/src/snapshot/mod.rs (Outdated)

```rust
    block_guard.join().map(|block_hashes| (state_hashes, block_hashes))
})
// The number of threads must be between 1 and SNAPSHOT_SUBPARTS
let num_threads: usize = cmp::max(1, cmp::min(processing_threads, SNAPSHOT_SUBPARTS));
```
Maybe use `assert!(processing_threads >= 1, "...")` instead of this `cmp::max`, since it would be a logical mistake to pass 0 as `processing_threads`?
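As an illustration of the trade-off being discussed, here is a small sketch of the two approaches. The constant `SNAPSHOT_SUBPARTS` matches the PR; the free-standing functions are hypothetical, written only to contrast the behaviors.

```rust
const SNAPSHOT_SUBPARTS: usize = 16;

// Option 1 (as in the PR): silently clamp the value into [1, SNAPSHOT_SUBPARTS].
fn clamped_threads(processing_threads: usize) -> usize {
    std::cmp::max(1, std::cmp::min(processing_threads, SNAPSHOT_SUBPARTS))
}

// Option 2 (the reviewer's suggestion): treat 0 as a caller bug and panic.
fn asserted_threads(processing_threads: usize) -> usize {
    assert!(processing_threads >= 1, "at least one processing thread is required");
    std::cmp::min(processing_threads, SNAPSHOT_SUBPARTS)
}

fn main() {
    assert_eq!(clamped_threads(0), 1);   // clamping hides the caller's mistake
    assert_eq!(clamped_threads(32), 16); // both cap at SNAPSHOT_SUBPARTS
    assert_eq!(asserted_threads(8), 8);
    println!("ok");
}
```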
ethcore/src/snapshot/mod.rs (Outdated)

```rust
for subpart_chunk in subparts_c.chunks(num_threads) {
    if subpart_chunk.len() > thread_idx {
        let part = subpart_chunk[thread_idx];
```
Maybe something like this would be easier to read (and we don't need to allocate `subparts`):

```rust
let state_guard = scope.spawn(move || -> Result<Vec<H256>, Error> {
    let mut chunk_hashes = Vec::new();
    for part in (thread_idx..SNAPSHOT_SUBPARTS).step_by(num_threads) {
        ...
    }
})
```

Note that `step_by` was only stabilized in Rust 1.28.
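To make the striped assignment concrete, here is a self-contained sketch of which subparts each thread would handle under this scheme. The helper `parts_for_thread` is hypothetical; `SNAPSHOT_SUBPARTS = 16` as in the PR.

```rust
const SNAPSHOT_SUBPARTS: usize = 16;

// Under the reviewer's scheme, a thread handles parts
// thread_idx, thread_idx + num_threads, thread_idx + 2 * num_threads, ...
fn parts_for_thread(thread_idx: usize, num_threads: usize) -> Vec<usize> {
    (thread_idx..SNAPSHOT_SUBPARTS).step_by(num_threads).collect()
}

fn main() {
    // With 4 threads, thread 1 processes parts 1, 5, 9, 13.
    assert_eq!(parts_for_thread(1, 4), vec![1, 5, 9, 13]);

    // All 16 subparts are covered exactly once across the 4 threads.
    let mut all: Vec<usize> = (0..4).flat_map(|t| parts_for_thread(t, 4)).collect();
    all.sort();
    assert_eq!(all, (0..16).collect::<Vec<_>>());
    println!("ok");
}
```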
ethcore/src/snapshot/mod.rs (Outdated)

```rust
for guard in state_guards {
    let mut part_state_hashes = guard.join()?.clone();
    state_hashes.append(&mut part_state_hashes);
}
```
We can avoid cloning:

```rust
let mut state_hashes = Vec::new();
for guard in state_guards {
    let part_state_hashes = guard.join()?;
    state_hashes.extend(part_state_hashes);
}
```
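A minimal stand-alone sketch of the difference, using owned `Vec<u32>` values in place of the real `H256` hashes and a plain vector of results in place of the thread guards: because `extend` takes its argument by value here, each per-thread vector is moved into the accumulator rather than cloned.

```rust
fn main() {
    // Stand-in for the values returned by joining each per-thread guard.
    let joined: Vec<Vec<u32>> = vec![vec![1, 2], vec![3], vec![4, 5]];

    let mut state_hashes = Vec::new();
    for part_state_hashes in joined {
        // Consumes the owned Vec; no clone needed, unlike the original loop.
        state_hashes.extend(part_state_hashes);
    }

    assert_eq!(state_hashes, vec![1, 2, 3, 4, 5]);
    println!("ok");
}
```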
`extend` expects a mutable reference :/
Oops, you're right, sorry.
ethcore/src/snapshot/mod.rs (Outdated)

```rust
let mut seek_to = None;

if let Some(part) = part {
    let part_offset = 256 / SNAPSHOT_SUBPARTS;
```
Maybe name `256`? I don't like magic constants.
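For example, the constant could be named after what it represents: the number of possible values of an account key's first byte, which is the keyspace being split into subparts. The name below is hypothetical, chosen only for illustration.

```rust
/// Number of possible values of the first byte of an account key;
/// the state keyspace is split into SNAPSHOT_SUBPARTS ranges of this width.
const KEY_FIRST_BYTE_VALUES: usize = 256;
const SNAPSHOT_SUBPARTS: usize = 16;

fn main() {
    let part_offset = KEY_FIRST_BYTE_VALUES / SNAPSHOT_SUBPARTS;
    assert_eq!(part_offset, 16);
    // Part 3, say, starts at first byte 48 (and part 4 at 64).
    assert_eq!(3 * part_offset, 48);
    println!("ok");
}
```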
```diff
@@ -263,10 +318,12 @@ impl<'a> StateChunker<'a> {

 /// Walk the given state database starting from the given root,
 /// creating chunks and writing them out.
 /// `part` is a number between 0 and 15, which describe which part of
```
ethcore/src/snapshot/mod.rs (Outdated)

```rust
seek_from[0] = (part * part_offset) as u8;
account_iter.seek(&seek_from)?;

// Set the upper-bond, except for the last part
```
typo: upper bound
ethcore/src/snapshot/mod.rs (Outdated)

```rust
account_iter.seek(&seek_from)?;

// Set the upper-bond, except for the last part
if part < SNAPSHOT_SUBPARTS - 1 {
```
We could get rid of this `if` if we make `seek_to` an inclusive upper bound, e.g. the check would be:

```rust
if account_key[0] > seek_to {
    break;
}
```
But `seek_to` wouldn't make sense if `part == None`, so it might as well be an `Option`.
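A sketch of what the combined suggestion could look like, with `seek_to` as an `Option<u8>` holding an inclusive upper bound on the first key byte and `None` meaning "no bound" (the last part). The helper function is hypothetical and only models the bound check from the discussion.

```rust
// Returns true once the iterator has walked past this part's key range.
fn past_upper_bound(account_key_first_byte: u8, seek_to: Option<u8>) -> bool {
    match seek_to {
        Some(bound) => account_key_first_byte > bound, // inclusive bound
        None => false, // last part: iterate to the end of the keyspace
    }
}

fn main() {
    // Part 0 of 16 covers first bytes 0..=15.
    assert!(!past_upper_bound(15, Some(15))); // still inside the part
    assert!(past_upper_bound(16, Some(15)));  // first key of the next part
    assert!(!past_upper_bound(255, None));    // last part never breaks early
    println!("ok");
}
```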
@ordian I updated it so that the default number of snapshot threads is half the number of CPUs.
I guess the case of `num_cpus::get() / 2 == 0` will be handled properly in the `snapshot_config` fn, so it's fine.
I'm also OK with `Cargo.lock` updates as long as tests pass, since we need to update dependencies from time to time anyway.
LGTM.
It would be good to have a comment somewhere indicating that there is no point in using more than 16 threads (maybe in the help text or next to the two global variables).
* Add Progress to Snapshot Secondary chunks creation
* Use half of CPUs to multithread snapshot creation
* Use env var to define number of threads
* info to debug logs
* Add Snapshot threads as CLI option
* Randomize chunks per thread
* Remove randomness, add debugging
* Add warning
* Add tracing
* Use parity-common fix seek branch
* Fix log
* Fix tests
* Fix tests
* PR Grumbles
* PR Grumble II
* Update Cargo.lock
* PR Grumbles
* Default snapshot threads to half number of CPUs
* Fix default snapshot threads // min 1
* parity-version: mark 2.1.0 track beta
* ci: update branch version references
* docker: release master to latest
* Fix checkpointing when creating contract failed (#9514)
* ci: fix json docs generation (#9515)
* fix typo in version string (#9516)
* Update patricia trie to 0.2.2 crates. Default dependencies on minor version only.
* Putting back ethereum tests to the right commit
* Enable all Constantinople hard fork changes in constantinople_test.json (#9505)
* Enable all Constantinople hard fork changes in constantinople_test.json
* Address grumbles
* Remove EIP-210 activation
* 8m -> 5m
* Temporarily add back eip210 transition so we can get test passed
* Add eip210_test and remove eip210 transition from const_test
* In create memory calculation is the same for create2 because the additional parameter was popped before. (#9522)
* deps: bump fs-swap and kvdb-rocksdb
* Multithreaded snapshot creation (#9239)
* Add Progress to Snapshot Secondary chunks creation
* Use half of CPUs to multithread snapshot creation
* Use env var to define number of threads
* info to debug logs
* Add Snapshot threads as CLI option
* Randomize chunks per thread
* Remove randomness, add debugging
* Add warning
* Add tracing
* Use parity-common fix seek branch
* Fix log
* Fix tests
* Fix tests
* PR Grumbles
* PR Grumble II
* Update Cargo.lock
* PR Grumbles
* Default snapshot threads to half number of CPUs
* Fix default snapshot threads // min 1
* correct before_script for nightly build versions (#9543) - fix gitlab array of strings syntax error - get proper commit id - avoid colon in stings
* Remove initial token for WS. (#9545)
* version: mark release critical
* ci: fix rpc docs generation 2 (#9550)
* Improve P2P discovery (#9526)
* Add `target` to Rust traces
* network-devp2p: Don't remove discovery peer in main sync
* network-p2p: Refresh discovery more often
* Update Peer discovery protocol
* Run discovery more often when not enough nodes connected
* Start the first discovery early
* Update fast discovery rate
* Fix tests
* Fix `ping` tests
* Fixing remote Node address ; adding PingPong round
* Fix tests: update new +1 PingPong round
* Increase slow Discovery rate Check in flight FindNode before pings
* Add `deprecated` to deprecated_echo_hash
* Refactor `discovery_round` branching
* net_version caches network_id to avoid redundant aquire of sync read lock (#9544)
* net_version caches network_id to avoid redundant aquire of sync read lock, #8746
* use lower_hex display formatting for net_peerCount rpc method
* Increase Gas-floor-target and Gas Cap (#9564) + Gas-floor-target increased to 8M by default + Gas-cap increased to 10M by default
* Revert to old parity-tokio-ipc.
* Downgrade named pipes.
This PR introduces multithreaded snapshot creation. The idea is that a snapshot is divided into N parts, and the user can choose to multithread the process across T threads, each thread processing N / T parts of the snapshot.

For now, it is divided into 16 different parts. The more parts, the bigger the snapshot, since there is no deduplication between parts. On Kovan, the snapshot's size increased from 1 GB to 1.1 GB, a 10% increase. I still need to run this on Foundation.

Regarding performance gains, the time spent creating the snapshot is not linear in the number of threads but scales with `sqrt(num_threads)`.

This PR thus adds a CLI argument `--snapshot-threads` which specifies the number of threads. If this is accepted, the snapshot manifest could be edited in a future PR to specify the different sub-parts of the state snapshot, so that snapshot recovery would also work in parallel.

It should be discussed whether this performance increase (in creating, and in the near future in restoring) outweighs the size increase.

Update: on Mainnet, splitting the snapshot into 16 subparts increases the number of chunks by only 2%, from 1818 to 1859 chunks. I was able to produce a full snapshot on 8 threads in 1h10min instead of the previous 5h.
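The overall scheme described above can be sketched with scoped threads: 16 subparts, T worker threads, each taking every T-th part. This is a simplified stand-in, not the PR's code: `std::thread::scope` replaces the `scope.spawn` API the PR uses, and `chunk_part` is a stub for the real `StateChunker` work.

```rust
use std::thread;

const SNAPSHOT_SUBPARTS: usize = 16;

// Stand-in for chunking one subpart of the state; the real code walks
// the state trie for that key range and writes out compressed chunks.
fn chunk_part(part: usize) -> Vec<String> {
    vec![format!("chunk-for-part-{}", part)]
}

fn main() {
    let num_threads = 4;
    let mut state_hashes = Vec::new();
    thread::scope(|scope| {
        // Spawn one worker per thread index; each handles a strided
        // set of parts: thread_idx, thread_idx + num_threads, ...
        let guards: Vec<_> = (0..num_threads)
            .map(|thread_idx| {
                scope.spawn(move || {
                    let mut hashes = Vec::new();
                    for part in (thread_idx..SNAPSHOT_SUBPARTS).step_by(num_threads) {
                        hashes.extend(chunk_part(part));
                    }
                    hashes
                })
            })
            .collect();
        // Collect results in thread order (no clone needed).
        for guard in guards {
            state_hashes.extend(guard.join().unwrap());
        }
    });
    // Every subpart produced its chunk exactly once.
    assert_eq!(state_hashes.len(), SNAPSHOT_SUBPARTS);
    println!("ok");
}
```

Note that with this striping, the order of `state_hashes` depends on `num_threads`; the bit-for-bit determinism discussed earlier concerns the chunk contents per subpart.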