add erasure-coding benches #6308
Merged

Changes from 1 commit. Commits in this PR (5):

- 9672254 add erasure-code benches (ordian)
- ec8ffc2 Merge branch 'master' into ao-erasure-coding-benches (ordian)
- 0f71eeb revert Cargo.lock changes (ordian)
- de62ce7 revert Cargo.lock changes (ordian)
- 366e981 another day, another master merge (ordian)

@@ -0,0 +1,39 @@
### Run benches
```
$ cd erasure-coding # ensure you are in the right directory
$ cargo bench
```

### `scaling_with_validators`

This benchmark evaluates the performance of constructing the chunks and the erasure root from a PoV and of
reconstructing the PoV from the chunks. You can see the results of running this bench on a Ryzen 5950X below.
Interestingly, with `10_000` chunks (validators) it is slower than with `50_000`, for both construction
and reconstruction.
```
construct/200         time:   [93.924 ms 94.525 ms 95.214 ms]
                      thrpt:  [52.513 MiB/s 52.896 MiB/s 53.234 MiB/s]
construct/500         time:   [111.25 ms 111.52 ms 111.80 ms]
                      thrpt:  [44.721 MiB/s 44.837 MiB/s 44.946 MiB/s]
construct/1000        time:   [117.37 ms 118.28 ms 119.21 ms]
                      thrpt:  [41.941 MiB/s 42.273 MiB/s 42.601 MiB/s]
construct/2000        time:   [125.05 ms 125.72 ms 126.38 ms]
                      thrpt:  [39.564 MiB/s 39.772 MiB/s 39.983 MiB/s]
construct/10000       time:   [270.46 ms 275.11 ms 279.81 ms]
                      thrpt:  [17.869 MiB/s 18.174 MiB/s 18.487 MiB/s]
construct/50000       time:   [205.86 ms 209.66 ms 213.64 ms]
                      thrpt:  [23.404 MiB/s 23.848 MiB/s 24.288 MiB/s]

reconstruct/200       time:   [180.73 ms 184.09 ms 187.73 ms]
                      thrpt:  [26.634 MiB/s 27.160 MiB/s 27.666 MiB/s]
reconstruct/500       time:   [195.59 ms 198.58 ms 201.76 ms]
                      thrpt:  [24.781 MiB/s 25.179 MiB/s 25.564 MiB/s]
reconstruct/1000      time:   [207.92 ms 211.57 ms 215.57 ms]
                      thrpt:  [23.195 MiB/s 23.633 MiB/s 24.048 MiB/s]
reconstruct/2000      time:   [218.59 ms 223.68 ms 229.18 ms]
                      thrpt:  [21.817 MiB/s 22.354 MiB/s 22.874 MiB/s]
reconstruct/10000     time:   [496.35 ms 505.17 ms 515.42 ms]
                      thrpt:  [9.7008 MiB/s 9.8977 MiB/s 10.074 MiB/s]
reconstruct/50000     time:   [276.56 ms 277.53 ms 278.58 ms]
                      thrpt:  [17.948 MiB/s 18.016 MiB/s 18.079 MiB/s]
```
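
For context, here is a minimal sketch of how a Criterion benchmark like this can be structured. It is not the bench added in this PR: the ~5 MiB payload is only inferred from the throughput figures above, and the `obtain_chunks`/`reconstruct` helpers and their signatures are assumptions about the erasure-coding crate's API.

```rust
// Hypothetical sketch of a `scaling_with_validators` bench; the PR's actual bench may differ.
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
// Assumed API: generic chunking/reconstruction helpers from the erasure-coding crate.
use polkadot_erasure_coding::{obtain_chunks, reconstruct};

fn scaling_with_validators(c: &mut Criterion) {
    // ~5 MiB payload, roughly what the MiB/s figures above imply for the PoV size.
    let pov = vec![0xfa_u8; 5 * 1024 * 1024];
    let validator_counts = [200usize, 500, 1_000, 2_000, 10_000, 50_000];

    let mut construct = c.benchmark_group("construct");
    for &n in &validator_counts {
        construct.throughput(Throughput::Bytes(pov.len() as u64));
        construct.bench_with_input(BenchmarkId::from_parameter(n), &n, |b, &n| {
            b.iter(|| obtain_chunks(n, &pov).expect("under the validator limit"))
        });
    }
    construct.finish();

    let mut reconstruct_group = c.benchmark_group("reconstruct");
    for &n in &validator_counts {
        // Chunk once outside the timed closure; feed back roughly a third of the
        // chunks (the assumed recovery threshold) together with their indices.
        let chunks = obtain_chunks(n, &pov).expect("under the validator limit");
        let subset: Vec<(&[u8], usize)> =
            chunks.iter().map(|c| &c[..]).zip(0..).take(n / 3 + 1).collect();
        reconstruct_group.throughput(Throughput::Bytes(pov.len() as u64));
        reconstruct_group.bench_with_input(BenchmarkId::from_parameter(n), &n, |b, &n| {
            b.iter(|| {
                let _: Vec<u8> = reconstruct(n, subset.iter().copied()).expect("enough chunks");
            })
        });
    }
    reconstruct_group.finish();
}

criterion_group!(benches, scaling_with_validators);
criterion_main!(benches);
```

Chunking happens outside the timed closure in the `reconstruct` group so that only reconstruction is measured, and the `Throughput::Bytes` hint is what lets Criterion report the MiB/s figures shown above.
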
This is quite interesting. It would be even more interesting to see at which number of validators the throughput reverses trend.

The AFFT scales like `blocksize * log(validators)`, but with the log being discrete, so it'll never reverse trend. Its arithmetic cannot handle more than 16k validators, so we're missing asserts that should kill it long before 50k. I'd expect 50k ≈ 2k here, at least up to cache pressure, but there is some counter running up the rest.

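As a back-of-the-envelope check of that log factor against the numbers above (assuming cost proportional to `blocksize * ceil(log2(validators))` with the PoV size fixed):

$$
\frac{t_{50000}}{t_{2000}} \approx \frac{\lceil \log_2 50000 \rceil}{\lceil \log_2 2000 \rceil} = \frac{16}{11} \approx 1.45
$$

The measured construct times go from ~126 ms at 2000 to ~210 ms at 50000 validators (a factor of ~1.67), which is in that ballpark, while the ~275 ms at 10000 is the outlier the rest of this thread is about.
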
We'll collapse from all those `validators^2` gossip messages long before this poses any issues. We just need enough validators per relay chain to reach 2nd-layer scaling, which is only 600 by Bryan Ford's estimates in OmniLedger, but more like 1000 by Alfonso's estimates. We'd go to 3/4 of a power of 2 to optimize the erasure coding if we can really make 769 work or whatever, but maybe 1500 is a bit large for the gossip.

Are you sure it's not 2^16, which is 64k? I added an assertion in `reconstruct` that it decodes to the original PoV; it passes with 50k validators, but panics with a `TooManyValidators` error with 70k shards.

polkadot/erasure-coding/src/lib.rs, line 39 in c7e43d6

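A round-trip check along those lines might look like the sketch below. This is only an illustration: the exact assertion added isn't shown in this thread, and the `obtain_chunks` signature and the `(chunk, index)` item shape expected by `reconstruct` are assumptions about the crate's API.

```rust
use polkadot_erasure_coding::{obtain_chunks, reconstruct};

// Sketch of the described check: chunk a payload, rebuild it from a subset of
// chunks, and assert the result equals the original. With n_validators > 2^16
// the crate is expected to fail with a TooManyValidators error instead.
fn assert_roundtrip(n_validators: usize, pov: &Vec<u8>) {
    let chunks = obtain_chunks(n_validators, pov).expect("n_validators <= 2^16");

    // Roughly a third of the chunks should be enough to recover the data;
    // pass them together with their original indices.
    let needed = n_validators / 3 + 1;
    let reconstructed: Vec<u8> = reconstruct(
        n_validators,
        chunks.iter().take(needed).enumerate().map(|(i, c)| (&c[..], i)),
    )
    .expect("enough chunks provided");

    // The decoded payload must match the original PoV bytes.
    assert_eq!(&reconstructed, pov);
}
```
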
I agree: at this point (10k+ validators), networking would be the bottleneck, not the CPU cost of erasure coding.

Oops, yes, you're right, 2^16. Not sure why 50k goes faster then, lol.

It's a log factor, so I guess some unrelated artifact of the benchmark is just overriding it somehow. The erasure coding cannot actually get faster with more validators.