Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add availability-recovery from systematic chunks #1644

Merged
merged 176 commits into from
May 28, 2024
Merged
Show file tree
Hide file tree
Changes from 94 commits
Commits
Show all changes
176 commits
Select commit Hold shift + click to select a range
051328a
Draft of RecoveryStrategy based on linked enums
alindima Sep 6, 2023
166aae5
add copright header
alindima Sep 8, 2023
518d8fd
fix clippy
alindima Sep 8, 2023
dae34a5
Refactor RecoveryStrategy using dynamic dispatch
alindima Sep 8, 2023
640c2d5
address comments
alindima Sep 12, 2023
ac18bab
Merge branch 'master' into alindima/refactor-availability-recovery-st…
alindima Sep 12, 2023
2269f76
erasure-coding: add algorithm for systematic recovery
alindima Sep 12, 2023
dca9ce4
WIP
alindima Sep 13, 2023
9eeb93a
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Sep 13, 2023
6f58c3a
continue implementation
alindima Sep 14, 2023
a805a85
Fix some tests
alindima Sep 14, 2023
d83bf50
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Sep 14, 2023
2a5d6d5
Merge remote-tracking branch 'origin/master' into alindima/refactor-a…
alindima Sep 14, 2023
f3d407b
replace hashmap of chunks with btreemap
alindima Sep 15, 2023
67841bd
add chunk indices cache to av-recovery
alindima Sep 15, 2023
3810eab
some more improvements
alindima Sep 15, 2023
bf39ba0
don't use the backing group if chunk size query failed
alindima Sep 15, 2023
34408b5
add ImmediateError to chunks recovery strategy
alindima Sep 15, 2023
b68e671
modify and add metrics
alindima Sep 15, 2023
1569e49
more review comments
alindima Sep 18, 2023
d5c32d1
fix test
alindima Sep 18, 2023
cb42bef
rollback to using TryConnect for fetch chunks
alindima Sep 18, 2023
06f00a2
add runtime API knob for shuffling enablement
alindima Sep 18, 2023
2b58895
av-recovery and av-distr: use relay parent from candidate_receipt
alindima Sep 18, 2023
f860b7d
add avail_chunk_shuffling params to HostConfiguration
alindima Sep 18, 2023
b82b2a0
replace BypassAvStore variant with a separate flag
alindima Sep 19, 2023
8585628
move requesting_chunks to the strategy
alindima Sep 19, 2023
2f9b767
Merge remote-tracking branch 'origin/master' into alindima/refactor-a…
alindima Sep 19, 2023
824f7e3
Merge remote-tracking branch 'origin/alindima/refactor-availability-r…
alindima Sep 19, 2023
818d9c0
fix tests
alindima Sep 19, 2023
6b8afcb
add erasure-coding benchmark for systematic recovery
alindima Sep 19, 2023
ce633b4
fix clippy
alindima Sep 20, 2023
c1a5dbf
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Sep 20, 2023
0fb256a
av-distribution is not sending av-recovery messages
alindima Sep 20, 2023
b2c92b3
fix clippy
alindima Sep 20, 2023
3c3371a
add client_features runtime API
alindima Sep 22, 2023
3d82eb9
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Sep 22, 2023
d57396b
rustfmt
alindima Sep 22, 2023
0a46a21
fix test
alindima Sep 22, 2023
e2ce0cd
add client_features to westend, fix fmt and clippy
alindima Sep 22, 2023
cc06a71
try fixing clippy again
alindima Sep 22, 2023
a83b178
fix copy-paste mistake
alindima Sep 26, 2023
6e06d27
fix metrics and logs
alindima Sep 27, 2023
ce10f68
add new shuffling algorithm
alindima Sep 27, 2023
b3ea7c9
add newtype for ChunkIndex
alindima Sep 27, 2023
926bcb0
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Sep 27, 2023
6b6e924
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Sep 27, 2023
f28e01e
add v10 config migration to westend
alindima Sep 27, 2023
21811ef
replace u8 ClientFeatures with u64
alindima Sep 28, 2023
ad43e4a
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Sep 28, 2023
67954e6
some fixes to get clippy to pass
alindima Sep 28, 2023
c93985b
use Pallet::current_storage_version()::put() instead of explicitly
alindima Sep 28, 2023
201b09e
replace RecoveryError::Unavailable with Invalid
alindima Sep 28, 2023
9c51bf2
add tests to availability-distribution
alindima Sep 28, 2023
fabc044
add comma
alindima Sep 28, 2023
8ee0027
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Oct 3, 2023
4283214
fix migration test
alindima Oct 3, 2023
77d5370
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Oct 9, 2023
a9ef2a4
add some more tests
alindima Oct 9, 2023
2035f30
add more tests and small fix
alindima Oct 11, 2023
eef1e14
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Oct 16, 2023
fac014c
bump novelpoly
alindima Oct 16, 2023
7a71f0f
av-recovery: revert to with_chunks_if_pov_large for testing purposes
alindima Oct 16, 2023
6973295
improve migrate_to_v10 test
alindima Oct 16, 2023
1af0657
address some review comments
alindima Oct 16, 2023
38e1361
some more tests
alindima Oct 16, 2023
1d825df
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Oct 16, 2023
1318a21
enable systematic recovery for testing purposes
alindima Oct 16, 2023
a98cdd6
don't request chunks that past strategies deemed not available
alindima Oct 18, 2023
9dc87ae
more tests
alindima Oct 19, 2023
c462e25
fix infinite request loop for network errors.
alindima Oct 19, 2023
fc621f9
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Oct 19, 2023
716606e
add extensive unit tests for av-recovery
alindima Oct 19, 2023
c0b1c6c
fix clippy
alindima Oct 19, 2023
862501b
runtime config: enable AVAILABILITY_CHUNK_SHUFFLING for new chains
alindima Oct 19, 2023
91a52fa
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Oct 24, 2023
8b0d583
use fatality in av-recovery
alindima Oct 24, 2023
25ba4fd
bump SYSTEMATIC_CHUNKS_REQ_RETRY_LIMIT to 2
alindima Oct 24, 2023
850c0a3
rename historical_errors to recorded_errors
alindima Oct 24, 2023
ab4d8d1
deduplicate get_block_number
alindima Oct 24, 2023
d9282ad
address some review comments
alindima Oct 24, 2023
9deec5f
more tests
alindima Oct 24, 2023
8494496
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Oct 25, 2023
5524385
remove unused env_logger from test
alindima Oct 25, 2023
d2babf7
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Oct 31, 2023
13c4253
small metrics and logging improvements
alindima Nov 3, 2023
b8792ff
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Nov 3, 2023
81ebaa8
add more explicit error message on decode
alindima Nov 7, 2023
ef893c3
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Nov 7, 2023
f681e1e
add more metrics for full data requests
alindima Nov 8, 2023
60f0e79
move ChunkIndexCacheRegistry to its own module
alindima Nov 8, 2023
90fdec8
fix test compilation
alindima Nov 8, 2023
68ad6a6
add license header
alindima Nov 8, 2023
4e22ef7
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Nov 8, 2023
a943f25
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Nov 10, 2023
6e34e90
fix cumulus pov_recovery with RPC node
alindima Nov 10, 2023
5feed4a
fix test
alindima Nov 10, 2023
949e732
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Nov 14, 2023
db1225b
add more tests for #2287
alindima Nov 14, 2023
b4a125b
add backers as backup for requesting systematic chunks
alindima Nov 14, 2023
d151626
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Nov 14, 2023
e58a405
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Nov 20, 2023
0906bf0
fix clippy and remove unneeded dep
alindima Nov 20, 2023
0ed37a8
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Dec 12, 2023
2d0b587
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Dec 19, 2023
894880b
update lockfile
alindima Dec 19, 2023
fada5d9
integrate reed-solomon from master git branch
alindima Dec 19, 2023
145097e
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Dec 20, 2023
4233871
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Jan 11, 2024
965c139
Adapt code to new design from RFC
alindima Jan 11, 2024
4530caa
fix av-store tests
alindima Jan 18, 2024
1ee008c
don't allow systematic recovery if chunk mapping is disabled
alindima Jan 18, 2024
46486db
launch systematic recovery during approval voting
alindima Jan 18, 2024
376619e
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Jan 18, 2024
199bc88
add zombienet test for network-level compatibility
alindima Jan 18, 2024
d923269
add systematic recovery to subsystem-bench
alindima Jan 19, 2024
bc22b81
fix clippy
alindima Jan 19, 2024
3edd33b
fix some backing tests
alindima Jan 19, 2024
c9ab410
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Jan 19, 2024
2d2d868
random small fixes
alindima Jan 19, 2024
557e410
try fixing zombienet test
alindima Jan 19, 2024
3df708b
add v2 protocol to cumulus
alindima Jan 19, 2024
e6de96a
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Jan 26, 2024
278028b
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Jan 26, 2024
65c8ecf
fix clippy
alindima Jan 26, 2024
951119e
address some review feedback
alindima Jan 26, 2024
c94e7d5
some more comments
alindima Jan 26, 2024
e4cbf1d
fill prdoc
alindima Jan 26, 2024
2f21035
zombienet tests
alindima Jan 26, 2024
33d6162
try fixing zombienet tests
alindima Jan 29, 2024
ad975a6
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Jan 29, 2024
eddae38
update implementer's guide
alindima Jan 30, 2024
c2f0f23
markdown format
alindima Jan 30, 2024
9b49d40
move erasure_task_tx to the common recovery params
alindima Jan 30, 2024
8f027fb
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Feb 9, 2024
826208f
address some review comments
alindima Feb 9, 2024
d25cc1c
try fixing zombienet tests
alindima Feb 9, 2024
4e958ba
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Feb 9, 2024
dbc27b3
more zombienet
alindima Feb 9, 2024
6aa3b16
zombienet
alindima Feb 12, 2024
bf93b63
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Feb 12, 2024
2316316
fix yaml formatting
alindima Feb 12, 2024
0c57295
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Feb 12, 2024
df8a096
fix import
alindima Feb 12, 2024
486cf64
fix bench
alindima Feb 12, 2024
70e8840
clippy
alindima Feb 12, 2024
ee93f68
fix image name
alindima Feb 12, 2024
8a2931c
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Feb 12, 2024
7d9e21a
try another image for the zombienet test
alindima Feb 12, 2024
9000f5c
Use cumulus image for collator
pepoviola Feb 13, 2024
e2a95c2
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Feb 26, 2024
eeab22a
fix yaml format
alindima Feb 27, 2024
e39ea78
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Mar 11, 2024
a57f5fd
fix merge commit
alindima Mar 11, 2024
6948ea1
fix bench
alindima Mar 11, 2024
91a6180
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima Mar 22, 2024
da66e88
re-add enable-node-feature script
alindima Mar 22, 2024
d20ced9
fix zombienet tests
alindima Mar 22, 2024
3ace495
Merge remote-tracking branch 'origin' into alindima/add-systematic-ch…
alindima May 9, 2024
78a7959
add semver to prdoc
alindima May 10, 2024
da8ca22
try fixing prdoc
alindima May 10, 2024
84ca508
Merge branch 'master' into alindima/add-systematic-chunks-av-recovery
alindima May 13, 2024
5dc877e
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima May 16, 2024
345d896
update lockfile
alindima May 21, 2024
01795b0
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima May 21, 2024
0ed57f8
unify fatality versions
alindima May 21, 2024
8acc2c1
some metrics and logs polishes
alindima May 24, 2024
7ddf494
more details to prdoc
alindima May 24, 2024
5bd966f
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima May 24, 2024
c721152
prdoc fixup
alindima May 24, 2024
6faa1ee
fix
alindima May 24, 2024
25dc880
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima May 27, 2024
5669eec
try to make markdown linter happy
alindima May 27, 2024
99f09e3
use higher glutton PoV sizes for tests
alindima May 27, 2024
ce08e60
bump zombienet version
pepoviola May 27, 2024
fdbc31f
Merge remote-tracking branch 'origin/master' into alindima/add-system…
alindima May 28, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
78 changes: 47 additions & 31 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ use polkadot_overseer::RuntimeApiSubsystemClient;
use polkadot_primitives::{
async_backing::{AsyncBackingParams, BackingState},
slashing,
vstaging::ClientFeatures,
};
use sc_authority_discovery::{AuthorityDiscovery, Error as AuthorityDiscoveryError};
use sp_api::{ApiError, RuntimeApiInfo};
Expand Down Expand Up @@ -364,6 +365,10 @@ impl RuntimeApiSubsystemClient for BlockChainRpcClient {
) -> Result<Option<BackingState>, ApiError> {
Ok(self.rpc_client.parachain_host_para_backing_state(at, para_id).await?)
}

async fn client_features(&self, at: Hash) -> Result<ClientFeatures, ApiError> {
Ok(self.rpc_client.parachain_host_client_features(at).await?)
}
}

#[async_trait::async_trait]
Expand Down
12 changes: 11 additions & 1 deletion cumulus/client/relay-chain-rpc-interface/src/rpc_client.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,9 @@ use parity_scale_codec::{Decode, Encode};
use cumulus_primitives_core::{
relay_chain::{
async_backing::{AsyncBackingParams, BackingState},
slashing, BlockNumber, CandidateCommitments, CandidateEvent, CandidateHash,
slashing,
vstaging::ClientFeatures,
BlockNumber, CandidateCommitments, CandidateEvent, CandidateHash,
CommittedCandidateReceipt, CoreState, DisputeState, ExecutorParams, GroupRotationInfo,
Hash as RelayHash, Header as RelayHeader, InboundHrmpMessage, OccupiedCoreAssumption,
PvfCheckStatement, ScrapedOnChainVotes, SessionIndex, SessionInfo, ValidationCode,
Expand Down Expand Up @@ -597,6 +599,14 @@ impl RelayChainRpcClient {
.await
}

pub async fn parachain_host_client_features(
&self,
at: RelayHash,
) -> Result<ClientFeatures, RelayChainError> {
self.call_remote_runtime_function("ParachainHost_client_features", at, None::<()>)
.await
}

pub async fn parachain_host_disabled_validators(
&self,
at: RelayHash,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ pub use pallet_xcm;
pub use xcm::prelude::{AccountId32, WeightLimit};

decl_test_relay_chains! {
#[api_version(8)]
#[api_version(9)]
pub struct Westend {
genesis = westend::genesis(),
on_init = (),
Expand All @@ -62,7 +62,7 @@ decl_test_relay_chains! {
AssetRate: westend_runtime::AssetRate,
}
},
#[api_version(8)]
#[api_version(9)]
pub struct Rococo {
genesis = rococo::genesis(),
on_init = (),
Expand All @@ -78,7 +78,7 @@ decl_test_relay_chains! {
Hrmp: rococo_runtime::Hrmp,
}
},
#[api_version(8)]
#[api_version(9)]
pub struct Wococo {
genesis = rococo::genesis(),
on_init = (),
Expand Down
3 changes: 2 additions & 1 deletion polkadot/erasure-coding/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,15 @@ license.workspace = true
[dependencies]
polkadot-primitives = { path = "../primitives" }
polkadot-node-primitives = { package = "polkadot-node-primitives", path = "../node/primitives" }
novelpoly = { package = "reed-solomon-novelpoly", version = "1.0.0" }
novelpoly = { package = "reed-solomon-novelpoly", version = "1.0.2" }
parity-scale-codec = { version = "3.6.1", default-features = false, features = ["std", "derive"] }
sp-core = { path = "../../substrate/primitives/core" }
sp-trie = { path = "../../substrate/primitives/trie" }
thiserror = "1.0.48"

[dev-dependencies]
criterion = { version = "0.4.0", default-features = false, features = ["cargo_bench_support"] }
quickcheck = { version = "1.0.3", default-features = false }

[[bench]]
name = "scaling_with_validators"
Expand Down
6 changes: 5 additions & 1 deletion polkadot/erasure-coding/benches/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,8 @@ cargo bench
## `scaling_with_validators`

This benchmark evaluates the performance of constructing the chunks and the erasure root from PoV and
reconstructing the PoV from chunks. You can see the results of running this bench on 5950x below.
reconstructing the PoV from chunks (either from systematic chunks or regular chunks).
You can see the results of running this bench on 5950x below (only including recovery from regular chunks).
Interestingly, with `10_000` chunks (validators) its slower than with `50_000` for both construction
and reconstruction.
```
Expand Down Expand Up @@ -37,3 +38,6 @@ reconstruct/10000 time: [496.35 ms 505.17 ms 515.42 ms]
reconstruct/50000 time: [276.56 ms 277.53 ms 278.58 ms]
thrpt: [17.948 MiB/s 18.016 MiB/s 18.079 MiB/s]
```

Results from running on an Apple M2 Pro, systematic recovery is generally 40 times faster than
regular recovery, achieving 1 Gib/s.
37 changes: 33 additions & 4 deletions polkadot/erasure-coding/benches/scaling_with_validators.rs
Original file line number Diff line number Diff line change
Expand Up @@ -53,12 +53,16 @@ fn construct_and_reconstruct_5mb_pov(c: &mut Criterion) {
}
group.finish();

let mut group = c.benchmark_group("reconstruct");
let mut group = c.benchmark_group("reconstruct_regular");
for n_validators in N_VALIDATORS {
let all_chunks = chunks(n_validators, &pov);

let mut c: Vec<_> = all_chunks.iter().enumerate().map(|(i, c)| (&c[..], i)).collect();
let last_chunks = c.split_off((c.len() - 1) * 2 / 3);
let chunks: Vec<_> = all_chunks
.iter()
.enumerate()
.take(polkadot_erasure_coding::recovery_threshold(n_validators).unwrap())
.map(|(i, c)| (&c[..], i))
.collect();

group.throughput(Throughput::Bytes(pov.len() as u64));
group.bench_with_input(
Expand All @@ -67,7 +71,32 @@ fn construct_and_reconstruct_5mb_pov(c: &mut Criterion) {
|b, &n| {
b.iter(|| {
let _pov: Vec<u8> =
polkadot_erasure_coding::reconstruct(n, last_chunks.clone()).unwrap();
polkadot_erasure_coding::reconstruct(n, chunks.clone()).unwrap();
});
},
);
}
group.finish();

let mut group = c.benchmark_group("reconstruct_systematic");
for n_validators in N_VALIDATORS {
let all_chunks = chunks(n_validators, &pov);

let chunks = all_chunks
.iter()
.take(polkadot_erasure_coding::systematic_recovery_threshold(n_validators).unwrap())
.map(|c| &c[..])
.collect::<Vec<_>>();

group.throughput(Throughput::Bytes(pov.len() as u64));
group.bench_with_input(
BenchmarkId::from_parameter(n_validators),
&n_validators,
|b, &n| {
b.iter(|| {
let _pov: Vec<u8> =
polkadot_erasure_coding::reconstruct_from_systematic(n, chunks.clone())
.unwrap();
});
},
);
Expand Down
Loading
Loading