short-term fix for para inherent weight overestimation #5082

ordian · 2024-07-19T15:58:55Z

closes #849

Context

For the background on this and the long-term fix, see #849 (comment).

Changes

The weight files are renamed from runtime_(parachains|common).* to polkadot_runtime_(parachains|common).*. The reason is the renaming introduced in "Unify dependency aliases" #4633: the new weight command and the files it generates now include the polkadot_ prefix.
The WeightInfo for paras_inherent now includes enter_empty, which calculates the cost of processing an empty parachains inherent. This cost is subtracted dynamically when calculating the other weights (so the other benchmarked weights remain the same).
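
A rough sketch of the subtraction described in the second change follows. It is an illustration only, not the code from this PR: the `Weight` struct is a simplified stand-in for the runtime's two-dimensional weight type, `net_of_empty` is a hypothetical helper name, and all numbers are made up.

```rust
/// Simplified stand-in for a two-dimensional weight (assumption: the real
/// runtime type offers more; only what this sketch needs is modelled here).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Weight {
    ref_time: u64,
    proof_size: u64,
}

impl Weight {
    const fn from_parts(ref_time: u64, proof_size: u64) -> Self {
        Self { ref_time, proof_size }
    }

    /// Component-wise subtraction that clamps at zero instead of underflowing.
    fn saturating_sub(self, rhs: Self) -> Self {
        Self {
            ref_time: self.ref_time.saturating_sub(rhs.ref_time),
            proof_size: self.proof_size.saturating_sub(rhs.proof_size),
        }
    }
}

/// Hypothetical helper: remove the cost of an empty inherent from a
/// benchmarked weight, leaving only the part attributable to the item itself.
fn net_of_empty(benchmarked: Weight, enter_empty: Weight) -> Weight {
    benchmarked.saturating_sub(enter_empty)
}

fn main() {
    // Made-up numbers, for illustration only.
    let enter_empty = Weight::from_parts(100, 10);
    let enter_bitfields = Weight::from_parts(130, 12);
    let net = net_of_empty(enter_bitfields, enter_empty);
    println!("net per-bitfield weight: {:?}", net);
}
```

Using a saturating subtraction here is a design assumption: it keeps the result at zero if a benchmarked weight ever came out smaller than the empty-inherent weight.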

Benefits

See #849 (comment), but TL;DR is that we are not blocked on weights for scaling the number of validators and cores further.

Resolved questions:

Why are the new benchmarks for westend doing fewer db IOPS?
Is it due to the polkadot-sdk update (db IOPS diff)?
Or is the bench setup no longer valid?
https://github.com/polkadot-fellows/runtimes/blob/7723274a2c5cbb10213379271094d5180716ca7d/relay/polkadot/src/weights/runtime_parachains_paras_inherent.rs#L131-L196
Answer: see the background section of "inclusion: bench enact_candidate weight" #5270

TODOs:

Rerun benchmarks for Rococo and Westend
PRDoc

ordian · 2024-07-22T14:55:19Z

bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=westend
bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=rococo
bot clean

* master: Bump slotmap from 1.0.6 to 1.0.7 (#5096) feat: introduce pallet-parameters to Westend to parameterize inflation (#4938) Bump openssl from 0.10.64 to 0.10.66 (#5107) Bump lycheeverse/lychee-action from 1.9.1 to 1.10.0 (#5094) docs: remove the duplicate word (#5095) Prepare PVFs if node is a validator in the next session (#4791) Update parity publish (#5105)

paritytech-cicd-pr · 2024-07-23T10:37:29Z

The CI pipeline was cancelled due to the failure of one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6778934

kianenigma · 2024-07-24T11:51:59Z

What is the long term fix?

* master: (27 commits) Bridges improved tests and nits (#5128) Fix misleading comment about RewardHandler in epm config (#3095) Introduce a workflow updating the wishlist leaderboards (#5085) membership: Restructure pallet into separate files (#4536) Fix after ring-proof api change (#5126) Bump paritytech/review-bot from 2.4.0 to 2.5.0 (#5057) Bump docker/login-action from 3.0.0 to 3.3.0 (#5109) Bump docker/build-push-action from 5.1.0 to 6.5.0 (#5108) Bump peter-evans/create-pull-request from 5.0.0 to 6.1.0 (#5093) Tx Payment: drop ED requirements for tx payments with exchangeable asset (#4488) Remove `pallet-getter` usage from pallet-transaction-payment (#4970) pallet macro: do not generate try-runtime related code when frame-support doesn't have try-runtime. (#5099) fix(chain-spec): ChainSpecBuilder with object as default genesis (#4345) Migrate BEEFY BLS crypto to bls12-381 curve (#4931) Bump clap from 4.5.9 to 4.5.10 in the known_good_semver group (#5120) Use jobserver in wasm-builder to limit concurrency of spawned cargo processes (#4946) include events for voting (#4613) [subsystem-bench] Add mocks for own assignments triggering (#5042) Remove not-audited warning (#5114) hotfix: blockchain/backend: Skip genesis leaf to unblock syncing (#5103) ...

polkadot/runtime/parachains/src/paras_inherent/weights.rs

polkadot/runtime/parachains/src/paras_inherent/benchmarking.rs

polkadot/runtime/rococo/src/weights/runtime_parachains_paras_inherent.rs

sandreim · 2024-07-29T11:06:13Z

polkadot/runtime/parachains/src/paras_inherent/benchmarking.rs

+		benchmark.bitfields.clear();
+		benchmark.backed_candidates.clear();
+		benchmark.disputes.clear();


Are these really needed if we just called build() on a default BenchBuilder instance?

polkadot/runtime/parachains/src/paras_inherent/benchmarking.rs

polkadot/runtime/rococo/src/weights/runtime_parachains_paras_inherent.rs

polkadot/runtime/parachains/src/paras_inherent/weights.rs

* master: (51 commits) Remove unused feature gated code from the minimal template (#5237) make polkadot-parachain startup errors pretty (#5214) Coretime auto-renew (#4424) network/strategy: Backoff and ban overloaded peers to avoid submitting the same request multiple times (#5029) Fix frame crate usage doc (#5222) beefy: Tolerate pruned state on runtime API call (#5197) rpc: Enable ChainSpec for polkadot-parachain (#5205) Add an adapter for configuring AssetExchanger (#5130) Replace env_logger with sp_tracing (#5065) Adjust sync templates flow to use new release branch (#5182) litep2p/discovery: Publish authority records with external addresses only (#5176) Run UI tests in CI for some other crates (#5167) Remove `pallet::getter` usage from the pallet-balances (#4967) pallet-timestamp: `UnixTime::now` implementation logs error only if called at genesis (#5055) [CI] Cache try-runtime check (#5179) [Backport] version bumps and the prdocs reordering from stable2407 (#5178) [subsystem-benchmark] Update availability-distribution-regression-bench baseline after recent subsystem changes (#5180) Remove pallet::getter usage from proxy (#4963) Remove pallet::getter macro usage from pallet-election-provider-multi-phase (#4487) Review-bot@2.6.0 (#5177) ...

ordian · 2024-08-07T09:51:48Z

bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=westend
bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=rococo
bot clean

* master: Umbrella crate: exclude chain-specific crates (#5173) Bring reference_hardware.json inline with machine used for weights (#5196) Snowbridge on Westend (#5074) Run semver check even when no prdoc (#5189) Export more from sc-service (#5250) Update the wishlist leaderboard script to handle PRs (#5256)

alexggh

Looks good to me, but there is something I don't clearly understand: as far as I can tell, we are using fallback_max_validators() in a lot of places. Don't we need to adjust those to get realistic results for 1000 validators?

* master: (35 commits) Fix OurViewChange small race (#5356) Make ticket non-optional and add ensure_successful method to Consideration trait (#5359) [tests] dedup test code, add more tests, improve naming and docs (#5338) Stop running the wishlist workflow on forks (#5297) Migrate foreign assets v3::Location to v4::Location (#4129) Minor clean up (#5284) [Pools] Ensure members can always exit the pool gracefully (#4998) StorageWeightReclaim: set to node pov size if higher (#5281) [Bot] Add prdoc generation (#5331) Small nits found accidentally along the way (#5341) Create subsystem-benchmarks.yml (#5325) Bump libp2p-identity from 0.2.8 to 0.2.9 (#5232) Bump authoring duration for async backing to 2s. (#5195) Fix spelling issues (#5206) Bump the known_good_semver group across 1 directory with 3 updates (#5315) `polkadot-node-core-pvf-common`: Fix test compilation error (#5310) ci: Paused `cmd-action` commenter (#5287) Remove unnecessary mut (#5318) chain-spec: minor clarification on the genesis config patch (#5324) fix av-distribution Jaeger spans mem leak (#5321) ...

* master: Remove redundant minimal template workspace (#5330) approval-distribution: Fix handling of conclude (#5375) More logs in `is_potential_spam` from `dispute-coordinator` (#5252) Fix zombienet bridges test (#5373) Update Readme of the `polkadot` crate (#5326) allow for `u8` to be used as hold/freeze reason (#5348) Moving cargo check for runtimes to GHA (#5340) Update links in the documentation (#5175) fix visibility for `pallet_nfts` types used as call arguments (#3634) Correct some typos in crates' descriptions (#5262) Aura: Ensure we are building on each relay chain fork (#5352) Update Identity pallet README.md (#5183) Bump trie-db from 0.29.0 to 0.29.1 (#5231) [Coretime] Always include UnpaidExecution, not just when revenue is > 0 (#5369) [Pools] fix derivation of pool account (#4999) Upgrade accidentally downgraded deps (#5365) [Pools] Fix issues with member migration to `DelegateStake` (#4822) Unify `no_genesis` check (#5360) [CI] Fix prdoc command (#5358) Beefy: add benchmarks for `report_fork_voting()` (#5188)

tdimitrov

I think this is good for a short-term fix. @alexggh does have a point about the 1000 validators, but it can be a follow-up or part of the proper fix.

ordian · 2024-08-22T22:30:05Z

bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=westend
bot bench polkadot-pallet --pallet=polkadot_runtime_parachains::paras_inherent --runtime=rococo
bot clean

command-bot · 2024-08-22T22:30:30Z

@ordian Command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=rococo --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent has finished. Result:

HttpError: Not Found
HttpError: Not Found
    at /app/node_modules/@octokit/request/dist-node/index.js:86:21
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async sendRequestWithRetries (/app/node_modules/octokit-auth-probot/node_modules/@octokit/auth-app/dist-node/index.js:466:12)
    at async Job.doExecute (/app/node_modules/bottleneck/light.js:405:18)

command-bot · 2024-08-23T03:39:09Z

@ordian Command "$PIPELINE_SCRIPTS_DIR/commands/bench/bench.sh" --subcommand=pallet --runtime=westend --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent has finished. Result: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7109801 has finished. If any artifacts were generated, you can download them from https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7109801/artifacts/download.

kianenigma

Empirically, how much is the weight consumed by paras inherent in Polkadot and Kusama RC now, vs. how much will it be when this is deployed? (and which one is closer to reality -- although this is hard to measure 🙈)

ordian · 2024-08-29T09:08:27Z

> Empirically, how much is the weight consumed by paras inherent in Polkadot and Kusama RC now, vs. how much will it be when this is deployed? (and which one is closer to reality -- although this is hard to measure 🙈)

I've provided some back-of-the-envelope calculations in this comment: #849 (comment), but TL;DR is that the dominant cost comes from enter_bitfields (which we multiply by the number of para validators, which is 500 on Kusama at the moment and will grow to 1k). This number was already reduced by 55% in polkadot-fellows/runtimes#433, and the reasoning is explained in the background section of #5270. This PR will further reduce its cost by more than half (the benches need to be rerun on Kusama, but you can see the cost for westend for comparison). Please note that the benchmarked cost of enter_bitfields by itself will remain the same, but we dynamically subtract the cost of enter_empty from it. These and other annoyances will be addressed with the long-term refactoring mentioned in the first comment.
However, #5270 will bring some (a lot) of the costs back by properly accounting for e.g. processed messages in enacted candidates.
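
To see why a fixed overhead inside a per-bitfield weight matters at this scale, here is a back-of-the-envelope sketch. The function names and every number are hypothetical, and adding the empty-inherent cost back exactly once is an assumption of this sketch, not something taken from the PR.

```rust
/// Total charged when the benchmarked per-bitfield weight still contains the
/// cost of an empty inherent: that fixed part is multiplied n times.
fn total_without_subtraction(enter_bitfields: u64, n_validators: u64) -> u64 {
    enter_bitfields.saturating_mul(n_validators)
}

/// Total charged when the empty-inherent cost is subtracted from each
/// per-bitfield weight and (assumption) accounted for exactly once.
fn total_with_subtraction(enter_bitfields: u64, enter_empty: u64, n_validators: u64) -> u64 {
    let per_bitfield = enter_bitfields.saturating_sub(enter_empty);
    enter_empty.saturating_add(per_bitfield.saturating_mul(n_validators))
}

fn main() {
    // Made-up weights in arbitrary units, for illustration only.
    let (enter_bitfields, enter_empty) = (130u64, 100u64);
    for n in [500u64, 1_000] {
        println!(
            "n = {n}: without subtraction = {}, with subtraction = {}",
            total_without_subtraction(enter_bitfields, n),
            total_with_subtraction(enter_bitfields, enter_empty, n),
        );
    }
}
```

With these made-up inputs the overestimate grows linearly with the validator count, which is the effect the comment describes.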

* master: (39 commits) short-term fix for para inherent weight overestimation (#5082) CI: Add backporting bot (#4795) Fix benchmark failures when using `insecure_zero_ed` flag (#5354) Command bot GHA v2 - /cmd <cmd> (#5457) Remove pallet::getter usage from treasury (#4962) Bump blake2b_simd from 1.0.1 to 1.0.2 (#5404) Bump rustversion from 1.0.14 to 1.0.17 (#5405) Bridge zombienet tests: remove old command (#5434) polkadot-parachain: Add omni-node variant with u64 block number (#5269) Refactor verbose test (#5506) Use umbrella crate for minimal template (#5155) IBP Coretime Polkadot bootnodes (#5499) rpc server: listen to `ipv6 socket` if available and `--experimental-rpc-endpoint` CLI option (#4792) Update approval-voting-regression-bench (#5504) change try-runtime rpc domains (#5443) polkadot-parachain-bin: Remove contracts parachain (#5471) Add feature to allow Aura collator to use full PoV size (#5393) Adding stkd bootnodes (#5470) Make `PendingConfigs` storage item public (#5467) frame-omni-bencher maintenance (#5466) ...

On top of #5082.

## Background

Previously, before #3479, we would [include](https://github.com/paritytech/polkadot-sdk/blame/75074952a859f90213ea25257b71ec2189dbcfc1/polkadot/runtime/parachains/src/builder.rs#L508C12-L508C44) the cost of enacting the candidate in the cost of processing a single bitfield. [Now](https://github.com/paritytech/polkadot-sdk/blame/dd48544a573dd02da2082cec1dda7ce735e2e719/polkadot/runtime/parachains/src/builder.rs#L529) it is different, although the benchmarks seem to be out of date. Including the cost of enacting a candidate in the cost of processing a single bitfield was incorrect, since we multiply that by the number of bitfields we have. Instead, we should separately calculate the cost of processing a single bitfield without enactment, and multiply the cost of enactment by the actual number of processed candidates (which is limited by the number of cores, not validators).

## Bench

Previously, the weight of `enact_candidate` was calculated manually (without a benchmark) and then neglected: https://github.com/paritytech/polkadot-sdk/blob/dd48544a573dd02da2082cec1dda7ce735e2e719/polkadot/runtime/parachains/src/inclusion/mod.rs#L584

In this PR, we have a benchmark for it, and it's based on the number of ump and sent hrmp messages as well as whether the candidate has a runtime upgrade (new_validation_code). The differences from the previous attempt paritytech/polkadot#6929 are that

* we don't include the cost of enactment in the cost of processing a backed candidate. The reason is that enactment does not happen in the same block as backing (typically it happens in the next one), since we process bitfields before backing votes.
* we don't take into account the size of the runtime upgrade. The benchmark weight doesn't seem to depend much on it, but rather on whether there was one or not.

Similarly to the previous attempt, we don't account for dmp messages (fixed cost). Also, we don't account properly for received hrmp messages (hrmp_watermark), because the cost depends on the runtime state and can't be statically deduced in the benchmark (unless we pass the information about channels as benchmark u32 arguments).

The total weight cost of processing a parainherent now includes the cost of enactment of each candidate, but we don't do filtering based on that (because we enact after processing bitfields and making other changes to the storage).

## Numbers

```
Reads = 7 + (0 * u) + (3 * h) + (8 * c)
Writes = 10 + (1 * u) + (3 * h) + (7 * c)
```

In addition, there is a fixed cost of a few ms (!) per candidate. This might result in a full block slightly overflowing its weight with 200 enacted candidates, which in turn could prevent non-mandatory transactions from being included in a block.

Given our modest limits on max ump and hrmp messages:

```
maxUpwardMessageNumPerCandidate: 16
hrmpMaxMessageNumPerCandidate: 10
```

and the fact that runtime upgrades can't happen very frequently (`validation_upgrade_cooldown`), we might only go over the limits in case of many disputes.

TODOs:

- [x] Fix the overweight test
- [x] Generate the weights for Westend and Rococo
- [x] PRDoc

---------

Co-authored-by: command-bot <>
Co-authored-by: Alin Dima <alin@parity.io>
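
The read and write formulas from the Numbers section of the commit message above can be evaluated with a small sketch. Reading `u` as the number of ump messages, `h` as the number of sent hrmp messages and `c` as 0 or 1 for a runtime upgrade is an interpretation of that commit message. `DbOps` and `enact_db_ops` are hypothetical names, not part of the runtime.

```rust
/// Database operation counts for enacting a single candidate.
#[derive(Debug, PartialEq, Eq)]
struct DbOps {
    reads: u64,
    writes: u64,
}

/// Evaluate the formulas quoted in the commit message:
///   Reads  = 7  + (0 * u) + (3 * h) + (8 * c)
///   Writes = 10 + (1 * u) + (3 * h) + (7 * c)
/// Assumed meaning: u = ump messages, h = sent hrmp messages,
/// c = 1 if the candidate carries a runtime upgrade, else 0.
fn enact_db_ops(u: u64, h: u64, c: u64) -> DbOps {
    DbOps {
        // The coefficient of `u` is zero for reads, so it is omitted here.
        reads: 7 + 3 * h + 8 * c,
        writes: 10 + u + 3 * h + 7 * c,
    }
}

fn main() {
    // Limits quoted in the commit message: at most 16 ump and 10 hrmp
    // messages per candidate; assume the candidate also upgrades its runtime.
    let worst = enact_db_ops(16, 10, 1);
    println!("worst case per candidate: {:?}", worst); // reads = 45, writes = 63

    // A candidate with no messages and no upgrade pays only the base cost.
    let base = enact_db_ops(0, 0, 0);
    println!("base case per candidate: {:?}", base); // reads = 7, writes = 10
}
```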

temp fix for para inherent weight overestimation

b0fbeec

ordian added T8-polkadot This PR/Issue is related to/affects the Polkadot network. T12-benchmarks This PR/Issue is related to benchmarking and weights. labels Jul 19, 2024

command-bot bot deleted a comment from github-actions bot Jul 22, 2024

command-bot and others added 6 commits July 22, 2024 14:58

Merge branch 'master' of https://github.com/paritytech/polkadot-sdk i…

940c02e

…nto HEAD

".git/.scripts/commands/bench/bench.sh" --subcommand=pallet --runtime…

67c295d

…=rococo --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent

".git/.scripts/commands/bench/bench.sh" --subcommand=pallet --runtime…

08dc2cc

…=westend --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent

fixup

f920d3b

another fixup

42d0302

kianenigma added the T2-pallets This PR/Issue is related to a particular pallet. label Jul 24, 2024

ordian mentioned this pull request Jul 24, 2024

paraInclusion is massively overestimating its weight #849

Closed

ordian added 2 commits July 25, 2024 10:23

fixup

f4d2eee

alexggh reviewed Jul 26, 2024

polkadot/runtime/parachains/src/paras_inherent/weights.rs

polkadot/runtime/parachains/src/paras_inherent/benchmarking.rs

ordian added 4 commits July 26, 2024 14:07

fix a typo

be07fcb

polkadot bench: make bitfields full in builder

9fe7a6b

rename some weight files

ce6d0ac

some more weight renames

64b367c

sandreim reviewed Jul 29, 2024

ordian added 3 commits August 6, 2024 15:33

fmt

c8b2c30

rename rococo weights too

e9258b7

ordian mentioned this pull request Aug 7, 2024

inclusion: bench enact_candidate weight #5270

Merged

3 tasks

command-bot bot deleted a comment from github-actions bot Aug 7, 2024

Merge branch 'master' of https://github.com/paritytech/polkadot-sdk i…

6133e83

…nto HEAD

ordian added 2 commits August 7, 2024 17:54

restore the comment

c4e0989

ordian marked this pull request as ready for review August 7, 2024 16:15

alexggh reviewed Aug 8, 2024

sandreim mentioned this pull request Aug 12, 2024

Extrinsic accepted by the node but not included into the block #5313

Open

ordian added 2 commits August 14, 2024 16:57

tdimitrov approved these changes Aug 20, 2024

sandreim approved these changes Aug 22, 2024

ordian and others added 5 commits August 22, 2024 16:37

Merge branch 'master' into ao-fix-parainclusion-weight-overestimation

90e4eaf

adjust some of the runtime builder params

b5873d4

adjust backing bench

05d90cb

rename validaty to backing votes

bb75fb5

fix a typo

73d3e6e

command-bot added 2 commits August 22, 2024 22:32

Merge branch 'master' of https://github.com/paritytech/polkadot-sdk i…

1ca7b53

…nto HEAD

".git/.scripts/commands/bench/bench.sh" --subcommand=pallet --runtime…

18c3e03

…=westend --target_dir=polkadot --pallet=polkadot_runtime_parachains::paras_inherent

Merge branch 'master' into ao-fix-parainclusion-weight-overestimation

46c7965

kianenigma approved these changes Aug 23, 2024

AndreiEres approved these changes Aug 28, 2024

ordian added this pull request to the merge queue Aug 29, 2024

Merged via the queue into master with commit cc7ebe0 Aug 29, 2024
189 of 191 checks passed

ordian deleted the ao-fix-parainclusion-weight-overestimation branch August 29, 2024 09:16

ggwpez mentioned this pull request Sep 9, 2024

Parent issue for stable2409 LTS release #5583

Open
