Prepare PVFs if node is a validator in the next session #4791

AndreiEres · 2024-06-13T16:22:11Z

On every active leaf, the candidate-validation subsystem checks whether the node is an authority in the next session.
If it is, it fetches the backed candidates and prepares any PVFs it does not know yet.
We limit the number of PVFs prepared per block so as not to overload the subsystem.
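
The flow described above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `CodeHash`, `MAX_PVFS_PER_BLOCK` and `select_pvfs_to_prepare` are hypothetical names, and the limit of one PVF per block is taken from the discussion further down in this thread.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a validation code hash.
type CodeHash = u64;

/// Assumed cap on PVFs prepared per block (the thread mentions one per block).
const MAX_PVFS_PER_BLOCK: usize = 1;

/// Pick the code hashes to prepare for one active leaf.
///
/// Returns an empty list unless the node is an authority in the next session
/// and is not already an authority in the current one.
fn select_pvfs_to_prepare(
    is_current_authority: bool,
    is_next_authority: bool,
    backed_code_hashes: &[CodeHash],
    already_prepared: &mut HashSet<CodeHash>,
) -> Vec<CodeHash> {
    if is_current_authority || !is_next_authority {
        return Vec::new();
    }

    let mut selected = Vec::new();
    for &hash in backed_code_hashes {
        if selected.len() >= MAX_PVFS_PER_BLOCK {
            break;
        }
        // `insert` returns false if the hash was already recorded.
        if already_prepared.insert(hash) {
            selected.push(hash);
        }
    }
    selected
}
```

With this shape, repeated leaves that back the same candidates work through the unknown PVFs one block at a time and then become no-ops.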

paritytech-cicd-pr · 2024-06-17T15:17:26Z

The CI pipeline was cancelled due to the failure of one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/6483094

alexggh

First pass overall looks good; I left you some comments.

On top of that, we also need tests that check the following invariants:

- The logic doesn't get triggered if we are in the active set.
- When we enter the active set, no compilation is triggered, because we have already compiled the PVFs.

alexggh · 2024-06-20T13:33:11Z

polkadot/node/core/candidate-validation/src/lib.rs

+							let new_session_index = new_session_index(&mut sender, session_index, leaf.hash).await;
+							if new_session_index.is_some() {
+								session_index = new_session_index;
+								already_prepared_code_hashes.clear();


Why clear here? Once we have prepared a PVF, it should be there until restart.

I thought the same, but chances are that by this point, the artifact might have been pruned because it had been unused for more than 24 hours. So clearing here is more foolproof than not clearing, but some better solution may also be considered.

I assumed we should start from scratch on every new session. It's just a mechanism to avoid overloading the PVF host with unnecessary checks.
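
To make the trade-off in this thread concrete, here is a small sketch of the "start from scratch on every new session" behaviour. `PrepareState` and `on_leaf` are hypothetical names chosen for illustration; only `session_index` and `already_prepared_code_hashes` come from the diff above, and the types are simplified.

```rust
use std::collections::HashSet;

type SessionIndex = u32;
/// Hypothetical stand-in for a validation code hash.
type CodeHash = u64;

/// Hypothetical holder for the state discussed in this thread.
struct PrepareState {
    session_index: Option<SessionIndex>,
    already_prepared_code_hashes: HashSet<CodeHash>,
}

impl PrepareState {
    fn new() -> Self {
        Self { session_index: None, already_prepared_code_hashes: HashSet::new() }
    }

    /// Record the session of the latest leaf; returns true if the session changed.
    fn on_leaf(&mut self, leaf_session: SessionIndex) -> bool {
        if self.session_index == Some(leaf_session) {
            return false;
        }
        self.session_index = Some(leaf_session);
        // Forget everything: an artifact recorded earlier may have been pruned
        // since, so the set is only trusted within a single session.
        self.already_prepared_code_hashes.clear();
        true
    }
}
```

The cost of this choice is that a hash may be handed to the PVF host again in a later session even though its artifact still exists.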

sandreim · 2024-06-20T14:39:39Z

polkadot/node/core/candidate-validation/src/lib.rs

+		processed_code_hashes.push(code_hash);
+	}
+
+	if let Err(err) = validation_backend.heads_up(active_pvfs).await {


Considering HOST_MESSAGE_QUEUE_SIZE is 10, can this block?

It could be a problem if we stall the main loop here while we are an active validator. Processing of any incoming validation requests will also be delayed by up to 100 (n_cores) validation code decompressions.

AFAIR the whole subsystem has been moved to the blocking pool (#3122).
Yes, it would indeed be good to offload those decompressions to a separate task on a blocking pool; we discussed that. At that point, it seemed like firing cannons at sparrows. But in the context of #5012 it makes much more sense now.
Still, that's out of the scope of this issue. Raised #5071.

Returning to the original question: we don't do any of this work while we're an active validator. Otherwise, we allow only one PVF to be prepared per block so as not to load the validator. That leaves plenty of time to prepare all PVFs during the session if the validator connected at the beginning, and does not overload it if it connected at the end.
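
The blocking concern raised in this thread can be illustrated with a bounded queue. The sketch below uses the standard library's `sync_channel` as a stand-in; the actual PVF host queue is an async channel, so this only demonstrates the general behaviour of a queue with capacity 10, not the real code path. `accepted_before_full` is a hypothetical helper.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Queue capacity mentioned in the review comment above.
const HOST_MESSAGE_QUEUE_SIZE: usize = 10;

/// Try to enqueue `n` messages while nobody consumes them, and return how
/// many were accepted before the bounded queue reported that it was full.
fn accepted_before_full(n: usize) -> usize {
    // `_rx` stays alive until the end of the function, so the channel is
    // never reported as disconnected.
    let (tx, _rx) = sync_channel::<usize>(HOST_MESSAGE_QUEUE_SIZE);
    let mut accepted = 0;
    for i in 0..n {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            // A blocking `send` would park the caller at this point until
            // the consumer catches up.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```

Sending at most one message per block, as described above, keeps the producer far below that capacity as long as the host keeps draining the queue.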

s0me0ne-unkn0wn · 2024-06-20T18:01:34Z

polkadot/node/core/candidate-validation/src/lib.rs

+							let new_session_index = new_session_index(&mut sender, session_index, leaf.hash).await;
+							if new_session_index.is_some() {
+								session_index = new_session_index;
+								already_prepared_code_hashes.clear();


I thought the same, but chances are that by this point, the artifact might have been pruned because it had been unused for more than 24 hours. So clearing here is more foolproof than not clearing, but some better solution may also be considered.

s0me0ne-unkn0wn · 2024-06-20T18:11:05Z

polkadot/node/core/candidate-validation/src/lib.rs

-			);
-			return PreCheckOutcome::Invalid
-		};
+	let executor_params = if let Ok(executor_params) =


That's a good question indeed: what set of executor parameters should be used to compile the PVFs in advance? There may already be something new in pending_config that will become effective in the next session, and all this in-advance compilation will be in vain in that case. That's a corner case, as changes to the executor parameters are extremely rare, so I do not insist it should be addressed in this very PR, but it's something that could be considered in the future. Generally, it's not the first time we've had a hard time trying to predict the next session's executor parameters: #694

It might not be so bad to precompile the PVF anyway. In the worst case, the validator will have to compile a new one in the next session. Or should we consider a smarter solution?

I'm not 100% sure I'm not missing something, but it seems like we could use pending_config from the configuration pallet to get the executor environment params. That gives us two benefits:

We're always getting params that will be effective in the future session;

We get it directly from storage, thus saving a couple of runtime API calls.

@sandreim WDYT?

Yes, we already know that in session N we are gonna become a parachain validator. So, we need to prepare the artifacts using the same execution params from session N.

I think 1 is preferable as a proper interface, rather than reading storage directly.

Alternatively, we could introduce a new runtime API that returns the execution parameters for a given future session index. This should solve future problems with getting the next session's executor params in one place.

@sandreim @s0me0ne-unkn0wn I think this would be a good job for an additional PR. What do you think about moving this conversation to an issue and starting on it after we merge this one?

After recalling how it works, I agree. That will require changing the runtime API, putting that into the staging API, then bumping its version, etc. Not an easy change. Raised #5080.
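
As a rough illustration of the idea discussed in this thread (resolving the executor parameters that will be effective in a future session), here is a minimal sketch. `ExecutorParams`, `Config` and `executor_params_for` are simplified, hypothetical stand-ins; they are not the runtime API or the configuration pallet types, and the single `max_memory_pages` field is only a placeholder.

```rust
type SessionIndex = u32;

/// Hypothetical, simplified stand-in for executor parameters.
#[derive(Clone, Debug, PartialEq)]
struct ExecutorParams {
    max_memory_pages: u32,
}

/// Hypothetical view of the configuration: the active parameters plus pending
/// changes, each tagged with the session in which it takes effect.
struct Config {
    active: ExecutorParams,
    pending: Vec<(SessionIndex, ExecutorParams)>,
}

/// Parameters expected to be effective in `target_session`: the latest pending
/// change scheduled at or before that session, otherwise the active ones.
fn executor_params_for(config: &Config, target_session: SessionIndex) -> ExecutorParams {
    config
        .pending
        .iter()
        .filter(|(session, _)| *session <= target_session)
        .max_by_key(|(session, _)| *session)
        .map(|(_, params)| params.clone())
        .unwrap_or_else(|| config.active.clone())
}
```

A node preparing artifacts for session N would call something like this with N rather than with the current session index.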

After a conversation with @sandreim:

This concern is valid not only for newly connected validators but for all of them. See "PVF: Consider re-preparing PVFs ahead of time if PendingConfigs changes ExecutorParams" #4203

It's OK to solve this issue separately.

@s0me0ne-unkn0wn what do you think?

I'm ok with implementing #4203 in a separate PR which applies to these changes as well.

It would be great to get this one into the current release, if we can still backport it there. I'd want to wait for validators to upgrade to this fix before bumping the Polkadot validator count to 400.

alexggh

Looks good to me. I would also test this with an integration test (zombienet) to make sure the changes have the desired effect.

polkadot/zombienet_tests/smoke/0005-precompile-pvf-smoke.js

sandreim · 2024-07-19T13:51:02Z

polkadot/node/core/candidate-validation/src/lib.rs

+			continue;
+		};
+
+		let pvf = match sp_maybe_compressed_blob::decompress(


I think this just keeps decompressing needlessly if we are a validator intermittently: at session 1, then at session 2 we are no longer, but we are again at session 3.

Everything that was prepared for session 1 is reset as soon as we enter session 2 (already_prepared_code_hashes is cleared). Is there a better way to check, such as querying the PVF subsystem for whether it already has the artifact? Also, I think the query needs to take into account the executor params we are preparing with, not just the code hash.

Let's handle it in a new PR.

Or maybe it won't even be needed: #5071 (comment)

I think we still need to polish it. Ticket #5071 covers the other path, where we actually do candidate validation; here we just try to compile artifacts for blocks backed on chain. Given that, I expect decompression to be fast. The only concern here is that we do it for all blocks.
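
One way to picture the suggestion in this thread (checking by code hash together with executor params, and not resetting on every session) is the sketch below. `PreparedArtifacts`, `ExecutorParamsHash` and `needs_preparation` are hypothetical names, and the sketch deliberately ignores the artifact pruning mentioned in the earlier thread, which a real solution would have to handle.

```rust
use std::collections::HashSet;

/// Hypothetical stand-ins for the two hashes that identify an artifact.
type CodeHash = u64;
type ExecutorParamsHash = u64;

/// Hypothetical record of what has already been sent for preparation.
#[derive(Default)]
struct PreparedArtifacts {
    keys: HashSet<(CodeHash, ExecutorParamsHash)>,
}

impl PreparedArtifacts {
    /// Returns true if this (code, params) pair has not been seen before,
    /// and records it so that later calls return false.
    fn needs_preparation(&mut self, code: CodeHash, params: ExecutorParamsHash) -> bool {
        self.keys.insert((code, params))
    }
}
```

Because the key includes the executor params, a params change naturally triggers re-preparation, while an intermittent validator does not decompress the same code again for unchanged params.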

sandreim

Thanks @AndreiEres. LGTM modulo the optimisations that were discussed, but you are already on them in other PRs.

Let's burn this in on our Kusama validators until the release comes out, just to get more data from real world sooner.

s0me0ne-unkn0wn

LGTM 👍

* master:
  Bump slotmap from 1.0.6 to 1.0.7 (#5096)
  feat: introduce pallet-parameters to Westend to parameterize inflation (#4938)
  Bump openssl from 0.10.64 to 0.10.66 (#5107)
  Bump lycheeverse/lychee-action from 1.9.1 to 1.10.0 (#5094)
  docs: remove the duplicate word (#5095)
  Prepare PVFs if node is a validator in the next session (#4791)
  Update parity publish (#5105)

Closes #4324
- On every active leaf candidate-validation subsystem checks if the node is the next session authority.
- If it is, it fetches backed candidates and prepares unknown PVFs.
- We limit number of PVFs per block to not overload subsystem.

Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: Jun Jiang <jasl9187@hotmail.com>

Check new session

404847f

AndreiEres force-pushed the AndreiEres/prepare-pvf branch from 1298e1b to 404847f Compare June 14, 2024 08:10

AndreiEres added 2 commits June 14, 2024 14:14

Check if our node is a validator

0698cb7

Add preparation

c21f1bf

AndreiEres added the T0-node and T8-polkadot labels Jun 17, 2024

AndreiEres added 4 commits June 17, 2024 12:56

Fix clippy

e6cb488

Fix clippy

089874b

Extract

b79be57

Optimize

0d68af6

AndreiEres added 2 commits June 17, 2024 17:24

Rename

eeeaf85

Clippy

f17099e

AndreiEres requested review from s0me0ne-unkn0wn and alexggh June 17, 2024 15:31

AndreiEres changed the title ~~[WIP] Prepare PVFs in advance~~ Prepare PVFs in advance Jun 17, 2024

AndreiEres marked this pull request as ready for review June 17, 2024 15:39

alexggh reviewed Jun 20, 2024

sandreim reviewed Jun 20, 2024

s0me0ne-unkn0wn reviewed Jun 20, 2024

AndreiEres added 10 commits June 21, 2024 10:37

Remove unnecessary clone()

0581f58

Rename

498945d

Use the prepare timeout

096c44a

Exclude present authorities from the condition

1d56fc3

Don't send a message with empty pvfs

27ef03d

Extract

9a947d9

Merge branch 'master' into AndreiEres/prepare-pvf

120162c

Rename

f55e296

Limit preparation

484de24

Rename

35a4fcf

AndreiEres added 2 commits July 5, 2024 15:21

Add logging

d68438a

Fix toml

8a230d9

alexggh approved these changes Jul 8, 2024

Add draft for a zombienet test

5d2ee26

pepoviola reviewed Jul 12, 2024

polkadot/zombienet_tests/smoke/0005-precompile-pvf-smoke.js

Add the test to ci

2c69d80

AndreiEres requested a review from a team as a code owner July 12, 2024 09:51

AndreiEres added 3 commits July 12, 2024 15:38

Update pr doc

2f39ea6

Add a zombie test for PVF preparation

a09e824

Update test

e233fe2

alvicsam approved these changes Jul 17, 2024

This was referenced Jul 18, 2024

Reduce storage deposit for parachain PVFs #5012

Open

In-advance PVF preparation should use executor params from pending configuration #5080

Closed

sandreim reviewed Jul 19, 2024

Remove todo

7a280ae

sandreim approved these changes Jul 22, 2024

s0me0ne-unkn0wn approved these changes Jul 22, 2024

Merge branch 'master' into AndreiEres/prepare-pvf

515b92f

AndreiEres added this pull request to the merge queue Jul 22, 2024

Merged via the queue into master with commit 612c1bd Jul 22, 2024
158 of 162 checks passed

AndreiEres deleted the AndreiEres/prepare-pvf branch July 22, 2024 17:04

This was referenced Jul 24, 2024

Do not process already compiled PVFs in the session before entering active set #5125

Closed

Optimise PVF precompilation before entering active set #5090

Closed

Offload PVF code decompression to a separate task on the blocking pool #5071

Closed

ggwpez added a commit that referenced this pull request Aug 9, 2024

Backport #4791 (#5247)

dda8ce3

Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: Jun Jiang <jasl9187@hotmail.com>

github-actions bot mentioned this pull request Oct 8, 2024

Update polkadot-sdk from stable2407 to stable2409 moonbeam-foundation/moonbeam#2994

Open


