Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dry-run multi-block migrations #17

Open
liamaharon opened this issue Aug 14, 2023 · 13 comments
Open

Dry-run multi-block migrations #17

liamaharon opened this issue Aug 14, 2023 · 13 comments
Assignees
Labels
enhancement New feature or request

Comments

@liamaharon
Copy link
Contributor

liamaharon commented Aug 14, 2023

Original issue paritytech/substrate#14291


          @kianenigma am I right to assume we'll want to make it easy to test these with try-runtime-cli? Ideally should be as easy as `on-runtime-upgrade`, maybe make it auto-mine blocks until all multi-block migrations are complete & provide a hook that runs afterwards?

Originally posted by @liamaharon in paritytech/substrate#14275 (comment)

Kian response:

Good to think about it early, and yes I think it is possible with little effort. I believe follow-chain will already do everything needed to test an MBM?

@liamaharon
Copy link
Contributor Author

The inherents issue is fixed, so this is unblocked :)

@liamaharon
Copy link
Contributor Author

liamaharon commented Apr 16, 2024

Broadly, there seem to be two approaches we could take for adding try-runtime testing for MBMs, I'd like to allow for some discussion before choosing one to start work on.

1. Run MBMs in existing try_on_runtime_upgrade logic until complete

This approach would integrate MBMs into the existing try_on_runtime_upgrade logic, so they're tested as part of the existing migration testing logic flow.

Simply, after dry-running the migrations, step would be called until all MBMs are completed.

Pros:

  • Simple and fast to implement
  • Non-breaking change to try-runtime-cli

Cons:

  • Less realistic as MBMs would be progressing without inherents coming in between.

2. Mine new blocks after try_on_runtime_upgrade has executed

In this approach, we'd continue to mine empty blocks after the runtime upgrade, checking the MBM status (maybe needs a new runtime api?) each block to check the state of MBMs.

Pros:

  • More realistic, since inherents will be applied and the block number will progress between each MBM step as it would when run for real

Cons:

  • More complex implementation
  • Requires new runtime api to check MBM status (I think?)
  • Breaking change to try-runtime-cli, and try-runtime-cli will become more complex to use to dry-run migrations because the user will need to specify their chain inherents.

My thoughts

I'm on the fence, but slightly leaning towards option 1.

Although option 2 is more realistic, I can't foresee why MBMs would be built in such a way that inherents should impact their progress. That sounds like it would be an anti-pattern anyway, that would be fine to not allow?

If option 2 did not come with the drawbacks of being significantly more complex to implement, or make try-runtime-cli harder to use for the end-user, I would suggest it. But given the drawbacks, my gut feeling is it's not worth the additional time or complexity to implement and we should proceed with option 1.

@ggwpez
Copy link
Member

ggwpez commented Apr 17, 2024

Yea i think initially we can do option 1, to get something going quick since the first MBMs are being written.

Although option 2 is more realistic, I can't foresee why MBMs would be built in such a way that inherents should impact their progress. That sounds like it would be an anti-pattern anyway, that would be fine to not allow?

"should not" and "do not" is exactly why we need to test it. Streaming some data in from an inherent to a MBM would work i guess - but very non-standard to do.
Would it work to add a --mbm-test-strategy flag or something that can toggle between option 1 and (possibly future) option 2? Not sure how difficult it would be to write it in a general way to have this possibility.

@kianenigma
Copy link
Collaborator

kianenigma commented Apr 17, 2024

My main reservation towards option 1 is that it might miss something like checking the weight of each step. Can we strictly ensure that a dry-run-ed MBM with approach 1 will detect if it is overweight?

If these two are the only options, I would also lean towards 1. I agree that getting something out the door to test MBMs is important and trumps everything else. But, even though I would support a more sophisticated approach, I don't like option 2 per-se, as it is not the right abstraction.

If try-runtime provides a good system to expresses "please dry-run 5 empty blocks for me", which was always part of the plan, testing MBMs is more or less trivial. I think this is what we should aim for, and then testing all types of migrations will be a specialized version of this.

In other words, if try-runtime can create empty blocks for a given, which I believe with the correct inherent extension it can, even the existing try-runtime-upgrade would be rendered deprecated, as it can be tested more realistically by just creating the next empty block.

Even now, I believe the main advantage of try_runtime_upgrade being a custom runtime-api is such that it can bypass block level details such as inherents and just execute the migration code path.

@liamaharon
Copy link
Contributor Author

It seems there is consensus that we should pursue option 2, even if it will take longer.

My first question about implementing it is what is the best way to get the MBM migration status from try-runtime-cli while mining empty blocks?

Initially I was thinking to directly query the event storage, but I don't think I'd be able to decode it because the RuntimeEvent type varies between every runtime, and we don't know what it is.

We also I don't think can query the pallet storage directly, because the key it's under varies across runtimes.

We could introduce a new runtime api for querying MBM status, but that's far less than ideal because it creates non-trivial work for runtime devs to implement.

The approach that seems best would be to modify Executive's try_execute_block methods return value to include information about MBM status. Then, looks like try-runtime-cli can get MBM information from fn it's calling already.

@bkchr @ggwpez @kianenigma let me know if this sounds sensible.

@kianenigma
Copy link
Collaborator

My first question about implementing it is what is the best way to get the MBM migration status from try-runtime-cli while mining empty blocks?

I think this new return type of initailize_block is exactly what is meant for this:

https://paritytech.github.io/polkadot-sdk/master/sp_runtime/enum.ExtrinsicInclusionMode.html

I think if you update this code to respect this new type, you would already have MBM testing done:

async fn produce_next_block<Block: BlockT, HostFns: HostFunctions>(
externalities: &mut TestExternalities<HashingFor<Block>>,
executor: &WasmExecutor<HostFns>,
parent_height: NumberFor<Block>,
parent_hash: Block::Hash,
inherent_provider: &dyn InherentProvider<Err = String>,
previous_block_building_info: Option<(InherentData, Digest)>,
) -> Result<(Block, Option<(InherentData, Digest)>)> {
let (inherent_data_provider, pre_digest) =
inherent_provider.get_inherent_providers_and_pre_digest(previous_block_building_info)?;
let inherent_data = inherent_data_provider
.create_inherent_data()
.await
.map_err(|s| sc_cli::Error::Input(s.to_string()))?;
let digest = Digest { logs: pre_digest };
let header = Block::Header::new(
parent_height + One::one(),
Default::default(),
Default::default(),
parent_hash,
digest.clone(),
);
call::<Block, _>(
externalities,
executor,
"Core_initialize_block",
&header.encode(),
)
.await?;
let extrinsics = dry_call::<Vec<Block::Extrinsic>, Block, _>(
externalities,
executor,
"BlockBuilder_inherent_extrinsics",
&inherent_data.encode(),
)?;
for xt in &extrinsics {
call::<Block, _>(
externalities,
executor,
"BlockBuilder_apply_extrinsic",
&xt.encode(),
)
.await?;
}
let header = dry_call::<Block::Header, Block, _>(
externalities,
executor,
"BlockBuilder_finalize_block",
&[0u8; 0],
)?;
call::<Block, _>(
externalities,
executor,
"BlockBuilder_finalize_block",
&[0u8; 0],
)
.await?;
Ok((
Block::new(header, extrinsics),
Some((inherent_data, digest)),
))
}

@ggwpez
Copy link
Member

ggwpez commented Apr 26, 2024

I think just for PoC, we can also use chopsticks, since it exposes a block-mining RPC and the possibility to decode events.

I think this new return type of initailize_block is exactly what is meant for this:

Kind of, yes. Although there is a hail-mary last resort option, where a failed migration can decide what to do. In that case, one option is ForceUnstuck, and then the chain resumes. So just looking at that could be misinterpreted as success.
It is still better to either 1) parse events or 2) create a Runtime API.

@liamaharon
Copy link
Contributor Author

I think if you update this code to respect this new type, you would already have MBM testing done:

Unfortunately as @ggwpez said just checking the ExtrinsicInclusionMode won't give us any solid MBM status.

I think just for PoC, we can also use chopsticks, since it exposes a block-mining RPC and the possibility to decode events.

I am worried devex will suck if we start requiring two completely different tools to test a runtime upgrade. Also, I'm not sure Chopsticks handles pre/post hooks.

For now proceed with modifying the try_execute_block ret value unless u have some other ideas.

@liamaharon liamaharon self-assigned this Apr 26, 2024
@bkchr
Copy link
Member

bkchr commented Apr 28, 2024

Unfortunately as @ggwpez said just checking the ExtrinsicInclusionMode won't give us any solid MBM status.

What solid MBM status? What do you need to know besides if they are still running?

@liamaharon
Copy link
Contributor Author

Unfortunately as @ggwpez said just checking the ExtrinsicInclusionMode won't give us any solid MBM status.

What solid MBM status? What do you need to know besides if they are still running?

Need to also know whether or not it succeeded.

@bkchr
Copy link
Member

bkchr commented Apr 29, 2024

Hmm? Run post_runtime_upgrade after all steps are finished.

@liamaharon
Copy link
Contributor Author

Hmm? Run post_runtime_upgrade after all steps are finished.

Do you mean some try-runtime feature-gated code that will panic if the MBMs fail?

@bkchr
Copy link
Member

bkchr commented Apr 30, 2024

No I mean as we have for OnRuntimeUpgrade the pre and post runtime upgrade functions. Here we could have something similar, aka a function that we call after the entire MBM is finished.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants