Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare Core runtime API for MBMs #13

Merged
merged 19 commits into from
Mar 17, 2024
127 changes: 127 additions & 0 deletions text/0013-prepare-blockbuilder-and-core-runtime-apis-for-mbms.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,127 @@
# RFC-0013: Prepare `BlockBuilder` and `Core` runtime API for MBMs

| | |
| --------------- | --------------------------------------------------------------------------- |
| **Start Date** | July 24, 2023 |
| **Description** | Prepare the `BlockBuilder` and `Core` Runtime APIs for Multi-Block-Migrations.|
| **Authors** | Oliver Tale-Yazdi |

## Summary

Introduces breaking changes to the `BlockBuilder` and `Core` runtime APIs and bumps them to version
5 and 7 respectively. A new function `after_inherents` is added to the `BlockBuilder` runtime API
and the `initialize_block` function of `Core` is changed to return an enum to indicate what
extrinsics can be applied.

## Motivation

The original motivation is Multi-Block-Migrations. They require the runtime to be able to tell the
block builder that it should not attempt to include transactions in the current block. Currently,
there is no communication possible between runtime and block builder that could achieve this.
Further, it is necessary to execute runtime logic right after inherent application but still before
transaction inclusion.

This RFC proposes a way of communication between the runtime and the block builder by changing the
`initialize_block` function to return a `ExtrinsicInclusionMode` enum. Additionally, an
`after_inherents` function is introduced that runs after inherent but before transaction
ggwpez marked this conversation as resolved.
Show resolved Hide resolved
application.

## Stakeholders

- Substrate Maintainers: They will have to implement this upstream with all the testing, audit and
maintenance burden.
- Polkadot Runtime developers: They will have to adapt to this breaking change.
- Polkadot Parachain Teams: They also have to adapt to the breaking changes but then eventually have
multi-block migrations available.

## Explanation

The only relevant change is on the node side in the block authoring logic. Any further preparation
for MBMs can happen entirely on the runtime side with the provided primitives.

All block authors MUST respect the `ExtrinsicInclusionMode` that is returned by `initialize_block`.
They MUST always invoke `after_inherents` directly after inherent application. The runtime MAY
Copy link
Contributor

@rphmeier rphmeier Jul 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what parameters does after_inherents take? Is there any way to limit the amount of weight this can spend?

The use-case I'm thinking of is consensus algorithms where the amount of time which can be spent in authoring is variable, e.g. asynchronous backing scenarios where some blocks might be authored in a full 2 seconds and others only in 1 second, due to variable time constraints following from the current length of the backlog.

Currently, block builders have the ability to gauge how heavy transactions are, and to stop authoring after submitting enough transactions.

Block builders should be able to limit the amount of time spent in after_inherents, although probably not arbitrarily.

Perhaps the ExtrinsicInclusionMode should return a minimum amount of weight that must be allotted to the rest of the block, so block builders can decide whether to simply give up authoring or continue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To some extent I'm wondering if it makes sense to flip block-building inside out - in theory we could have a single fn build_block(InherentData, some_total_weight_or_time_limit) and supply new transactions through special host functions instead, so the runtime actually decides whether to pull transactions from the pool, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weight is a runtime concept. How much weight you want to spent is also something that the runtime should decide and not the node. All information that you want to pass between initialize and after inherents, can be stored temporarily in the state and doesn't need to be passed to the function.

Regarding driving the block production from inside the runtime. I think there exist an issue or we at least discussed it at some point. The main problem being a panic in a transaction. We can not unwind the panic. We would may need to spawn a new was instance to ensure block production isn't affected by a panic.

Copy link
Contributor

@rphmeier rphmeier Jul 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much weight you want to spent is also something that the runtime should decide and not the node.

I disagree - there's no good way for the runtime to know how much time is available for authoring in the general case.

As far as I'm aware, MBMs are oriented around doing large migrations in small chunks, especially where execution time is constrained. Having some hint for after_inherents about how much time is available would be useful for that. Though it may need to be an inherent itself

reject a block that violates either of those requirements.

Enum `ExtrinsicInclusionMode` has two variants:
- `AllExtrinsics`: All extrinsics can be applied. It is the default behaviour prior to this RFC.
- `OnlyInherents`: Only inherents SHALL be applied by the block author. This differs from the
current behaviour by omitting transactions.

It is RECOMMENDED that block authors keep transactions in the local transaction pool (if applicable)
for as long as `initialize_block` returns `OnlyInherents`.
Backwards compatibility with the current runtime API SHOULD be implemented on the node side to not
mandate a lockstep update.

## Drawbacks

[…] drawbacks relating to performance, ergonomics, user experience, security, or privacy: None

The only drawback is that the block execution logic becomes more complicated. There is more room for
error.
Downstream developers will also need to adapt their code to this breaking change.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a good definition of breaking changes? This seems to be opt-in and doesn't break node-side code unless the user's runtime is updated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a 5 line copy&paste diff for the definition of the runtime APIs. It is mentioned in the Integration section here paritytech/polkadot-sdk#1781
Otherwise there should not be anything needed.


## Testing, Security, and Privacy

Compliance of a block author can be tested by adding specific code to the `after_inherents` hook and
checking that it always executes. The new logic of `initialize_block` can be tested by checking that
the block-builder will skip transactions and optional hooks when `Minimal` is returned.

Security: Implementations need to be well-audited before merging.

Privacy: n/a

## Performance, Ergonomics, and Compatibility

### Performance

The performance overhead is minimal in the sense that no clutter was added after fulfilling the
requirements. A slight performance slow-down is expected from now additionally invoking
`after_inherents` once per block.

### Ergonomics

The new interface allows for more extensible runtime logic. In the future, this will be utilized for
multi-block-migrations which should be a huge ergonomic advantage for parachain developers.

### Compatibility

Backwards compatibility can only be considered on the node side since the runtime cannot be
backwards compatible in any way. The advice here is OPTIONAL and outside of the RFC. To not degrade
user experience, it is recommended to ensure that an updated node can still import historic blocks.

## Prior Art and References

The RFC is currently being implemented in
[substrate#14414](https://github.com/paritytech/substrate/pull/14414). Related issues and merge
requests:
- [Simple multi block migrations](https://github.com/paritytech/substrate/pull/14275)
- [Execute a hook after inherent but before
transactions](https://github.com/paritytech/substrate/issues/9210)
- [There is no module hook after inherents and before
transactions](https://github.com/paritytech/substrate/issues/5757)


## Unresolved Questions

~~Please suggest a better name for `BlockExecutiveMode`. We already tried: `RuntimeExecutiveMode`,
`ExtrinsicInclusionMode`. The names of the modes `Normal` and `Minimal` were also called
`AllExtrinsics` and `OnlyInherents`, so if you have naming preferences; please post them.~~
=> Resolved by using `ExtrinsicInclusionMode`.

Is `post_inherents` more consistent instead of `after_inherents`? Then we should change it.

## Future Directions and Related Material

An alternative approach to this is outlined
[here](https://github.com/paritytech/substrate/pull/14279#discussion_r1226289311) by using an
ordering of extrinsics. In this system, all inherents would have negative priority and transactions
positive priority. By then enforcing an order on them, there would be no hard differentiation
between inherent and transaction for the block author anymore. That approach aims more at unifying
the interplay between inherents and transactions, since the problem of communicating between runtime
and block author on whether transactions should be included would not be solved by it. It also needs
to invoke the `after_inherents` hook.

I think it can therefore be done as a future refactor to improve the code clarity and simplify the
runtime logic. This RFC rather tries to prepare the runtime and block authors for a simple solution
to the Multi-block migrations problem.