Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

pallet-mmr: generate historical proofs #12324

Merged
merged 9 commits into from
Sep 30, 2022

Conversation

serban300
Copy link
Contributor

@serban300 serban300 commented Sep 21, 2022

Polkadot companion: paritytech/polkadot#6061

Description of the changes

A continuation for #11230

This PR adds a mmr_generateHistoricalBatchProof() MMR RPC method. This method can used to generate a historical MMR proof (a MMR proof for when the best known block is older than the best block of the chain).

Why it's needed

  1. BEEFY light client needs to import explicit commitment + leaf + leaf proof for blocks that are changing validator set. If the chain has progressed since such block, the state of validator-set-change-block may have already been discarded (unless you're using archive node, which we want to avoid) and the call of mmr_generateProof(use-state-at-validator-set-change-block) will obviously fail.
  2. if block B is known to the light client, then we can't simply generate some MMR-related proof using MMR state at block B+10. So e.g. to prove some storage (read: deliver message), we need to generate proof using MMR state at block B. If state is discarded, we'll need to import B+10 first => extra bridge cost.

@serban300 serban300 added A0-please_review Pull request needs code review. B3-apinoteworthy C1-low PR touches the given topic and has a low impact on builders. D2-breaksapi labels Sep 21, 2022
@serban300 serban300 self-assigned this Sep 21, 2022
Comment on lines 112 to 113
fn generate_proof(
&self,
leaf_index: LeafIndex,
at: Option<BlockHash>,
best_known_block: Option<BlockNumber>,
) -> RpcResult<LeafProof<BlockHash>>;
Copy link
Contributor

@acatangiu acatangiu Sep 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not change runtime MmrApi and simply add new mmr_generateHistoricalProof API (avoiding a few lines of code duplication is certainly not worth dealing with changing the runtime API version - I'm also not in favor of overloading existing API to also handle "historic" case, makes the API harder to use IMO). Same for RPC API.

Also, I would keep at: Option<BlockHash>, in all runtime api functions for consistency, even if it's redundant for the mmr_generateHistoricalProof API. Calling runtime APIs at different block states is part of the runtime API "contract", I don't think sp_api::impl_runtime_apis! will even work without that param.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Thanks ! I'll add mmr_generateHistoricalProof then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not change runtime MmrApi and simply add new mmr_generateHistoricalProof API (avoiding a few lines of code duplication is certainly not worth dealing with changing the runtime API version - I'm also not in favor of overloading existing API to also handle "historic" case, makes the API harder to use IMO). Same for RPC API.

I fully agree with @acatangiu on this given the tradeoff cost applicable at the time.
Incorporating yesterday's decision to break this API for the leaf->block transition in #12339 though, I want to revisit this affected design tradeoff while the API will have to change anyway: preserving API version seemed the primary motivation for the code duplication, and that's no longer weighing against this.

Now the tradeoff reduces to
a. keep equivalent code in multiple locations in sync, vs
b. pass an extra None to mmr_generateProof when proving for the best_block

so the balance is different.

A unified API via b. does indeed make the API more verbose when dealing with the best_block only, but given that a light client's best_known_block will typically lag the reference chain's best_block, I expect it's less prone to errors by API consumers if they need to be deliberate here when actually wanting to track the best_block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I still prefer b. And since we're changing the API anyway I was also wondering if we should keep generate_proof(), verify_proof(), and verify_proof_stateless() since all these can be done via batching methods (e.g. generate_batch_proof(), etc). I think it would be best to discuss all these in a separate issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #12391 for continuing this discussion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, let’s discuss it separately

@serban300
Copy link
Contributor Author

I pushed an updated, but I didn't get to test it yet. Leaving it as draft.

@serban300
Copy link
Contributor Author

@acatangiu @svyatonik

Sorry, I have a question before moving on with the PR. I just realized that the interface that I started with for generate_historical_batch_proof() probably doesn't make much sense and I was thinking what would be the right one. Right now generate_historical_batch_proof() takes 2 arguments:

leaf_indices: Vec<LeafIndex>,
best_known_block: <T as frame_system::Config>::BlockNumber,

I think it would make sense to either provide block numbers and best known block:

block_numbers: Vec<<T as frame_system::Config>::BlockNumber>,
best_known_block: <T as frame_system::Config>::BlockNumber,

or leaf indices and leaves count:

leaf_indices: Vec<LeafIndex>,
leaves_count: LeafIndex,

At first sight to me it would seem more natural to provide block_numbers and best_known_block. I guess if I were to write a light client I would like to deal only with block numbers, which I probably need to manage anyway and avoid managing leaf indexes and the conversion from block number => leaf index. But since I'm not very familiar with light clients and since generate_batch_proof receives leaf indices I would like to double check this with you. What do you think ?

@svyatonik
Copy link
Contributor

Adrian is the best person to answer here, but imo let's make it similar to other APIs perhaps - i.e. deal with leafs?

@acatangiu
Copy link
Contributor

acatangiu commented Sep 23, 2022

If I were to design the pallet's API now, I would go for using only block numbers and having the pallet internally map those to leaves, but since we're past that point I propose we be consistent with the existing API and stick to LeafIndexes.

I suggest using

leaf_indices: Vec<LeafIndex>,
leaves_count: LeafIndex,

in this PR for consistency, and we can explore switching all APIs to block numbers in a dedicated issue/PR (opened #12339).

Since pallet-mmr is only deployed on Rococo as of yet, I believe we can afford to make such a disruptive change if we decide to.

@serban300
Copy link
Contributor Author

Ok, thanks for the input ! Then I'll use leaves for the moment and I'll follow the discussion on #12339 to see if we can switch to block numbers in the future.

Signed-off-by: Serban Iorga <serban@parity.io>
@serban300 serban300 changed the title [RFC] BEEFY: generate historical proofs BEEFY: generate historical proofs Sep 27, 2022
@serban300
Copy link
Contributor Author

I addressed all the comments so far. This PR is ready for review. Sorry I rebased by mistake and then I had to force push.

@serban300 serban300 marked this pull request as ready for review September 27, 2022 13:09
@serban300 serban300 added B0-silent Changes should not be mentioned in any release notes D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit and removed D2-breaksapi labels Sep 27, 2022
Copy link
Contributor

@acatangiu acatangiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! 👍

primitives/merkle-mountain-range/src/lib.rs Outdated Show resolved Hide resolved
frame/merkle-mountain-range/src/lib.rs Outdated Show resolved Hide resolved
frame/merkle-mountain-range/rpc/src/lib.rs Outdated Show resolved Hide resolved
frame/merkle-mountain-range/rpc/src/lib.rs Show resolved Hide resolved
@acatangiu acatangiu changed the title BEEFY: generate historical proofs pallet-mmr: generate historical proofs Sep 27, 2022
serban300 and others added 4 commits September 27, 2022 16:47
Co-authored-by: Adrian Catangiu <adrian@parity.io>
Co-authored-by: Adrian Catangiu <adrian@parity.io>
Co-authored-by: Adrian Catangiu <adrian@parity.io>
Copy link
Contributor

@Lederstrumpf Lederstrumpf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, modulo a decision on accounting for update from #12339 & #12345 re. changing leaf_index -> block_number in API.

I extended coverage of the historical batch proof test, which can be grabbed from serban300/substrate@generate_historical_proof...Lederstrumpf:substrate:generate_historical_proof, if you agree with the change.

frame/merkle-mountain-range/src/tests.rs Outdated Show resolved Hide resolved
frame/merkle-mountain-range/src/tests.rs Show resolved Hide resolved
Comment on lines 112 to 113
fn generate_proof(
&self,
leaf_index: LeafIndex,
at: Option<BlockHash>,
best_known_block: Option<BlockNumber>,
) -> RpcResult<LeafProof<BlockHash>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not change runtime MmrApi and simply add new mmr_generateHistoricalProof API (avoiding a few lines of code duplication is certainly not worth dealing with changing the runtime API version - I'm also not in favor of overloading existing API to also handle "historic" case, makes the API harder to use IMO). Same for RPC API.

I fully agree with @acatangiu on this given the tradeoff cost applicable at the time.
Incorporating yesterday's decision to break this API for the leaf->block transition in #12339 though, I want to revisit this affected design tradeoff while the API will have to change anyway: preserving API version seemed the primary motivation for the code duplication, and that's no longer weighing against this.

Now the tradeoff reduces to
a. keep equivalent code in multiple locations in sync, vs
b. pass an extra None to mmr_generateProof when proving for the best_block

so the balance is different.

A unified API via b. does indeed make the API more verbose when dealing with the best_block only, but given that a light client's best_known_block will typically lag the reference chain's best_block, I expect it's less prone to errors by API consumers if they need to be deliberate here when actually wanting to track the best_block.

@serban300
Copy link
Contributor Author

@Lederstrumpf thank you for the improvements to the unit tests. I added your commits in this PR. And I created #12391 to continue the discussion on whether we want to have 2 separate methods generate_batch_proof() and generate_historical_batch_proof(). Could you PTAL when you have time ? Thanks !

Copy link
Contributor

@Lederstrumpf Lederstrumpf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lederstrumpf thank you for the improvements to the unit tests. I added your commits in this PR. And I created #12391 to continue the discussion on whether we want to have 2 separate methods generate_batch_proof() and generate_historical_batch_proof(). Could you PTAL when you have time ? Thanks !

thanks for creating the issue - lgtm :)

@Lederstrumpf Lederstrumpf merged commit 952030c into paritytech:master Sep 30, 2022
@ordian
Copy link
Member

ordian commented Sep 30, 2022

@Lederstrumpf please use bot merge next time :) It'll take care of merging the companion automatically.

@Lederstrumpf
Copy link
Contributor

@Lederstrumpf please use bot merge next time :) It'll take care of merging the companion automatically.

oops - noted - thanks @ordian!

@ggwpez
Copy link
Member

ggwpez commented Sep 30, 2022

@Lederstrumpf please use bot merge next time :) It'll take care of merging the companion automatically.

Yea... Now we have to fast-track paritytech/polkadot#6061 since it is blocking the CI of other Polkadot MRs 🙈

@Lederstrumpf
Copy link
Contributor

@ggwpez I'm sorry for the trouble that's caused :( won't let it happen again with a companion

ggwpez added a commit that referenced this pull request Sep 30, 2022
ordian added a commit that referenced this pull request Oct 5, 2022
* master: (42 commits)
  Adapt `pallet-contracts` to WeightV2 (#12421)
  Improved election pallet testing (#12327)
  Bump prost to 0.11+ (#12419)
  Use saturating add for alliance::disband witness data (#12418)
  [Fix] Rename VoterBagsList -> VoterList to match pdot (#12416)
  client/beefy: small code improvements (#12414)
  BEEFY: Simplify hashing for pallet-beefy-mmr (#12393)
  Add @koute to `docs/CODEOWNERS` and update stale paths (#12408)
  docs/CODEOWNERS: add @acatangiu as MMR owner (#12406)
  Remove unnecessary Clone trait bounds on CountedStorageMap (#12402)
  Fix `Weight::is_zero` (#12396)
  Beefy on-demand justifications as a custom RequestResponse protocol (#12124)
  Remove contracts RPCs (#12358)
  pallet-mmr: generate historical proofs (#12324)
  unsafe_pruning flag removed (#12385)
  Carry over where clauses defined in Config to Call and Hook (#12388)
  Properly set the max proof size weight on defaults and tests (#12383)
  BEEFY: impl TypeInfo for SignedCommitment (#12382)
  bounding staking: `BoundedElectionProvider` trait (#12362)
  New Pallet: Root offences (#11943)
  ...
ark0f pushed a commit to gear-tech/substrate that referenced this pull request Feb 27, 2023
* BEEFY: generate historical proofs

Signed-off-by: Serban Iorga <serban@parity.io>

* Update frame/merkle-mountain-range/rpc/src/lib.rs

Co-authored-by: Adrian Catangiu <adrian@parity.io>

* Update primitives/merkle-mountain-range/src/lib.rs

Co-authored-by: Adrian Catangiu <adrian@parity.io>

* Update frame/merkle-mountain-range/src/lib.rs

Co-authored-by: Adrian Catangiu <adrian@parity.io>

* cargo fmt

* fix off-by-one in leaves powerset generation

* test all possible mmr sizes for historical proofs

* remove now redundant simple_historical_proof

* cargo fmt

Signed-off-by: Serban Iorga <serban@parity.io>
Co-authored-by: Adrian Catangiu <adrian@parity.io>
Co-authored-by: Robert Hambrock <roberthambrock@gmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants