Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proof: verify keys, pre-state and post-state #299

Open
wants to merge 3 commits into
base: kaustinen-with-shapella
Choose a base branch
from

Conversation

jsign
Copy link
Collaborator

@jsign jsign commented Oct 26, 2023

This PR improves the proof verification code in many ways:

  • We now check that the claimed keys in the proof actually match what we computed during the EVM block execution using the access witness.
  • Same with the pre-state values, which we weren't checking.
  • The post-state values validation now doesn't require building the post-state tree. What we do is use trie.VerkleTrie APIs to collect all tree-writes while StateDB dumps the dirty objects into the tree to calculate the new root. We then use these tree writes to figure which are the correct post-state values.

More about each point in comments.

TODO:

  • Extra test doing a proof gen+verif replaying of existing Kaustinen from genesis.

Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
chainreader := &generatedLinearChainReader{
config: config,
// GenerateVerkleChain should only be called with the genesis block
// as parent.
genesis: parent,
chain: blocks,
}
genblock := func(i int, parent *types.Block, statedb *state.StateDB) (*types.Block, types.Receipts) {
genblock := func(i int, parent *types.Block, statedb *state.StateDB) (*types.Block, types.Receipts, [][]byte, [][]byte, [][]byte) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now genblock will return some extra helpers that the test will use:

  • computedKeys: the ordered keys collected in the witness during block execution.
  • computedPreStateValues: the pre-state values from the tree.
  • computedPostStateValues: we use the collected written values in the tree to calculate what we think are the correct post-state values.

The test will use these values to verify the proof in the block.
To be clear: these three returned values are trusted data since from the perspective of the client they come from his/her own computation. These will be compared against the block proof which, in theory, is not trusted.

Now... it's true that in the case of these test, the block proof is generated by the same client that is verifying the proof. So, for the computedKeys and computedPreStateValues seems pretty clear that they should match. What's more interesting is computedPostStateValues. For this, the verifying is using the collected tree writes, and the prover is using the tree directly (i.e: in FinalizeAndAssemble(...)).

The reality is that we should also make the prover to use these collected values, so the proving calculation of post-state values is faster.

But I'd prefer to do that in another PR since, if I do that now, this test won't be actually "testing" much really since both the prover and verifier would be doing the same calculation for post state values.

Comment on lines +416 to +418
// Get the keys collected in the witness in the witness.
computedKeys := statedb.Witness().Keys()
sort.Slice(computedKeys, func(i, j int) bool { return bytes.Compare(computedKeys[i], computedKeys[j]) < 0 })
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing weird.

Comment on lines +420 to +427
// Get the pre-state values from the database.
computedPreStateValues := make([][]byte, len(computedKeys))
for i := range computedKeys {
computedPreStateValues[i], err = preStateTree.GetWithHashedKey(computedKeys[i])
if err != nil {
panic(err)
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing weird.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing weird except the same work is more or less already done in GetProofItems. But yeah, I don't really want to expose it all and make the interface even more encumbered while we figure out what the best approach to do this is.

Comment on lines +441 to +453
computedPostStateValues := make([][]byte, len(computedKeys))
treeWrites := statedb.GetTrie().(*trie.VerkleTrie).GetTreeWrites()
for i := range computedKeys {
treeWrittenValue, ok := treeWrites[string(computedKeys[i])]
// If we didn't tracked this key, it means it was never written to the post-state tree,
// thus the newValue must be `nil`. (i.e: same as currentValue)
// Additionally, if we wrote to this key but it has the same pre-state value, then must
// also be `nil`.
if !ok || bytes.Equal(treeWrittenValue, computedPreStateValues[i]) {
continue
}
computedPostStateValues[i] = treeWrittenValue
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that here, we call a new GetTreeWrites() from the tree.
This will return all the detected tree writes as a map[string][]byte (i.e: map[treeKey]writtenTreeValue).

Now, remember that just because some key was written in the tree doesn't mean that it actually mutated the tree (what I explained in Matrix). To solve that, we use both computedKeys and computedPreStateValues to construct the expected post-values list.

Note that this is safe, since both computedKeys and computedPreStateValues is data that the node has calculated itself (i.e: it is not relying on block proof stuff, which isn't trusted).

Comment on lines +523 to +527
for i := 1; i < len(proofs); i++ {
if err := trie.DeserializeAndVerifyVerkleProof(proofs[i], statediff[i], chain[i-1].Root().Bytes(), computedKeys[i], computedPreStateValues[i], computedPostStateValues[i]); err != nil {
t.Fatal(err)
}
t.Log("verfied verkle proof")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now DeserializeAndVerifyVerkleProof expects the caller to provide trusted values of:

  • Keys
  • Pre-state
  • Post-state

It will use that to verify the block proof StateDiff is correct.

Also, generalized this proof verification check for all the generated blocks in the test (and not only the block 1). If we extend this tests with more blocks, all should pass.

@@ -39,6 +39,8 @@ type VerkleTrie struct {
db *Database
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the new code of trie/verkle.go that collects tree writes.

Comment on lines +381 to +406
// Verify the provided `statediff` by checking that the keys, pre-values and post-values match exactly
// with the ones provided from the EVM block execution witness.
if len(computedKeys) != len(proof.Keys) {
return fmt.Errorf("witness keys length doesn't match proof keys length: expected %d, got %d", len(computedKeys), len(proof.Keys))
}
for i := range computedKeys {
if !bytes.Equal(computedKeys[i], proof.Keys[i]) {
return fmt.Errorf("witness keys don't match proof keys: expected %x, got %x", computedKeys[i], proof.Keys[i])
}
}
if len(computedPreStateValues) != len(proof.PreValues) {
return fmt.Errorf("witness pre-values length doesn't match proof pre-values length: expected %d, got %d", len(computedPreStateValues), len(proof.PreValues))
}
for i := range computedPreStateValues {
if !bytes.Equal(computedPreStateValues[i], proof.PreValues[i]) {
return fmt.Errorf("witness pre-values don't match proof pre-values: expected %x, got %x", computedPreStateValues[i], proof.PreValues[i])
}
}
if len(computedPostStateValues) != len(proof.PostValues) {
return fmt.Errorf("witness post-values length doesn't match proof post-values length: expected %d, got %d", len(computedPostStateValues), len(proof.PostValues))

}
for i := range computedPostStateValues {
if !bytes.Equal(computedPostStateValues[i], proof.PostValues[i]) {
return fmt.Errorf("witness post-values don't match proof post-values: expected %x, got %x", computedPostStateValues[i], proof.PostValues[i])
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All checks to verify that StateDiff is correct, assisted by the new parameters received in the method.

Comment on lines -377 to -388
// TODO: this is necessary to verify that the post-values are the correct ones.
// But all this can be avoided with a even faster way. The EVM block execution can
// keep track of the written keys, and compare that list with this post-values list.
// This can avoid regenerating the post-tree which is somewhat expensive.
posttree, err := verkle.PostStateTreeFromStateDiff(pretree, statediff)
if err != nil {
return fmt.Errorf("error rebuilding the post-tree from proof: %w", err)
}
regeneratedPostTreeRoot := posttree.Commitment().Bytes()
if !bytes.Equal(regeneratedPostTreeRoot[:], postStateRoot) {
return fmt.Errorf("post tree root mismatch: %x != %x", regeneratedPostTreeRoot, postStateRoot)
}
Copy link
Collaborator Author

@jsign jsign Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this work is gone now since the post-state checking was reduced to a slice comparation (L399-L406 above).

Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
@jsign jsign requested a review from gballet October 28, 2023 12:21
@jsign jsign marked this pull request as ready for review October 28, 2023 12:21
Copy link
Owner

@gballet gballet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to me. I'm not completely sure this is the right way of doing things, although reading your PR has pushed me a lot more into agreeing that it is.

The issue that I have, is that I feel we are repeating a lot of the work we did with the AccessWitness here, and we can not just replace the access witness with a recording of what locations of the tree were written-to since we need to know what values are accessed at the StateDB level so that we can figure out what gas needs to be charged. So I still think we should capture these writes at the StateDB level, either via the AccessWitness or something else. Performing some of that check during the updateTrie/commit loop should do the trick, because then we know if a value has been written to (deleted also counts as a write) although we would need to (we also need to make sure that writes are actually counted as writes, even if the value doesn't change or if it's 0).

Comment on lines +420 to +427
// Get the pre-state values from the database.
computedPreStateValues := make([][]byte, len(computedKeys))
for i := range computedKeys {
computedPreStateValues[i], err = preStateTree.GetWithHashedKey(computedKeys[i])
if err != nil {
panic(err)
}
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing weird except the same work is more or less already done in GetProofItems. But yeah, I don't really want to expose it all and make the interface even more encumbered while we figure out what the best approach to do this is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants