trie: Better error checking for invalid proofs #1373

acolytec3 · 2021-07-21T19:54:00Z

Adds a new check to the baseTrie.verifyProof function to verify that the proof trie for the provided key actually contains the value associated with that key and throws if the key doesn't exist in the trie (as described in the function's behavior).

codecov · 2021-07-21T19:55:47Z

Codecov Report

Merging #1373 (bab0d84) into master (7cd22b6) will increase coverage by 0.09%.
The diff coverage is n/a.

Flag	Coverage Δ
block	`86.60% <ø> (ø)`
blockchain	`83.43% <ø> (ø)`
client	`83.94% <ø> (+0.04%)`	⬆️
common	`94.39% <ø> (+0.22%)`	⬆️
devp2p	`83.00% <ø> (+0.34%)`	⬆️
ethash	`82.83% <ø> (ø)`
tx	`88.36% <ø> (ø)`
vm	`79.34% <ø> (ø)`

Flags with carried forward coverage won't be shown.

acolytec3 · 2021-07-21T20:03:24Z

@ryanio I'm not sure whether I'm handling this exactly right, but the way the documentation for verifyProof is written, it should throw an error if an invalid proof is provided for a given key. Given my general reading on merkle proofs, is a proof considered invalid for a given trie and key if the proof Trie generated by the verifyProof function doesn't contain the key? At present our code just returns null if the value isn't found, and no additional verification is done.

ryanio · 2021-07-22T03:54:55Z

thanks for taking this on!

it looks like the library was designed to return null for these keys, check out these tests which are now failing with the new error (line 26):

ethereumjs-monorepo/packages/trie/test/proof.spec.ts

Lines 22 to 46 in 7a2301c

    
    proof = await CheckpointTrie.createProof(trie, Buffer.from('key2bb'))
    val = await CheckpointTrie.verifyProof(trie.root, Buffer.from('key2'), proof)
    // In this case, the proof _happens_ to contain enough nodes to prove `key2` because
    // traversing into `key22` would touch all the same nodes as traversing into `key2`
    t.equal(val, null, 'Expected value at a random key to be null')
    let myKey = Buffer.from('anyrandomkey')
    proof = await CheckpointTrie.createProof(trie, myKey)
    val = await CheckpointTrie.verifyProof(trie.root, myKey, proof)
    t.equal(val, null, 'Expected value to be null')
    myKey = Buffer.from('anothergarbagekey') // should generate a valid proof of null
    proof = await CheckpointTrie.createProof(trie, myKey)
    proof.push(Buffer.from('123456')) // extra nodes are just ignored
    val = await CheckpointTrie.verifyProof(trie.root, myKey, proof)
    t.equal(val, null, 'Expected value to be null')
    await trie.put(Buffer.from('another'), Buffer.from('3498h4riuhgwe'))
    // to fail our proof we can request a proof for one key
    proof = await CheckpointTrie.createProof(trie, Buffer.from('another'))
    // and use that proof on another key
    const result = await CheckpointTrie.verifyProof(trie.root, Buffer.from('key1aa'), proof)
    t.equal(result, null)
    t.end()

the very last equals is interesting: you would think at least a completely invalid proof would throw, as that's what the typedoc you linked to states, but that's also checked for null (line 45 above)

it is a bit perplexing, i'll see if I can get any thoughts or history/context from @s1na :)

s1na · 2021-07-22T14:01:59Z

> is a proof considered invalid for a given trie and key if the proof Trie generated by the verifyProof function doesn't contain the key?

This would be a valid exclusion proof, i.e. a proof that a leaf in the tree is empty which has its own use-cases.

I understand how it'd be useful to distinguish between a valid exclusion proof and an invalid proof, and returning null for both is not ideal. But I thought get() would throw an error when it tried to read a key that doesn't exist in leveldb :-? if that doesn't happen then Trie.fromProof might need to be re-written to hash up the proof and check against the root.

acolytec3 · 2021-07-22T14:56:08Z

> But I thought get() would throw an error when it tried to read a key that doesn't exist in leveldb :-? if that doesn't happen then Trie.fromProof might need to be re-written to hash up the proof and check against the root.

Now that I've soaked in this a little more, I think I'm starting to understand the logic a little better, but I'm sure I'm still missing things. Is this an accurate description of how verifyProof is expected to work?

1. Create a new proofTrie from just the roothash provided. If the parameter roothash is invalid in the sense that it is a malformed buffer from which a trie can't be constructed, the trie constructor would throw, so this is where we get the idea of an invalid proof in the documentation.
2. Pass that proofTrie and the proof to fromProof to get a new trie where any values in the original trie are updated based on the proof hash through a series of puts. If this throws, we have invalid proof nodes, though I'm not sure I see anywhere in the logic of put that it would ever throw.

Here's where I'm just not that deep on merkle proofs/tries, but should you be able to construct a trie from any random set of bytes? If so, what constitutes an invalid proof? If we never generate a natural exception when constructing the trie from a proof, doesn't that mean that a prooftrie that doesn't contain the provided key just shows evidence of non-existence of that key within the trie and nothing more? What defines it as an invalid proof?

ed255 · 2021-07-22T15:20:09Z

> Here's where I'm just not that deep on merkle proofs/tries but should you be able to construct a trie from any random set of bytes? If so, what constitutes an invalid proof? If we never generate a natural exception when constructing the trie from a proof, doesn't that mean that a prooftrie that doesn't contain the provided key just show evidence of non-existence of that key within the trie and nothing more? What defines it as an invalid proof?

I think that verifyProof should be able to throw errors even if fromProof succeeds. To me the definition of a valid proof for a given key is a set of nodes which can be traversed either:

- following the complete key reaching a leaf (in which case the leaf value is returned)
- following a prefix of the key (which is shorter than the key) reaching a termination (in which case the null value is returned)

An invalid proof for a key would be a set of nodes which you can't traverse with the key, or a prefix of the key, to reach a termination (because there are missing nodes). In this case I think verifyProof should throw an error. This error can't be caught in fromProof, because you could have a valid proof for key A and then verify it for key B (where B can't be proved with proofOf(A)); with the current implementation, if you try to verify (B, proofOf(A)) you'd get null, but you can't prove whether B belongs to the tree or not.

This is the behaviour that is implemented in go-ethereum: https://github.com/ethereum/go-ethereum/blob/a1f16bc74c7efb593db2982c92222d1e4a201c25/trie/proof.go#L103
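The three outcomes described above (value found, valid exclusion, invalid proof) can be sketched with a toy model. This is only an illustration under loud assumptions: `ToyNode`, `verifyToyProof`, and `proofOfAbc` are hypothetical names, node ids stand in for hashes, keys are plain strings of nibbles, and there are no extension nodes or RLP encoding. It is not the library's or geth's implementation.

```typescript
// Toy model only: node ids stand in for hashes, keys are strings of nibbles.
type ToyNode =
  | { kind: 'branch'; children: { [nibble: string]: string } }
  | { kind: 'leaf'; rest: string; value: string }

function verifyToyProof(
  nodes: Map<string, ToyNode>,
  rootId: string,
  key: string
): string | null {
  let id = rootId
  let remaining = key
  while (true) {
    const node = nodes.get(id)
    if (node === undefined) {
      // The path references a node the proof did not supply: nothing can be concluded.
      throw new Error('invalid proof: missing node ' + id)
    }
    if (node.kind === 'leaf') {
      // Full match proves inclusion; a diverging leaf proves exclusion.
      return node.rest === remaining ? node.value : null
    }
    if (remaining.length === 0) return null // toy branches carry no value
    const next = node.children[remaining[0]]
    if (next === undefined) return null // empty slot: valid exclusion proof
    id = next
    remaining = remaining.slice(1)
  }
}

// A "proof" for key 'abc': the root references n2, but n2 is not supplied.
const proofOfAbc = new Map<string, ToyNode>([
  ['root', { kind: 'branch', children: { a: 'n1', b: 'n2' } }],
  ['n1', { kind: 'leaf', rest: 'bc', value: 'v1' }],
])
```

With this toy proof, looking up 'abc' yields the value, 'acd' and 'cxx' yield null (provably absent), and 'bxx' throws because the traversal needs a node that was never provided, which matches the (B, proofOf(A)) case.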

ryanio · 2021-07-22T16:08:28Z

thanks all for the discussion, here is the old (pre-refactored) verifyProof which does seem to have additional checks and errors

acolytec3 · 2021-07-22T16:35:58Z

@ryanio That looks like a straight port of geth's code if I'm reading it correctly. Maybe the best course of action is just to bring that back in?

s1na · 2021-07-22T17:12:00Z

Thanks @ryanio for digging up the older version. That's essentially the usual algorithm for verifying a proof, i.e.:

1. Start from the leaf we're interested in and its sibling. Hash them together
2. Hash the result of that with the sibling one level higher
3. Go till the root
4. Compare root against expected root
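As a rough illustration of these four steps, here is a minimal sketch for a binary Merkle tree. The assumptions are significant: Ethereum's trie is hexary and hashes RLP-encoded nodes with Keccak-256, whereas this sketch uses a binary tree and SHA-256 from Node's `crypto` module as a stand-in. `PathStep`, `rootFromPath`, and `verifyPath` are hypothetical names.

```typescript
import { createHash } from 'crypto'

// SHA-256 is a stand-in here; the real trie uses Keccak-256.
const sha256 = (data: Buffer): Buffer => createHash('sha256').update(data).digest()

interface PathStep {
  sibling: Buffer // hash of the sibling at this level
  siblingOnLeft: boolean // whether the sibling is the left input to the hash
}

// Steps 1-3: hash the leaf, then fold in one sibling per level up to the root.
function rootFromPath(leaf: Buffer, path: PathStep[]): Buffer {
  let acc = sha256(leaf)
  for (const step of path) {
    const pair = step.siblingOnLeft ? [step.sibling, acc] : [acc, step.sibling]
    acc = sha256(Buffer.concat(pair))
  }
  return acc
}

// Step 4: compare the recomputed root against the expected root.
function verifyPath(leaf: Buffer, path: PathStep[], expectedRoot: Buffer): boolean {
  return rootFromPath(leaf, path).equals(expectedRoot)
}

// Example: a four-leaf tree and the path for leaf 'c'.
const [a, b, c, d] = ['a', 'b', 'c', 'd'].map((s) => sha256(Buffer.from(s)))
const ab = sha256(Buffer.concat([a, b]))
const cd = sha256(Buffer.concat([c, d]))
const root = sha256(Buffer.concat([ab, cd]))
const pathForC: PathStep[] = [
  { sibling: d, siblingOnLeft: false },
  { sibling: ab, siblingOnLeft: true },
]
```

Here `verifyPath(Buffer.from('c'), pathForC, root)` succeeds, while the same path with any other leaf recomputes a different root and fails.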

Now this implemented version does this in a hacky way. It inserts all the proof nodes in the database, and then tries to get the leaf with its key and the expected root. This essentially verifies the proof because you do something like this:

1. Get the expected root via its hash from db. So at this point we know the correct hash of its children
2. Traverse down via the key (one nibble at a time). If the node with the correct hash doesn't exist in db, it means it wasn't inserted as part of Trie.fromProof. So the proof is either missing something or has an invalid hash for one of the levels.
3. If we can go all the way to the leaf (be it empty or full) and return that, it means all of the proof nodes on the way have been correct

It might be possible to distinguish between invalid and valid exclusion proofs via a minor tweak where get returns an error instead of nil when asked for a node that doesn't exist. Not to say I have anything against bringing back the old version. Just wanted to provide some context.

acolytec3 · 2021-07-23T11:24:49Z

Thanks @s1na. I think I have an idea how to resolve leveraging your current implementation.

I think if we insert a throw here and maybe here we will have either hit a branch node where the key/proof don't match. Wouldn't that land us at the end goal of identifying a proof that shows a value's non-existence in the tree?

ryanio · 2021-07-23T18:53:07Z

> I think if we insert a throw here and maybe here we will have either hit a branch node where the key/proof don't match. Wouldn't that land us at the end goal of identifying a proof that shows a value's non-existence in the tree?

that seems like it might also help solve #1055 (comment)

acolytec3 · 2021-07-23T20:33:08Z

At the end of the day, if we're going to continue to use the existing logic and not revert back to what @s1na noted as the conventional way for identifying an invalid proof, we have to put some thought into the onFound that's buried inside the async get that is returning null to verifyProof. There are several different conditions here and I'm still not clear on when not finding a node in onFound indicates a proof of non-existence versus an invalid proof.

The way I'm reading the code, a valid exclusion proof would result in a final state where the keyRemainder.length is 0 and the node value is null. If at any point in traversing the trie we get to a node with a null value but we still have a keyRemainder, that should constitute an invalid proof, right?

zmitton · 2021-07-26T16:04:04Z

The use case for validating a proof is when someone else sends you the proof and you do not trust them. This software is meant to be run locally, to verify a key-value pair without trusting the source. This software does not have any method of creating an invalid proof; maybe such a function would be useful for understanding the tests. I wrote the tests above, and I agree they are very unclear, because I found that making an invalid proof (and testing many different cases) was somewhat difficult to do.

First I will say that I think the best/easiest solution might be to just implement the fix I described here.

Now let me try to describe how inserting the proof nodes into a tree and then getting them back out verifies the proof. The verification actually takes place in this put step here. The user's computer is calculating the keccak (as key) of every given proof node and placing them in a db. The traversal will start at the root (which the user already supplies). The root is looked up in the db, and the value found will have to be correct (its hash == the root), because all db keys were generated as described above. Here there will likely be a branch node; the branch is picked based on the key, and again this key is looked up in the db, where it must exist (for the proof to be valid) and whose data has already been verified during insert. If erroneous data is supplied, it will be given a different key. Therefore, when the Trie get operation is performed, such nodes will be ignored. If a legitimate null proof is supplied, the tree will read and return the null value from one of these cases. But if the trie tries to look up the next db node's key, and that node does not exist in the db, that means that either the corresponding proof node was not supplied (by the prover), or that the data for that node was tampered with, which would have caused it to have a different key than expected. Either way, the proof is invalid.

Hope this helps
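The content-addressing argument above can be illustrated with a small sketch. It rests on assumptions: SHA-256 from Node's `crypto` module stands in for keccak, a plain `Map` stands in for the db, and `hashOf`, `loadProof`, and `lookup` are hypothetical helpers rather than functions from the library.

```typescript
import { createHash } from 'crypto'

// Stand-in hash; the real trie keys nodes by their keccak hash.
const hashOf = (node: Buffer): string => createHash('sha256').update(node).digest('hex')

// Insert every supplied proof node under the hash of its own bytes.
function loadProof(proofNodes: Buffer[]): Map<string, Buffer> {
  const db = new Map<string, Buffer>()
  for (const node of proofNodes) {
    db.set(hashOf(node), node)
  }
  return db
}

// Look a node up by the hash the verifier expects (the trusted root,
// or a child hash read from an already verified parent).
function lookup(db: Map<string, Buffer>, expectedHash: string): Buffer {
  const node = db.get(expectedHash)
  if (node === undefined) {
    throw new Error('missing node: proof is incomplete or was tampered with')
  }
  return node
}

const honestRoot = Buffer.from('root node bytes')
const trustedRootHash = hashOf(honestRoot) // known to the verifier in advance
const tamperedRoot = Buffer.from('root node bytes!')
const junk = Buffer.from('123456') // extra node, stored but never reached
```

Looking up `trustedRootHash` in `loadProof([honestRoot, junk])` returns the honest bytes and never touches the junk entry, while the same lookup in `loadProof([tamperedRoot])` throws, because the tampered bytes were filed under a different key.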

acolytec3 · 2021-07-26T18:44:10Z

I think I follow your logic but I still have two challenges with it:

1. There doesn't seem to be a way to distinguish between a truly invalid proof and a proof of non-existence in the object returned by onFound, since a couple of the valid proof cases return a null value for the node, as does what I believe you're saying is the invalid proof case. If your logic is sound in distinguishing valid from invalid proofs, should we throw rather than resolve, so as to distinguish between a valid proof of non-existence and an invalid proof?
2. I think I'm still missing why we're going this route to look for invalid proofs rather than just doing what we did before, which is what geth does and would seem to be the accepted way of identifying an invalid proof and of distinguishing it from a proof of non-existence. If I'm following things correctly, a proof of non-existence would traverse the set of hashes in the proof and just end with a null value in the node at the end.

I don't have a strong opinion one way or the other on this and I'm not an expert on merkle proofs (as demonstrated above) so just looking to make sure we account for this situation correctly.

brickpop · 2021-07-27T16:19:22Z

I opened the issue that may have led to this PR: #1368

We need to verify an Ethereum Storage Proof submitted by a user, and the reason we picked the library is because the verifyProof method says the following on the Readme:

https://github.com/ethereumjs/ethereumjs-monorepo/blob/master/packages/trie/docs/classes/basetrie.md#verifyproof

Verifies a proof
Throws If proof is found to be invalid

Why should verifyProof not fail if the given proof is wrong?
What's the point of a Storage Proof in general, if fake values cannot be validated?

ryanio · 2021-07-27T18:52:50Z

> Why should verifyProof not fail if the given proof is wrong?
> What's the point of a Storage Proof in general, if fake values cannot be validated?

@brickpop that was the existing behavior in mpt v2.3.2, there was some refactoring that happened in v3+ that seems to have introduced this which we are trying to fix with this PR.

There seem to be two routes we can go here: one is to revert back to the v2.3.2 logic which matches geth's implementation, the other is to resolve it within the current design.

ryanio · 2021-07-28T04:19:01Z

ok I'm working on this a bit late at night but I think I got the gist and introduced a new get param throwWhenNotFound to allow for more strict checking when verifying the proof (15e34bd). also rebased on master and added an entry to changelog. would appreciate some review. (edit: thinking of simplifying the param name to throwIfNotFound)
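A rough sketch of the idea of an optional strictness flag on get, under stated assumptions: `FlagNode`, `toyGet`, and `flagNodes` are hypothetical, node ids stand in for hashes, and the parameter is named `throwIfNotFound` only because that name is floated above. The actual parameter name and error messages in the library may differ.

```typescript
// Hypothetical toy model: one value slot per node, ids stand in for hashes.
interface FlagNode {
  children: { [nibble: string]: string }
  value: string | null
}

function toyGet(
  nodes: Map<string, FlagNode>,
  rootId: string,
  key: string,
  throwIfNotFound = false // defaults to the prior, lenient behavior
): string | null {
  let id = rootId
  for (let i = 0; i <= key.length; i++) {
    const node = nodes.get(id)
    if (node === undefined) {
      // Lenient mode: a missing node looks the same as an absent key.
      if (throwIfNotFound) throw new Error('missing node ' + id)
      return null
    }
    if (i === key.length) return node.value
    const next = node.children[key[i]]
    if (next === undefined) return null // empty slot: key is provably absent
    id = next
  }
  return null // not reached; the loop always returns or throws
}

// The root references n2, but n2 was never supplied.
const flagNodes = new Map<string, FlagNode>([
  ['root', { children: { a: 'n1', b: 'n2' }, value: null }],
  ['n1', { children: {}, value: 'v1' }],
])
```

Existing callers that omit the flag still get null for key 'b', while a verifier that passes `true` gets an exception for 'b' and still gets null for 'c', where the absence is provable from the supplied nodes.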

acolytec3 · 2021-08-03T20:10:50Z

@ryanio Anything outstanding on this one? I'm waiting on the tests to re-run from merging in master but seems like this one might be ready to go or is it considered a breaking change since we added that new optional parameter to the get method?

ryanio

@acolytec3 thanks, LGTM, really love the readme updates for explaining proof of non-existence.

I don't believe it's breaking because it's an optional parameter that defaults to the prior behavior. It is still a little strange to me to add to the Trie.get param though, but it was the only way I could think of. I would be open to other ideas.

I wonder if we should consider a bug fix release with this if invalid proofs are indeed totally broken?

ryanio · 2021-08-03T21:04:10Z

Let's also wait for an approval or any thoughts from Holger before merging in.

holgerd77 · 2021-08-10T07:45:50Z

It might take me a few more days until I have worked through this and can give a qualified comment.

gabrocheleau

Looks good, left a few minor comments

holgerd77

LGTM, nice work, thanks everyone for this extensive and clarifying discussion!

Will merge and relatively soon prepare a bugfix release on this.

holgerd77 · 2021-08-19T07:42:06Z

Changes from this PR have been published along with merkle-patricia-tree v4.2.1.

trie: Better error checking for invalid proofs

a88e064

acolytec3 added PR state: WIP package: mpt labels Jul 21, 2021

acolytec3 added 2 commits July 22, 2021 11:05

lint fixes

9f0aa20

Merge remote-tracking branch 'origin/master' into fix-verify-proof

fcc5f78

Merge remote-tracking branch 'origin/master' into fix-verify-proof

f654196

ryanio added the type: bug label Jul 23, 2021

brickpop mentioned this pull request Jul 27, 2021

Adding proof of non-existence vocdoni/storage-proofs-eth-js#7

Merged

acolytec3 and others added 4 commits July 27, 2021 21:14

trie: Better error checking for invalid proofs

b9c8e1e

lint fixes

07ea272

add Trie.get param throwWhenNotFound for failed proofs

15e34bd

add changelog entry

a80b894

ryanio force-pushed the fix-verify-proof branch from f654196 to a80b894 Compare July 28, 2021 04:15

trie: Update verifyProof error message

d481e7a

ed255 mentioned this pull request Aug 2, 2021

Adding support for MiniMe tokens vocdoni/storage-proofs-eth-js#8

Merged

acolytec3 and others added 3 commits August 2, 2021 09:42

Merge branch 'master' into fix-verify-proof

9e6795f

touch ups

7e7cf73

Merge branch 'master' into fix-verify-proof

a6d29b1

acolytec3 requested review from ryanio and holgerd77 August 3, 2021 20:12

ryanio previously approved these changes Aug 3, 2021

gabrocheleau self-requested a review August 11, 2021 16:42

Merge branch 'master' into fix-verify-proof

10fba45

holgerd77 added PR state: needs review and removed PR state: WIP labels Aug 11, 2021

gabrocheleau previously approved these changes Aug 12, 2021

ryanio and others added 2 commits August 12, 2021 17:52

Merge branch 'master' into fix-verify-proof

10a42af

Apply suggestions from code review

ce8eed1

Co-authored-by: Gabriel Rocheleau <contact@rockwaterweb.com>

ryanio dismissed stale reviews from gabrocheleau and themself via ce8eed1 August 13, 2021 00:54

ryanio added 3 commits August 12, 2021 17:54

Merge branch 'master' into fix-verify-proof

4f6f127

Merge branch 'fix-verify-proof' of https://github.com/ethereumjs/ethe…

e225c69

…reumjs-monorepo into fix-verify-proof

re-add return

823b48d

holgerd77 reviewed Aug 16, 2021

acolytec3 and others added 2 commits August 16, 2021 11:10

add dev note on "missing note" error

5dc687e

Merge branch 'master' into fix-verify-proof

bab0d84

holgerd77 approved these changes Aug 17, 2021

holgerd77 merged commit 278549f into master Aug 17, 2021

holgerd77 deleted the fix-verify-proof branch August 17, 2021 10:09
