Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(pkg/trie/triedb): implement generics, refactor nibbles, revise iterator #4221

Open
wants to merge 4 commits into
base: tim/update-golangci-lint
Choose a base branch
from

Conversation

timwu20
Copy link
Contributor

@timwu20 timwu20 commented Oct 3, 2024

Changes

  • Introduces and integrates a new hash.Hash and hash.Hasher interface constraints into triedb package.
    • This is a local interface that is similar to runtime.Hash and runtime.Hasher already on development.
    • Updates all the tests to use hash.H256 for the hash and runtime.BlakeTwo256 as the hasher.
  • Refactors how we handle nibbles in triedb. Introduces NibbleSlice, Nibbles, and LeftNibbles. Each of these introduced types has their own distinct public interface and are used in different components within triedb.
    • Nibbles is analogous to trie_db::nibble::NibbleSlice. Primarily used to avoid costly slice manipulation and unnecessary copying of underlying bytes. Able to represent different nibble starting points, but not mutable to be able to reconstruct full keys.
    • NibbleSlice is analogous to trie_db::nibble::NibbleVec. Mutable in the sense you can push and pop nibbles as well as truncate. Used to reconstruct full keys.
    • LeftNibbles is analogous to trie_db::nibble::LeftNibbleSlice. Used in proof generation and verification. Since NibbleSlice and NIbbles are right aligned it makes it more expensive to truncate. LeftNibbles left aligns the nibbles which facilitates faster truncation.
  • Revises the iterator to mirror TrieDBRawIterator which is imported by the state machine primitives in substrate. This is embedded in other types to be able to iterate and return expected types. The rawItem introduced that is returned by rawIterator.nextRawItem includes the partial key as NibbleSlice, hash of the node, and an in-memory type that fulfills the codec.EncodedNode interface (Branch or Leaf).

Notes

  • Removes caching and introduces a temporary Cache interface to avoid too many code changes to how we construct TrieDBs and the options we provide it.
  • Future PR will introduce the new caching interface as required from future client db work. The implementation of the cache is actually in substrate and will not be included as part triedb package.
  • Backwards compatibility is now broken given we have revised how we store the nodes in underlying k/v store. We are now inline with the rust trie impl on how we store keys for nodes.
    • We were using an "expanded" view of the nibbles as part of the storage keys for the nodes. An expanded view is essentially where each nibble occupies an entire byte. So keys were 2x longer than we actually needed.
    • We also now prefix hash values based on child to avoid any key collisions. It was highly unlikely we would retrieve the wrong value, but I think it's a prudent decision to prefix it in our impl.

Tests

go test -tags integration github.com/ChainSafe/gossamer

Issues

contributes to #4157

@timwu20 timwu20 changed the title refactor(pkg/trie/triedb) implement generics, refactor nibbles, revise iterator refactor(pkg/trie/triedb): implement generics, refactor nibbles, revise iterator Oct 3, 2024
@timwu20 timwu20 force-pushed the tim/refactor-triedb branch 14 times, most recently from 6759481 to cc55742 Compare October 4, 2024 02:32
@timwu20 timwu20 changed the base branch from development to tim/update-golangci-lint October 4, 2024 02:32
@timwu20 timwu20 force-pushed the tim/refactor-triedb branch 2 times, most recently from 1f89c69 to 895baf8 Compare October 4, 2024 02:56
Copy link
Contributor

@dimartiro dimartiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments but overall ok!
Thanks for spending time working on this changes Tim! I tried to keep the TrieDB as simple as possible but it seems that it wasn't enough 💔

@@ -23,11 +24,11 @@ type (
// InlineNode contains bytes of the encoded node data
InlineNode []byte
// HashedNode contains a hash used to lookup in db for encoded node data
HashedNode common.Hash
HashedNode[H any] struct{ Hash H }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Suggested change
HashedNode[H any] struct{ Hash H }
HashedNode[H hash.Hash] struct{ Hash H }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I lean towards trying to be succinct and generic as possible per type. I think it's understood that hash.Hash is used exclusively in TrieDB, but it doesn't necessarily mean that we need it to be for HashedNode. If you feel strongly the other way I'm open to changing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm since we are always storing hashes in HashedNode I would say that it will be right to use the hash.Hash instead of any to prevent issues and also being consistent with HashedValue[H hash.Hash]. What do you think? do you see any issue?

}

func NewTrieLookup(db db.DBGetter, hash common.Hash, cache cache.TrieCache, recorder *Recorder) TrieLookup {
return TrieLookup{
func NewTrieLookup[H hash.Hash, Hasher hash.Hasher[H], QueryItem any](
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are you using the QueryItem? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops! 😅 This is part of the caching changes coming next. I'll remove on this branch.

pkg/trie/triedb/lookup.go Show resolved Hide resolved
@@ -143,32 +150,38 @@ func inMemoryFetchedValue(value nodeValue, prefix []byte, db db.DBGetter) ([]byt
}
}

type NodeTypes[H hash.Hash] interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No not yet. I think it's good to semantically describe what the Node types are.

@@ -18,20 +19,22 @@ type nodeHandle interface {
}

type (
nodeHandleHash common.Hash
nodeHandleHash[H any] struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Suggested change
nodeHandleHash[H any] struct {
nodeHandleHash[H hash.Hash] struct {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -63,34 +62,34 @@ type (
NewStoredNode struct {
node Node
}
CachedStoredNode struct {
CachedStoredNode[H any] struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not hash.Hash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return n.node
}

// nodeStorage is a struct that contains all the temporal nodes that are stored
// in the trieDB before being written to the backed db
type nodeStorage struct {
type nodeStorage[H any] struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type nodeStorage[H any] struct {
type nodeStorage[H hash.Hash] struct {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timwu20
Copy link
Contributor Author

timwu20 commented Oct 4, 2024

A few comments but overall ok! Thanks for spending time working on this changes Tim! I tried to keep the TrieDB as simple as possible but it seems that it wasn't enough 💔

You did all the heavy lifting @dimartiro! The fact that I have something to integrate with and contribute to is a testament to your hard work. 💪

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants