Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Spec on current cachekv implementation #13977

Merged
merged 10 commits into from
Dec 28, 2022

Conversation

dangush
Copy link
Contributor

@dangush dangush commented Nov 22, 2022

Description

Contributes to: #12986

Adds documentation of the current CacheKVStore implementation, as per phase 1 of the plan outlined in #12986 to improve the SDK's storage layer.


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct docs: prefix in the PR title
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • followed the documentation writing guidelines
  • reviewed "Files changed" and left comments if necessary
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct docs: prefix in the PR title
  • confirmed all author checklist items have been addressed
  • confirmed that this PR only changes documentation
  • reviewed content for consistency
  • reviewed content for thoroughness
  • reviewed content for spelling and grammar
  • tested instructions (if applicable)

@dangush dangush marked this pull request as ready for review November 24, 2022 01:29
@dangush dangush requested a review from a team as a code owner November 24, 2022 01:29
Copy link
Member

@tac0turtle tac0turtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this document!!

In a followup PR, we should add concurrency assumptions of the cachekv store. Currently it seems like the memdb used has a mutex and the cachekv store has a mutex which could lead to unforeseen issues

Copy link

@thpani thpani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dangush!

Cool work, especially on describing the iterator! CacheKV is probably the most complex part of the store module.

I've left some comments below, where I think the current explanation can be improved.

In general, @angbrav and I have also been diving into the store module and started writing down our understanding. If you intend to add more content, it would be good to sync!

store/cachekv/README.md Outdated Show resolved Hide resolved
store/cachekv/README.md Outdated Show resolved Hide resolved
store/cachekv/README.md Outdated Show resolved Hide resolved
* Allow iteration over contiguous spans of keys
* Act as a cache, so we don't repeat I/O to disk for reads we've already done
* Note: We actually fail to achieve this for iteration right now
* Note: Need to consider this getting too large and dropping some cached reads
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain what you mean here? In contrast to the inter-block cache, there is no upper bound on the cache size in a CacheKV.

Copy link
Contributor Author

@dangush dangush Dec 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe @ValarDragon (who wrote this part) is referring to considering runtime issues that can be mitigated by bounding cache. For example, the complexity of running iteration on any size range of keys right now is tied to the overall size of the cache. The best case here would be to design iteration to run relative to the size of the range, but bounding cache size may be needed / a consideration also.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, but the current use of CacheKV to scope transactions in memory until writing back to the underlying IAVL doesn't really allow for bounding cache size, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I'm aware of, no.

store/cachekv/README.md Outdated Show resolved Hide resolved
store/cachekv/README.md Show resolved Hide resolved

## Iteration

Efficient iteration over keys in `KVStore` is important for generating Merkle range proofs. Iteration over `CacheKVStore` requires producing all key-value pairs from the underlying `KVStore` while taking into account updated values from the cache.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should say here that iterators range over a key interval [start, end), as it becomes important below.

store/cachekv/README.md Outdated Show resolved Hide resolved
store/cachekv/README.md Outdated Show resolved Hide resolved
store/cachekv/README.md Outdated Show resolved Hide resolved
@tac0turtle
Copy link
Member

@thpani @angbrav is it okay we merge this then tackle needed changes in a follow up pr.

@thpani
Copy link

thpani commented Dec 16, 2022

@tac0turtle I would like to resolve #13977 (comment) first, it might cause confusion if it's merged as-is.

The remaining comments should be easy to address, we can do a follow-up PR but should also be fairly easy to address right here.

Also, keep in mind that we need to sync this with the changes of #13881.

@angbrav
Copy link
Contributor

angbrav commented Dec 16, 2022

@tac0turtle I'd rather resolve the simple issues in this PR and the more complex ones in a different one. But I am also fine with the alternative.

By simple I mean:

More complex ones (a different PR):

store/cachekv/README.md Outdated Show resolved Hide resolved
* Allow iteration over contiguous spans of keys
* Act as a cache, so we don't repeat I/O to disk for reads we've already done
* Note: We actually fail to achieve this for iteration right now
* Note: Need to consider this getting too large and dropping some cached reads
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I'm aware of, no.

@yihuang
Copy link
Collaborator

yihuang commented Dec 17, 2022

FYI, I just did a relatively big refactoring on cachekv: #14350

tac0turtle and others added 2 commits December 28, 2022 11:10
Co-authored-by: Aleksandr Bezobchuk <alexanderbez@users.noreply.github.com>
@tac0turtle
Copy link
Member

merging this and lets handle the changes in a follow up pr

@tac0turtle tac0turtle enabled auto-merge (squash) December 28, 2022 10:11
@tac0turtle tac0turtle merged commit 741f4ae into cosmos:main Dec 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants