Fix Restart application issue #5579
On further thought, I don't think the sim tests are necessary here. I don't believe they test anything that baseapp doesn't. So ready for review!!
Also needs a CHANGELOG entry, but I can add that once people agree on what needs to be added.
Anything else?
Co-Authored-By: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>
Very interesting, so Query is reading from disk directly, not just via iavl. There was another outstanding issue about queries and db locks... maybe this is related?
If we were able to query (by height), why can we not also get the commit info by that same height?
Ready for review again. Manual tests performed:
Changes look good. There are a few small outstanding items to be addressed.
store/iavl/store.go
Outdated
@@ -94,18 +138,41 @@ func (st *Store) Commit() types.CommitID {
panic(err)
}

return types.CommitID{
st.version = version
flushed := st.pruning.FlushVersion(version)
@AdityaSripal do you have a response to this? Or is this even valid?
…os-sdk into aditya/pruning-fixes
ACK
ACK, Great work with documentation @alexanderbez! thanks for helping
Codecov Report
@@ Coverage Diff @@
## master #5579 +/- ##
==========================================
- Coverage 44.53% 44.51% -0.03%
==========================================
Files 324 324
Lines 24672 24700 +28
==========================================
+ Hits 10988 10995 +7
- Misses 12627 12645 +18
- Partials 1057 1060 +3
I just saw this was merged to master, which has some major breaking changes from v0.38.0. I was expecting this to go into releases/v0.38.0 for a v0.38.1 release. Such a v0.38.1 bugfix is blocking at least 3 chains. I will do a review on the diff now.
Looks good and seems quite correct to me.
Added a number of relatively minor comments that you may want to address in a smaller follow-up PR.
determines which committed heights are flushed to disk and `SnapshotEvery` determines which of these
heights are kept after pruning. The `IsValid` method should be called whenever using these options. Methods
`SnapshotVersion` and `FlushVersion` accept a version argument and determine if the version should be
flushed to disk or kept as a snapshot. Note, `KeepRecent` is automatically inferred from the options
+1 for automatically inferring, and less confusion in configs
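For readers following the documentation hunk above, here is a minimal sketch of how the two options might behave. The field and method names come from the quoted docs; the method bodies, and the assumption that `IsValid` requires every snapshot height to also be a flushed height, are illustrative guesses rather than the code merged in this PR.

```go
package main

import "fmt"

// PruningOptions is a simplified stand-in for the options discussed in this
// PR. Field names follow the quoted docs; the logic below is an assumption.
type PruningOptions struct {
	KeepEvery     int64 // flush every KeepEvery-th version to disk
	SnapshotEvery int64 // keep every SnapshotEvery-th version permanently; 0 disables snapshots
}

// IsValid assumes that flushing needs a positive interval and that every
// snapshot height must also be a flushed height.
func (po PruningOptions) IsValid() bool {
	if po.KeepEvery <= 0 || po.SnapshotEvery < 0 {
		return false
	}
	return po.SnapshotEvery%po.KeepEvery == 0
}

// FlushVersion reports whether the given version would be flushed to disk.
func (po PruningOptions) FlushVersion(ver int64) bool {
	return po.KeepEvery != 0 && ver%po.KeepEvery == 0
}

// SnapshotVersion reports whether the given version would survive pruning.
func (po PruningOptions) SnapshotVersion(ver int64) bool {
	return po.SnapshotEvery != 0 && ver%po.SnapshotEvery == 0
}

func main() {
	opts := PruningOptions{KeepEvery: 100, SnapshotEvery: 1000}
	fmt.Println("valid:", opts.IsValid())
	for _, v := range []int64{99, 100, 1000} {
		fmt.Printf("version %d: flush=%t snapshot=%t\n", v, opts.FlushVersion(v), opts.SnapshotVersion(v))
	}
}
```

With these assumed semantics, version 100 is flushed but later pruned, while version 1000 is both flushed and kept as a snapshot.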
// If KeepEvery = 1, keepRecent should be 0 since there is no need to keep
// latest version in an in-memory cache.
//
// If KeepEvery > 1, keepRecent should be 1 so that state changes in between
Huh, that's what 1 does? I would have thought it would have to equal `KeepEvery` or something.
But I guess it doesn't if we only want "query last block"
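The inference rule described in the quoted code comment can be sketched as a small function. `inferKeepRecent` is a hypothetical name introduced here for illustration; only the two cases stated in the comment are modelled.

```go
package main

import "fmt"

// inferKeepRecent is a hypothetical helper mirroring the rule in the quoted
// comment: when every version is flushed (KeepEvery == 1) nothing needs to
// stay in the in-memory cache; otherwise the latest version is kept there so
// that it remains queryable between flushes.
func inferKeepRecent(keepEvery int64) int64 {
	if keepEvery == 1 {
		return 0
	}
	return 1
}

func main() {
	for _, ke := range []int64{1, 10, 100} {
		fmt.Printf("KeepEvery=%d -> keepRecent=%d\n", ke, inferKeepRecent(ke))
	}
}
```

This matches the reviewer's reading: a value of 1 is enough when the only requirement is being able to query the last block.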
// Previous flushed version should only be pruned if the previous version is
// not a snapshot version OR if snapshotting is disabled (SnapshotEvery == 0).
if previous != 0 && !st.pruning.SnapshotVersion(previous) {
Maybe `previous > 0` is "safer".
Though I cannot think of a scenario when this could be negative, still better safe
// not a snapshot version OR if snapshotting is disabled (SnapshotEvery == 0).
if previous != 0 && !st.pruning.SnapshotVersion(previous) {
err := st.tree.DeleteVersion(previous)
if errCause := errors.Cause(err); errCause != nil && errCause != iavl.ErrVersionDoesNotExist {
How about
if err != nil && !iavl.ErrVersionDoesNotExist.Is(err)
https://github.com/cosmos/cosmos-sdk/blob/master/types/errors/errors.go#L175
++
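The two suggestions above (guarding with `previous > 0` and matching the error by identity rather than by unwrapped cause) can be tried out in isolation. The sketch below uses only the Go standard library: `errVersionDoesNotExist`, `fakeTree`, and `prunePrevious` are hypothetical stand-ins, and the standard `errors.Is` is used in place of the SDK error type's `Is` method that the reviewer links to.

```go
package main

import (
	"errors"
	"fmt"
)

// errVersionDoesNotExist is a stand-in for iavl.ErrVersionDoesNotExist.
var errVersionDoesNotExist = errors.New("version does not exist")

// fakeTree is a toy replacement for the IAVL tree: it only tracks which
// versions are present.
type fakeTree struct{ versions map[int64]bool }

// DeleteVersion removes a version, returning a wrapped sentinel error when
// the version is not present.
func (t *fakeTree) DeleteVersion(v int64) error {
	if !t.versions[v] {
		return fmt.Errorf("delete version %d: %w", v, errVersionDoesNotExist)
	}
	delete(t.versions, v)
	return nil
}

// prunePrevious deletes the previously flushed version unless it is a
// snapshot. A missing version is tolerated; any other error is returned.
func prunePrevious(t *fakeTree, previous int64, isSnapshot func(int64) bool) error {
	if previous > 0 && !isSnapshot(previous) {
		if err := t.DeleteVersion(previous); err != nil && !errors.Is(err, errVersionDoesNotExist) {
			return err
		}
	}
	return nil
}

func main() {
	tree := &fakeTree{versions: map[int64]bool{100: true, 1000: true}}
	isSnapshot := func(v int64) bool { return v%1000 == 0 }

	fmt.Println(prunePrevious(tree, 100, isSnapshot), tree.versions[100])   // pruned
	fmt.Println(prunePrevious(tree, 1000, isSnapshot), tree.versions[1000]) // snapshot kept
	fmt.Println(prunePrevious(tree, 300, isSnapshot))                       // missing version tolerated
}
```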
setLatestVersion(batch, version)
batch.Write()
// write CommitInfo to disk only if this version was flushed to disk
if rs.pruningOpts.FlushVersion(version) {
Much nicer. We update in memory and response every time, and just use this check to keep CommitInfo in sync with the underlying iavl substores
// Otherwise, we query for the commit info from disk.
var commitInfo commitInfo

if res.Height == rs.lastCommitInfo.Version {
Okay, so we can always query the last one (guaranteed to be in the memory cache via the auto-set KeepRecent) and any other one that was flushed to disk. Seems good.
User providing height = 0 leading to "current height" is taken care of elsewhere (in baseapp), right?
> User providing height = 0 leading to "current height" is taken care of elsewhere (in baseapp), right?
Yes!
// PruneEverything defines a pruning strategy where all committed states will
// be deleted, persisting only the current state.
PruneEverything = PruningOptions{
KeepEvery: 1,
Interesting: with this option, you will have significantly more IO than `PruneSyncable`, yet much less disk usage.
What about keeping just one version flushed to disk (as now), but only flushing every 10 blocks or so (just increment `KeepEvery`)? This would be helpful for nodes that need throughput.
I would prefer to keep the sound and most "general" strategies as we have now, but allow the CLI/start command to accept manual values for these instead of providing a bunch of other strategies, i.e. gaiad start --pruning-keep-every ... --pruning-snapshot-every ...
Actually, it is probably best to set these in `app.toml` instead of (or in addition to) the CLI.
But I agree with your point: leave these 3 safe named possibilities, and then allow users to take the risk and shoot themselves in the foot if they want to do advanced configuration.
Description
Fixes the restart issue by making sure that CommitID contains the last hash that was persisted to disk, rather than the hash of the last block that was executed.
Improves pruning flexibility by splitting the old `KeepEvery` parameter into `KeepEvery` and `SnapshotEvery` parameters.
The `KeepEvery` parameter defines how often the state gets flushed to disk. The `SnapshotEvery` parameter defines the snapshot blocks that will remain on disk permanently.
On each flush, the SDK removes the last flushed state unless it is a snapshot version.
closes: #5570
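The flush-and-prune rule in the description can be simulated to see which versions would remain on disk after a run of commits. `versionsOnDisk` is a hypothetical helper written for this illustration; it assumes a positive `keepEvery` and models only the rule stated above, not the actual store implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// versionsOnDisk simulates commits 1..latest and returns the versions still
// on disk afterwards. A version is flushed when it is a multiple of
// keepEvery (assumed > 0); on each flush the previously flushed version is
// removed unless it is a snapshot (a multiple of snapshotEvery, with 0
// meaning snapshots are disabled).
func versionsOnDisk(keepEvery, snapshotEvery, latest int64) []int64 {
	onDisk := map[int64]bool{}
	var previous int64 // last flushed version, 0 if nothing was flushed yet

	for v := int64(1); v <= latest; v++ {
		if v%keepEvery != 0 {
			continue // this version lives in memory only
		}
		onDisk[v] = true
		isSnapshot := snapshotEvery != 0 && previous%snapshotEvery == 0
		if previous > 0 && !isSnapshot {
			delete(onDisk, previous)
		}
		previous = v
	}

	out := make([]int64, 0, len(onDisk))
	for v := range onDisk {
		out = append(out, v)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	// Flush every 100 blocks, snapshot every 1000, run up to height 2350.
	fmt.Println(versionsOnDisk(100, 1000, 2350)) // [1000 2000 2300]
}
```

Under these assumptions only the snapshots (1000, 2000) and the most recently flushed version (2300) survive; with `keepEvery = 1` and snapshots disabled, only the latest version remains.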