State sync support #5803

Closed
wants to merge 79 commits into from
79 commits
e75f206
Use local iavl module
erikgrinaker Mar 4, 2020
c6201ba
Added initial snapshot settings
erikgrinaker Mar 4, 2020
bd2af17
Initial functional snapshot/restore API
erikgrinaker Mar 5, 2020
2424a7d
Added compression and chunking
erikgrinaker Mar 5, 2020
383cf94
Code cleanups
erikgrinaker Mar 5, 2020
aabc422
Added benchmarks
erikgrinaker Mar 5, 2020
dcc7673
More benchmarks
erikgrinaker Mar 5, 2020
780dae5
Buffer snapshot writers
erikgrinaker Mar 9, 2020
e91f273
Minor tweaks
erikgrinaker Mar 11, 2020
6bcf763
Type fix
erikgrinaker Mar 11, 2020
ef46535
Ignore caches during export
erikgrinaker Mar 13, 2020
a939897
Use local tm-db as well
erikgrinaker Mar 13, 2020
4585f8e
Initial snapshot store
erikgrinaker Mar 13, 2020
88e5355
Simplified Snapshotter interface
erikgrinaker Mar 13, 2020
6aed0b1
Split chunk writer and reader to separate file
erikgrinaker Mar 13, 2020
d7a6dbe
Cleaned up multistore snapshot/restore
erikgrinaker Mar 13, 2020
672db15
Improved snapshotting
erikgrinaker Mar 13, 2020
cd1b2d8
Properly close exporters and importers
erikgrinaker Mar 13, 2020
11de242
Added snapshot loading
erikgrinaker Mar 13, 2020
11fb197
Added snapshot pruning
erikgrinaker Mar 13, 2020
ad61eb1
Added snapshot listing
erikgrinaker Mar 13, 2020
6eb2ffe
Use prefix db for snapshot store
erikgrinaker Mar 13, 2020
dd76cb7
Added auxiliary snapshot function for BaseApp
erikgrinaker Mar 17, 2020
bf8acad
Merge branch 'master' into erik/snapshot
erikgrinaker Mar 17, 2020
a257ff3
go.mod: remove local tm-db.
erikgrinaker Mar 17, 2020
4251e06
Moved rootmulti snapshot contents to separate store/types
erikgrinaker Mar 17, 2020
31f8eee
Moved snapshot store to separate package
erikgrinaker Mar 17, 2020
46a3941
Added format parameter for Snapshotter interface
erikgrinaker Mar 18, 2020
c7a7513
Removed unused snapshotFormat variable
erikgrinaker Mar 18, 2020
bf54794
Don't set up a snapshot store automatically
erikgrinaker Mar 18, 2020
08713b1
Added tests for snapshots.Store
erikgrinaker Mar 18, 2020
56fc9d9
Minor tweaks
erikgrinaker Mar 18, 2020
394e097
go.mod: use iavl 0.13.2
erikgrinaker Mar 19, 2020
316dace
Updated changelog
erikgrinaker Mar 19, 2020
a39974f
Appease linter
erikgrinaker Mar 19, 2020
e984eee
Merge branch 'master' into erik/snapshot
erikgrinaker Mar 19, 2020
d8d455d
Added snapshot options
erikgrinaker Mar 19, 2020
bef3c86
Fix nil dereferencing in chunkWriter.CloseWithError()
erikgrinaker Mar 19, 2020
c3528dc
Merge branch 'master' into erik/snapshot
erikgrinaker Mar 26, 2020
076865a
Protobuf formatting fix
erikgrinaker Mar 26, 2020
fe8c88b
Return chunk metadata as well in Store.LoadChunk()
erikgrinaker Mar 26, 2020
33fa307
Add snapshots.Restorer()
erikgrinaker Mar 26, 2020
a34e12c
Typo
erikgrinaker Mar 27, 2020
ba031f6
Use table-driven tests for rootmulti.Store snapshot/restore error tests
erikgrinaker Mar 28, 2020
e507801
use zlib compression for snapshots
erikgrinaker Mar 28, 2020
3df5b25
add checksum test for snapshot format stability
erikgrinaker Mar 28, 2020
09ff5c1
use larger generated dataset for snapshot checksum test
erikgrinaker Mar 28, 2020
2f7d969
simplify snapshot management for new ABCI interface
erikgrinaker Mar 29, 2020
67c0faa
bump snapshot chunk size to 10 MB
erikgrinaker Mar 29, 2020
828ae14
use sha256 hashes for snapshot chunks
erikgrinaker Mar 29, 2020
5714fed
Merge branch 'master' into erik/snapshot
erikgrinaker Apr 1, 2020
624c4f5
snapshots: added Store.GetLatest()
erikgrinaker Apr 1, 2020
d3afa8b
baseapp: check for newer snapshots, to avoid snapshotting during replay
erikgrinaker Apr 1, 2020
c46438e
baseapp: fix nil dereferencing
erikgrinaker Apr 1, 2020
5cac251
Implemented ABCI snapshot skeleton
erikgrinaker Mar 26, 2020
857b3dd
Ported Tendermint API changes
erikgrinaker Mar 26, 2020
7d0009c
Implement ABCI snapshot interface
erikgrinaker Mar 26, 2020
994b731
don't limit ListSnapshots to 100, caller can do this
erikgrinaker Mar 28, 2020
ab71bcb
updated with simplified ABCI interface
erikgrinaker Mar 29, 2020
a753a0e
update with new chunk size
erikgrinaker Mar 29, 2020
76b79b1
use sha-256 chunk hashes
erikgrinaker Mar 29, 2020
5afe4f4
update with TM rpc/client changes
erikgrinaker Apr 3, 2020
836fa5e
add snapshots.Manager, restructure code, and write tests
erikgrinaker Apr 3, 2020
8efccd5
reduce timeout
erikgrinaker Apr 4, 2020
679c2f0
add test for restoring empty IAVL stores
erikgrinaker Apr 4, 2020
7a81f0a
go.mod: bump iavl
erikgrinaker Apr 5, 2020
fb6580c
check for error when importing IAVL nodes
erikgrinaker Apr 14, 2020
f895ca1
handle empty keys and values via Protobuf
erikgrinaker Apr 14, 2020
94f1e44
Merge branch 'master' into erik/snapshot
erikgrinaker Apr 24, 2020
d2fb90e
change tmkv.Pair to abci.EventAttribute
erikgrinaker Apr 24, 2020
432ff4a
initial port to new state sync ABCI interface
erikgrinaker Apr 24, 2020
3c52b05
don't snapshot mem.Store stores
erikgrinaker Apr 24, 2020
e7cb0be
Merge branch 'master' into erik/snapshot
erikgrinaker Apr 29, 2020
5a9def7
minor tweaks
erikgrinaker May 6, 2020
0888800
handle chunk hash verification
erikgrinaker May 6, 2020
683bd35
use pruning options for config
erikgrinaker May 6, 2020
bf04df1
improve error handling
erikgrinaker May 6, 2020
963432f
remove snapshot flags
erikgrinaker May 6, 2020
4c3cdac
use new ABCI enums
erikgrinaker May 7, 2020
10 changes: 10 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -81,6 +81,14 @@ to now accept a `codec.JSONMarshaler` for modular serialization of genesis state
* (keys) [\#5820](https://github.com/cosmos/cosmos-sdk/pull/5820/) Removed method CloseDB from Keybase interface.
* (baseapp) [\#5837](https://github.com/cosmos/cosmos-sdk/issues/5837) Transaction simulation now returns a `SimulationResponse` which contains the `GasInfo` and
`Result` from the execution.
* (store) [\#5803](https://github.com/cosmos/cosmos-sdk/pull/5803) The `store.CommitMultiStore` interface now includes the new `snapshots.Snapshotter` interface as well.
* (crypto/keyring) [\#5866](https://github.com/cosmos/cosmos-sdk/pull/5866) Move `Keyring` and `Keybase` implementations and their associated types from `crypto/keys/` to `crypto/keyring/`.
* (crypto) [\#5880](https://github.com/cosmos/cosmos-sdk/pull/5880) Merge `crypto/keys/mintkey` into `crypto`.
* (crypto/keyring) [\#5858](https://github.com/cosmos/cosmos-sdk/pull/5858) Make Keyring store keys by name and address's hexbytes representation.
* (crypto/keyring) [\#5889](https://github.com/cosmos/cosmos-sdk/pull/5889) Deprecate old keybase implementation:
  - Remove `Update` from the `Keybase` interface.
  - `NewKeyring()` now accepts a new backend: `MemoryBackend`.
  - `New()` has been renamed to `NewLegacy()`, which now returns a `LegacyKeybase` type that only allows migration of keys from the legacy keybase to a new keyring.
* (client/input) [\#5904](https://github.com/cosmos/cosmos-sdk/pull/5904) Removal of unnecessary `GetCheckPassword`, `PrintPrefixed` functions.
* (client/keys) [\#5889](https://github.com/cosmos/cosmos-sdk/pull/5889) Rename `NewKeyBaseFromDir()` -> `NewLegacyKeyBaseFromDir()`.
* (crypto) [\#5880](https://github.com/cosmos/cosmos-sdk/pull/5880) Merge `crypto/keys/mintkey` into `crypto`.
@@ -99,6 +107,8 @@ information on how to implement the new `Keyring` interface.
### Features

* (x/ibc) [\#5588](https://github.com/cosmos/cosmos-sdk/pull/5588) Add [ICS 024 - Host State Machine Requirements](https://github.com/cosmos/ics/tree/master/spec/ics-024-host-requirements) subpackage to `x/ibc` module.
* (baseapp) [\#5803](https://github.com/cosmos/cosmos-sdk/pull/5803) Added support for taking state snapshots at regular height intervals, via options `snapshot-interval` and `snapshot-retention`.
* (store) [\#5803](https://github.com/cosmos/cosmos-sdk/pull/5803) Added `rootmulti.Store` methods for taking and restoring snapshots, based on `iavl.Store` export/import.
* (x/ibc) [\#5277](https://github.com/cosmos/cosmos-sdk/pull/5277) `x/ibc` changes from IBC alpha. For more details check the [`x/ibc/spec`](https://github.com/cosmos/tree/master/x/ibc/spec) directory:
  * [ICS 002 - Client Semantics](https://github.com/cosmos/ics/tree/master/spec/ics-002-client-semantics) subpackage
  * [ICS 003 - Connection Semantics](https://github.com/cosmos/ics/blob/master/spec/ics-003-connection-semantics) subpackage
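
As a rough illustration of the interval and retention behaviour described in the changelog entries above, the following self-contained sketch shows which heights would trigger a snapshot and which snapshots would survive pruning. The helper names `shouldSnapshot` and `retained` are hypothetical and not part of the SDK. The sketch also assumes that pruning keeps the most recent N snapshots, which is consistent with the PR's `Prune(snapshotRetention)` call but is not spelled out in the changelog.

```go
package main

import "fmt"

// shouldSnapshot reports whether a snapshot is due at the given height.
// An interval of 0 disables snapshotting, mirroring the check in Commit().
// Hypothetical helper, for illustration only.
func shouldSnapshot(height int64, interval uint64) bool {
	return interval > 0 && uint64(height)%interval == 0
}

// retained returns the snapshot heights that survive pruning when only the
// most recent `retention` snapshots are kept (assumption). The input must be
// sorted in ascending order. A retention of 0 or less keeps everything.
func retained(heights []uint64, retention int) []uint64 {
	if retention <= 0 || len(heights) <= retention {
		return heights
	}
	return heights[len(heights)-retention:]
}

func main() {
	var taken []uint64
	for h := int64(1); h <= 5000; h++ {
		if shouldSnapshot(h, 1000) {
			taken = append(taken, uint64(h))
		}
	}
	fmt.Println(taken)              // [1000 2000 3000 4000 5000]
	fmt.Println(retained(taken, 2)) // [4000 5000]
}
```

With an interval of 1000 and a retention of 2, a node at height 5000 would hold only the snapshots for heights 4000 and 5000.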
121 changes: 121 additions & 0 deletions baseapp/abci.go
@@ -1,6 +1,7 @@
package baseapp

import (
	"errors"
	"fmt"
	"os"
	"sort"
@@ -10,6 +11,7 @@ import (
	abci "github.com/tendermint/tendermint/abci/types"

	"github.com/cosmos/cosmos-sdk/codec"
	"github.com/cosmos/cosmos-sdk/snapshots"
	sdk "github.com/cosmos/cosmos-sdk/types"
	sdkerrors "github.com/cosmos/cosmos-sdk/types/errors"
)
@@ -263,6 +265,10 @@ func (app *BaseApp) Commit() (res abci.ResponseCommit) {
		app.halt()
	}

	if app.snapshotInterval > 0 && uint64(header.Height)%app.snapshotInterval == 0 {
		go app.snapshot(header.Height)
	}

	return abci.ResponseCommit{
		Data: commitID.Hash,
	}
@@ -290,6 +296,27 @@ func (app *BaseApp) halt() {
	os.Exit(0)
}

// snapshot takes a snapshot of the current state and prunes any old snapshots.
func (app *BaseApp) snapshot(height int64) {
	app.logger.Info("Taking state snapshot", "height", height)
	snapshot, err := app.snapshotManager.Take(uint64(height))
	if err != nil {
		app.logger.Error("Failed to take state snapshot", "height", height, "err", err)
		return
	}
	app.logger.Info("Completed state snapshot", "height", height, "format", snapshot.Format)

	if snapshotRetention > 0 {
		app.logger.Debug("Pruning state snapshots")
		pruned, err := app.snapshotManager.Prune(snapshotRetention)
		if err != nil {
			app.logger.Error("Failed to prune state snapshots", "err", err.Error())
			return
		}
		app.logger.Debug("Pruned state snapshots", "pruned", pruned)
	}
}

// Query implements the ABCI interface. It delegates to CommitMultiStore if it
// implements Queryable.
func (app *BaseApp) Query(req abci.RequestQuery) abci.ResponseQuery {
@@ -316,6 +343,100 @@ func (app *BaseApp) Query(req abci.RequestQuery) abci.ResponseQuery {
	return sdkerrors.QueryResult(sdkerrors.Wrap(sdkerrors.ErrUnknownRequest, "unknown query path"))
}

// ListSnapshots implements the ABCI interface. It delegates to app.snapshotManager if set.
func (app *BaseApp) ListSnapshots(req abci.RequestListSnapshots) abci.ResponseListSnapshots {
	resp := abci.ResponseListSnapshots{Snapshots: []*abci.Snapshot{}}
	if app.snapshotManager == nil {
		return resp
	}

	snapshots, err := app.snapshotManager.List()
	if err != nil {
		app.logger.Error("Failed to list snapshots", "err", err.Error())
		return resp
	}
	for _, snapshot := range snapshots {
		abciSnapshot, err := snapshot.ToABCI()
		if err != nil {
			app.logger.Error("Failed to convert snapshot to ABCI format", "err", err.Error())
			return resp
		}
		resp.Snapshots = append(resp.Snapshots, &abciSnapshot)
	}

	return resp
}

// LoadSnapshotChunk implements the ABCI interface. It delegates to app.snapshotManager if set.
func (app *BaseApp) LoadSnapshotChunk(req abci.RequestLoadSnapshotChunk) abci.ResponseLoadSnapshotChunk {
	if app.snapshotManager == nil {
		return abci.ResponseLoadSnapshotChunk{}
	}
	chunk, err := app.snapshotManager.LoadChunk(req.Height, req.Format, req.Chunk)
	if err != nil {
		app.logger.Error("Failed to load snapshot chunk", "height", req.Height, "format", req.Format,
			"chunk", req.Chunk, "err", err.Error())
		return abci.ResponseLoadSnapshotChunk{}
	}
	return abci.ResponseLoadSnapshotChunk{Chunk: chunk}
}

// OfferSnapshot implements the ABCI interface. It delegates to app.snapshotManager if set.
func (app *BaseApp) OfferSnapshot(req abci.RequestOfferSnapshot) abci.ResponseOfferSnapshot {
	// Guard against a missing manager, as ListSnapshots and LoadSnapshotChunk do.
	if app.snapshotManager == nil {
		app.logger.Error("Snapshot manager not configured")
		return abci.ResponseOfferSnapshot{Result: abci.ResponseOfferSnapshot_ABORT}
	}

	if req.Snapshot == nil {
		app.logger.Error("Received nil snapshot")
		return abci.ResponseOfferSnapshot{Result: abci.ResponseOfferSnapshot_REJECT}
	}

	snapshot, err := snapshots.SnapshotFromABCI(req.Snapshot)
	if err != nil {
		app.logger.Error("Failed to decode snapshot metadata", "err", err)
		return abci.ResponseOfferSnapshot{Result: abci.ResponseOfferSnapshot_REJECT}
	}
	err = app.snapshotManager.Restore(snapshot)
	switch {
	case err == nil:
		return abci.ResponseOfferSnapshot{Result: abci.ResponseOfferSnapshot_ACCEPT}

	case errors.Is(err, snapshots.ErrUnknownFormat):
		return abci.ResponseOfferSnapshot{Result: abci.ResponseOfferSnapshot_REJECT_FORMAT}

	case errors.Is(err, snapshots.ErrInvalidMetadata):
		app.logger.Error("Rejecting invalid snapshot", "height", req.Snapshot.Height,
			"format", req.Snapshot.Format, "err", err.Error())
		return abci.ResponseOfferSnapshot{Result: abci.ResponseOfferSnapshot_REJECT}

	default:
		app.logger.Error("Failed to restore snapshot", "height", req.Snapshot.Height,
			"format", req.Snapshot.Format, "err", err.Error())
		// We currently don't support resetting the IAVL stores and retrying a different snapshot,
		// so we ask Tendermint to abort all snapshot restoration.
		return abci.ResponseOfferSnapshot{Result: abci.ResponseOfferSnapshot_ABORT}
	}
}

// ApplySnapshotChunk implements the ABCI interface. It delegates to app.snapshotManager if set.
func (app *BaseApp) ApplySnapshotChunk(req abci.RequestApplySnapshotChunk) abci.ResponseApplySnapshotChunk {
	// Guard against a missing manager, as ListSnapshots and LoadSnapshotChunk do.
	if app.snapshotManager == nil {
		app.logger.Error("Snapshot manager not configured")
		return abci.ResponseApplySnapshotChunk{Result: abci.ResponseApplySnapshotChunk_ABORT}
	}

	_, err := app.snapshotManager.RestoreChunk(req.Chunk)
	switch {
	case err == nil:
		return abci.ResponseApplySnapshotChunk{Result: abci.ResponseApplySnapshotChunk_ACCEPT}

	case errors.Is(err, snapshots.ErrChunkHashMismatch):
		app.logger.Error("Chunk checksum mismatch, rejecting sender and requesting refetch",
			"chunk", req.Index, "sender", req.Sender, "err", err)
		return abci.ResponseApplySnapshotChunk{
			Result:        abci.ResponseApplySnapshotChunk_RETRY,
			RefetchChunks: []uint32{req.Index},
			RejectSenders: []string{req.Sender},
		}

	default:
		app.logger.Error("Failed to restore snapshot", "err", err.Error())
		return abci.ResponseApplySnapshotChunk{Result: abci.ResponseApplySnapshotChunk_ABORT}
	}
}

func handleQueryApp(app *BaseApp, path []string, req abci.RequestQuery) abci.ResponseQuery {
	if len(path) >= 2 {
		switch path[1] {
6 changes: 6 additions & 0 deletions baseapp/baseapp.go
@@ -11,12 +11,14 @@ import (
	"github.com/tendermint/tendermint/libs/log"
	dbm "github.com/tendermint/tm-db"

	"github.com/cosmos/cosmos-sdk/snapshots"
	"github.com/cosmos/cosmos-sdk/store"
	sdk "github.com/cosmos/cosmos-sdk/types"
	sdkerrors "github.com/cosmos/cosmos-sdk/types/errors"
)

const (
	snapshotRetention = 2
	runTxModeCheck runTxMode = iota // Check a transaction
	runTxModeReCheck // Recheck a (pending) transaction after a commit
	runTxModeSimulate // Simulate a transaction
@@ -62,6 +64,10 @@ type BaseApp struct { // nolint: maligned
	idPeerFilter sdk.PeerFilter // filter peers by node ID
	fauxMerkleMode bool // if true, IAVL MountStores uses MountStoresDB for simulation speed.

	// manages snapshots, i.e. dumps of app state at certain intervals
	snapshotManager *snapshots.Manager
	snapshotInterval uint64 // block interval between snapshots - must be multiple of IAVL SnapshotEvery

	// volatile states:
	//
	// checkState is set on InitChain and reset on Commit
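
The `snapshotInterval` field comment states that the interval must be a multiple of IAVL's `SnapshotEvery`, but the visible diff does not show where that constraint is enforced. A hedged sketch of a validation helper is given below. `validateSnapshotInterval` is a hypothetical function, and treating a zero `SnapshotEvery` as an error is an assumption made here to avoid a division by zero, not behaviour confirmed by the PR.

```go
package main

import "fmt"

// validateSnapshotInterval checks that a state-sync snapshot interval lines
// up with the interval at which IAVL persists versions (SnapshotEvery).
// Hypothetical helper, for illustration only.
func validateSnapshotInterval(snapshotInterval, iavlSnapshotEvery uint64) error {
	if snapshotInterval == 0 {
		return nil // snapshots disabled, nothing to validate
	}
	if iavlSnapshotEvery == 0 {
		// Assumption: without persisted IAVL versions there is nothing to export.
		return fmt.Errorf("snapshot interval %d set, but IAVL SnapshotEvery is 0", snapshotInterval)
	}
	if snapshotInterval%iavlSnapshotEvery != 0 {
		return fmt.Errorf("snapshot interval %d is not a multiple of IAVL SnapshotEvery %d",
			snapshotInterval, iavlSnapshotEvery)
	}
	return nil
}

func main() {
	fmt.Println(validateSnapshotInterval(1000, 100)) // <nil>
	fmt.Println(validateSnapshotInterval(150, 100))  // snapshot interval 150 is not a multiple of IAVL SnapshotEvery 100
}
```

Running a check like this once at startup would surface a misconfiguration early, before the first scheduled snapshot is attempted.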