This repository has been archived by the owner on Feb 23, 2022. It is now read-only.

RFC: Soft Chain Upgrades #222

Closed · wants to merge 8 commits

Conversation

@cmwaters (Contributor) commented Nov 16, 2020

This is an RFC proposing support for soft upgrades, allowing the lifespan of a chain to span multiple block versions, with upgrades happening while the network is still live.

Rendered

@melekes (Contributor) left a comment:

This is a good start, although a lot of details are still missing.

Question: if node A is on block protocol v1.4 and node B is on v1.3, and node A sends a block proposal to node B, how would node B go about handling this? Will it just ignore it, since it can't read it yet (because it hasn't yet downloaded the new version)?

@cmwaters (Author) commented Nov 20, 2020

EDIT: Most of the information here has been integrated into the RFC itself

I have been thinking about chain migration and have come up with the following...

Chain Migration

Imperatives

  • All state modifying data from the original block must be preserved
  • Light client and fast sync verification models must be unaffected

Strategy

Migration scripts will need to be tailored to the exact changes, but the general notion is that the validator set directly following the upgrade signs off not just on the previous block but on the migration of the entire history.

Problems

Whenever we change the header (or anything in the block, for that matter, as this too means the hashes in the header need to change), we break the signatures in the commit. At least 2/3 of the validators in the set at that height agreed on a blockID, the hash of the original header, which has since changed. This block ID can be transferred across to the derived block, but we need some way of verifying that this new block does actually derive from the original one. This is what the validators at the height of the upgrade need to agree on, not just for a single block but for the entire history.

Possible Solutions

This is a random walk through the solution space of what could be done to achieve version upgrading whilst satisfying the aforementioned imperatives.

Simple Approach

One basic approach is that each derived version should be capable of producing its original. I.e., if the chain is at version 4 and a fast-syncing node is at block 100, which originally corresponds to version 2, then when it receives a derived block it should be able to extract the original for verification purposes. However, the derived block should also reflect the structure of the latest version.

The shortcoming of this is that we would expect the stored derived block to be larger than normal, as it needs to hold the relevant data for both versions. This would go against a lot of the proposed plans on the horizon, which aim to reduce the overall size of the block. Verification would take a little longer, but we wouldn't need to worry about distant future validators signing off on the change, as it shouldn't be possible to introduce any byzantine behavior (either the derived block can produce the original block or it can't).
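
A minimal sketch of what this could look like (all type and field names here are assumptions, not something the RFC specifies, and the hash construction is simplified; real Tendermint header hashes are Merkle roots over the header fields):

```go
package migration

import "crypto/sha256"

// DerivedBlock sketches the "simple approach": a block re-encoded in the
// latest format that also retains the original header bytes it was derived
// from, so the originally signed hash can be recomputed on demand.
type DerivedBlock struct {
	NewHeader  []byte   // header re-encoded in the latest block format
	Txs        [][]byte // transaction data, unchanged by migration
	OrigHeader []byte   // original header bytes, kept only for verification
}

// OriginalBlockID recomputes the hash the validators at this height actually
// signed, so the old commit signatures can still be verified.
func (b *DerivedBlock) OriginalBlockID() [32]byte {
	return sha256.Sum256(b.OrigHeader)
}
```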

Advanced Approach

I've been thinking a lot longer about a second approach in which we don't need to be able to rebuild the original block to verify all the way up to the migration height. Currently we have what can be seen as a core verification path: a set of fields or features that are essential for Tendermint's verification process. These allow a node to trust a header. The header is then filled with whatever other hashes allow the user to verify the other components, i.e. data, evidence, last commit, consensus, app. This guarantees state replication.

With chain migration, we implicitly create a division in the types of blocks: those that have been migrated from a previous version and those that are part of the latest version. These blocks ideally have the same structure, of course, but will most likely need to be treated differently, because a migrated block has broken signatures whereas the latest ones don't.

Consensus, aside from the actual agreement on a version change, remains largely unaffected by migration. However, fast syncing and light clients (which basically use the same form of verification) would be affected. To quickly recap how verification works: we start with a trusted validator set. From this, we search for a commit and header corresponding to the height directly above our current state. We rebuild the votes from the commit and verify that the signatures are all for the hash of the header we received by calling VerifyCommit:

func (vals *ValidatorSet) VerifyCommit(chainID string, blockID BlockID,
	height int64, commit *Commit) error

We can then trust the header and, as aforementioned, the rest of the block. Tendermint then delivers the txs to the application, which will in turn update the validator set (with an extra height delay). We use NextValidatorsHash to verify that we have the correct trusted validator set for the next height.

If we were to migrate data to a new block, then we could take the original blockID across. This would mean that we could verify this BlockID, but we would not be able to trust the rest of the contents of the header/block, which means we wouldn't know whether the state transition was correct or whether the new validator set could be trusted to continue verification with.

A solution to this would be as follows:

We have a block that has a set of unalterable fields. They can't change and are essential to the core verification path.

type Block struct {
	...
	// DerivedBlockHash derives from the original block that was signed
	// by the validator set.
	DerivedBlockHash []byte
	NextValidatorsHash []byte
	...
	LastBlockID BlockID // refers to the migrated block header, not the original
	LastCommit *Commit  // or anything that houses the signatures
	...
}

I'm assuming the following relationship:

f(DerivedBlockHash, NextValidatorsHash) = Hash of the original header / BlockID

Part of the migration script would then be to calculate this DerivedBlockHash, or, if it is the migration of an already-migrated block, simply to carry it across.
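
For concreteness, one hypothetical instantiation of f using Tendermint's merkle package (the field set and ordering are assumptions; a real scheme would have to mirror the header's actual Merkle structure exactly):

```go
package migration

import "github.com/tendermint/tendermint/crypto/merkle"

// originalHeaderHash is a hypothetical instantiation of
//   f(DerivedBlockHash, NextValidatorsHash) = original header hash.
// It treats the original header hash as a Merkle root over the migrated
// fields plus the remnant hash of everything that was dropped.
func originalHeaderHash(derivedBlockHash, nextValidatorsHash []byte) []byte {
	return merkle.HashFromByteSlices([][]byte{derivedBlockHash, nextValidatorsHash})
}
```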

Verification would be as follows:

  1. Start from a trusted validator set (this could be from genesis).
  2. Grab the migrated block at the next height, or more specifically the signatures, next validators hash, and derived block hash.
  3. From the next validators hash and the derived block hash, calculate the original header hash that the signatures should be signing.
  4. Check that the LastBlockID is equal to the hash of the trusted header (no need to check this if height is 1).
  5. Verify that 2/3 signed the original header hash by running VerifyCommit. If there is no error, then we can trust at least that the original block ID is correct, and extend that to trusting that the NextValidatorsHash is correct.
  6. Apply the block to state. At this point we still can't trust that the txs were legitimate.
  7. Check that the state's ValidatorsHash (at height + 1 now) matches the NextValidatorsHash. If so, it means that although we can't trust state, we can trust the new validator set.
  8. Go back to 1 and recur until we reach a point where the DerivedBlockHash is nil. This indicates the crossover from the migrated blocks to the original ones.
  9. When validating this block we return to the normal verification process. This means that instead of using the derived block, we take the signatures in the commit and verify them against the hash of the entire header. We can then trust not only the next validators hash but the entire header, including LastBlockID.
  10. Verify that LastBlockID equates to the hash of the last migrated block. If it does, then we have essentially verified the entire migration history and we can trust that the state transitions that have been applied are correct.
  11. Proceed as normal until the node can transition to consensus.
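
A rough sketch of steps 1-8 in Go, reusing the hypothetical originalHeaderHash from above. The MigratedBlock type and the fetch/apply helpers are assumptions made for illustration; only (*types.ValidatorSet).VerifyCommit is the real Tendermint API:

```go
package migration

import (
	"bytes"
	"errors"

	"github.com/tendermint/tendermint/types"
)

// MigratedBlock holds the minimum migrated fields the steps above rely on.
type MigratedBlock struct {
	DerivedBlockHash   []byte
	NextValidatorsHash []byte
	LastBlockID        []byte // hash of the previous *migrated* header
	Commit             *types.Commit
}

func verifyMigratedHistory(
	chainID string,
	vals *types.ValidatorSet, // step 1: trusted validator set (e.g. from genesis)
	fetch func(height int64) *MigratedBlock, // hypothetical block source
	nextVals func(height int64) *types.ValidatorSet, // hypothetical: set after applying the block
	migratedHash func(b *MigratedBlock) []byte, // hypothetical: hash of the migrated header
) error {
	var trustedHash []byte
	for height := int64(1); ; height++ {
		b := fetch(height) // step 2
		if b.DerivedBlockHash == nil {
			return nil // step 8: crossover reached; resume normal verification (steps 9-11)
		}

		// Step 3: recompute the original header hash the signatures cover.
		origHash := originalHeaderHash(b.DerivedBlockHash, b.NextValidatorsHash)

		// Step 4: the migrated chain must still link header to header.
		if height > 1 && !bytes.Equal(b.LastBlockID, trustedHash) {
			return errors.New("migrated block does not extend the trusted header")
		}

		// Step 5: 2/3+ of the trusted set must have signed the original hash.
		if err := vals.VerifyCommit(chainID, types.BlockID{Hash: origHash}, height, b.Commit); err != nil {
			return err
		}

		// Steps 6-7: apply the block, then check the resulting validator set
		// against NextValidatorsHash before trusting it.
		next := nextVals(height)
		if !bytes.Equal(next.Hash(), b.NextValidatorsHash) {
			return errors.New("validator set does not match NextValidatorsHash")
		}
		vals, trustedHash = next, migratedHash(b)
	}
}
```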

NOTE: The process outlined above assumes the minimum set of fields that need to be migrated across. As it stands, it means we can't trust state until we reach the crossover point. However, this shortcoming isn't unavoidable, if we decide to migrate some form of state/app hash such that this still holds:

f(AppHash + NextValidatorsHash, DerivedBlockHash) = Original Block ID

Then we can incrementally verify state transitions. This means we don't have to download the entire migration part of the blockchain before we know that state is valid.

In this regard, DerivedBlockHash serves as the remnant of the hashes/data in the old header that we don't want to bring across because we have changed them.
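
Under the same assumptions as the earlier sketch, the variant that folds the app hash into the core verification data could look like:

```go
package migration

import "github.com/tendermint/tendermint/crypto/merkle"

// originalHeaderHashWithApp is a hypothetical variant of originalHeaderHash
// that also commits to the app hash, so each state transition can be checked
// as soon as its block is verified, rather than only at the crossover point.
func originalHeaderHashWithApp(appHash, nextValidatorsHash, derivedBlockHash []byte) []byte {
	return merkle.HashFromByteSlices([][]byte{appHash, nextValidatorsHash, derivedBlockHash})
}
```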

In terms of the actual upgrade: when consensus is reached on a block that causes the block version to increment, the next block proposed should not only have the new block structure, but its LastBlockID should be that of the migrated block, not the original one. This means that before the next block of the new version can be agreed upon, everyone participating in consensus must have migrated the entire history in order to generate the "golden hash", the cumulative hash of the migrated history. If this is three years of blocks, migration could take a long time before consensus can continue, hence it's probably important to make migration asynchronous so that nodes already know this golden hash before the upgrade actually happens.
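
To illustrate, a minimal sketch of how such a golden hash could be accumulated (the construction is an assumption, not part of the proposal):

```go
package migration

import "crypto/sha256"

// goldenHash folds the hashes of all migrated headers into one cumulative
// hash. Because it depends only on already-finalized blocks, nodes can
// compute it in the background ahead of the upgrade height and then only
// need to agree on the result in consensus.
func goldenHash(migratedHeaderHashes [][]byte) []byte {
	var acc []byte
	for _, h := range migratedHeaderHashes {
		sum := sha256.Sum256(append(acc, h...))
		acc = sum[:]
	}
	return acc
}
```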

TL;DR

Using migration scripts could, aside from a few caveats, actually work.

@erikgrinaker (Contributor) commented Nov 20, 2020

Thanks for thinking about this @cmwaters! My instinct is the same as yours, I think this seems viable (although the devil is in the details here).

> The shortcomings of this is that we would expect the derived block stored to be larger than normal as it needs to hold the relevant data for both. This would go against a lot of the proposed plans on the horizon which would aim to reduce the overall size of the block.

At scale the transaction data will presumably be much larger than the block header, so adding on an extra header or some extra fields has a negligible cost that I think will be well worth it.

> Currently this process means that we can't trust state until we reach the crossover point. However, this shortcoming isn't necessary. If we decide to migrate some form of state/app hash.

I think we probably don't want to migrate transactions data (and thus application state), so requiring the app hash seems fine - or even desirable, since we want to know as soon as possible if a state transition is invalid.

We could support migrating transactions too (which might help with the SDK upgrade story as well), in which case the application would have to provide new app hashes for each derived header, but I'm not sure this is something we want to do in the first iteration - worth considering though.

> If this is three years of blocks this migration could take a long time before consensus can continue hence it's probably an important aspect to attempt to make migration asynchronous so nodes already know this golden hash before the upgrade actually happens.

Good point. The simplest would be to run the migrations synchronously at the upgrade height, and have consensus halt until a quorum agrees on the next hash, but this would result in significant network downtime. If we want this to be asynchronous, consensus would have to support the previous and next block structures, such that it can rewrite old blocks in the background, and then switch over to the new block format at a specific height with all history already rewritten. Totally possible, just a bit more work and complexity - having the migration separate from the main Tendermint daemon would be simpler.

A middle ground might be allowing a separate migration script to create the derived structures while Tendermint is also producing blocks, such that most of the history can be rewritten "offline", and at the upgrade height we'll only need to shut down while migrating e.g. the last 1000 blocks or so - which would presumably only take a few minutes.

@cmwaters (Author):

> I think we probably don't want to migrate transactions data (and thus application state), so requiring the app hash seems fine - or even desirable, since we want to know as soon as possible if a state transition is invalid.

Yes I think, at least to start with, we should make AppHash part of the core verification data.

> A middle ground might be allowing a separate migration script to create the derived structures while Tendermint is also producing blocks, such that most of the history can be rewritten "offline", and at the upgrade height we'll only need to shut down while migrating e.g. the last 1000 blocks or so - which would presumably only take a few minutes.

I think this makes the most sense: migrating perhaps just the derived headers in the background, such that consensus has the information it needs to operate continuously, then making the full transition of the rest of the block structure after the upgrade height.

@cmwaters cmwaters self-assigned this Dec 3, 2020

> **Positive:**
>
> - It requires no changes to the database therefore keeps the immutable aspect
Contributor:

It would help if this property were mentioned at the beginning of the doc, together with other nice-to-have properties.

@cmwaters (Author):

Do you think it would be better to describe the properties of the methods first and then the method itself afterwards?

Contributor:

Yes. I think stating properties upfront has value on its own, as someone might figure out a different way to address them that we haven't thought of. So I would not necessarily think about this as stating properties of the methods, but more properties of the problem you are trying to solve; different methods will then fulfil some of the properties.

> Soft and hard upgrades both refer to major releases but have very different
> properties.
>
> A soft upgrade like minor upgrades can happen asynchronously across the
Contributor:

It seems that what is meant by a soft upgrade is supporting nodes and clients on multiple versions being present in the network and operating correctly, while a hard fork requires everyone to be on the same version. Is this correct?

@cmwaters (Author):

A soft fork is when the latest software version, i.e. 0.34.0, can support an entire blockchain. For example, if going from v6 to v7 is a soft fork, then v7 must support the same blockchain that v6 supported, whether through multi-version support or migrations. By support, I essentially mean be able to verify all data structures. A hard fork means that the latest version no longer supports the prior blockchain. This requires a new blockchain.

Contributor:

And what about previous versions of the software (since the last hard fork)? Will they still be expected to work, at least for the heights before the soft fork that they are lagging behind?

@cmwaters (Author):

Depends on the implementation. With the migrations, you probably won't be able to operate with a lesser version (even at a lower height), because all the blocks will be updated to the latest version and other peers will be serving these latest blocks. Perhaps if you were peered with other nodes that were on the previous version, then they would still be serving those blocks. But once you reached the soft upgrade height, you wouldn't be able to go any further.

This last bit would be the same with multi-version support. Nodes on prior versions could still process earlier blocks but would halt when they reached the soft upgrade height.

@cmwaters (Author):

I guess something else to consider is that because we're running this off consensus, we could also get version forks if 2/3 of validating power is malicious. This would mean that the network forks, with some nodes following version x and others following version y.

Contributor:

It might be useful to, at some point, make a table of all the version numbers (ABCI, block protocol, etc) that are supported by each software version. This might also help illustrate the relationships between software version numbers and protocol version numbers.

@cmwaters (Author):

Sure we could have the table in the UPGRADES file or somewhere else. It would also be nice to add this in the tendermint version command.

I was thinking that a group of ABCI, block, and p2p versions could all correlate to a spec version, and then a software version just maps out which spec versions it covers, although I'm not sure this would exactly work.

@cmwaters (Author):

@marbar3778 found a blockchain project that is already implementing multi-version support that is quite similar to what I imagined/described above. If anyone is interested in seeing how it would work you can look here: https://github.com/harmony-one/harmony/tree/main/block

@ebuchman (Contributor) left a comment:

Thanks for writing this up! I think I'd be much more in favour of the first method (Multi Data Structure Support). While it does add some maintenance overhead, it seems less magical/surprising. I also think it could really help to decouple parts of the codebase and gain clarity on the APIs and relationships, which could lead to net maintenance benefits in the long term. I also wonder if having one software version that works across block-breaking versions would reduce the need for backport releases, since things could be fixed in a single version and people could upgrade to it, given that it's backwards compatible.

Of course, the devil's in the details here, and we would probably benefit from some more preliminary work to see what this would look like in a real example or two, and to help build a sense of things we could start doing to better prepare the codebase for it. For instance, what would this look like for an MVP proposer-timestamp change, or for an MVP header change?

> comply with the announced block protocol upgrade. This can be done
> asynchronously with node operators temporarily halting their node, changing to
> the new binary and starting up again. Tendermint could also offer a tool to
> easily swap between binaries thus reducing the amount of down-time that nodes
Contributor:

I guess this is basically what Regen has built in the https://github.com/cosmos/cosmos-sdk/tree/master/cosmovisor ?

@cmwaters (Author):

Yup, I was thinking of cosmovisor when I wrote this (although I'm not too familiar with it).

> ensure that the nodes could transition smoothly. It may be that the actual
> upgrade might not take place till height: h + 2.
>
> As we expect upgrades to always increment the block version by one, rather
Contributor:

Is there a reason to enforce this? I could imagine an upgrade skipping some versions, but maybe the idea is that it will be easier to provide backwards compatibility in the software for 1 rather than 2 or more past versions?

Contributor:

I think we could go either way on this as long as we communicate it clearly.

I can think of one reason to assume/enforce "linear" (non-version-skipping), which is that it could let us drop support for earlier protocol versions as we release new software versions. This could help limit the amount of older code that we have to maintain over time.

As an example, imagine that Software Version X supports Block Versions Y and Y+1. When you want to introduce Block Version Y+2, you could release Software Version X+1, which would support Y+1 and Y+2 but could drop support for Block Version Y.

(This would also place restrictions on how many different versions can run within a network at once, but that's much easier to coordinate than a single hard upgrade.)

@cmwaters (Author) commented Dec 17, 2020:

Hmm, no. I was rather presuming this, but I agree that it might be beneficial to allow skipping versions (even potentially reverting versions if, for example, a bug was found).
^ This was written before tessr's comment

@cmwaters (Author):

> As an example, imagine that Software Version X supports Block Versions Y and Y+1. When you want to introduce Block Version Y+2, you could release Software Version X+1, which would support Y+1 and Y+2 but could drop support for Block Version Y.

Hmm, the problem with this is that all the heights that were produced during Block Version Y are now lost (or can only be retrieved by processes that can still read and validate Block Version Y). This is not a problem for light clients that only follow the last 10,000 or so blocks, but for nodes fast syncing or state syncing it is impossible unless they download the old versions and there are archive nodes still serving blocks of those versions.

Contributor:

Ah true, good point!

> As we expect upgrades to always increment the block version by one, rather
> than having the version number being passed in `EndBlockResponse{}`, another
> option could be to have the height where we want the upgrade to occur. With this
> approach validators are not reaching consensus on the version but the window of
Contributor:

Not sure I understand what this is saying, maybe can be clarified?

@cmwaters (Author):

I was thinking that the app could return, in EndBlock, the height at which the upgrade would occur rather than which version to run on (so consensus would agree on the upgrade height), but on second thought I don't think it's a good idea, so just scratch this sentence.

> required to mitigate errors.
>
> ## Method 2: Chain Migration
@ebuchman (Contributor) commented Dec 17, 2020:

I think this sounds pretty complex and I'm not sure it's worth it in the end, especially since it seems you either need a bunch more storage if you want to regenerate the original blocks, or you basically need to rerun consensus for the whole blockchain or something? We also need to be careful, if we're re-signing things, to make sure there's no risk of double signing.

I'd be open to looking at this more closely if others felt strongly it was the right way to go, but it seems like the Multi Data Structure approach is more straightforward and might even come with some maintenance benefits in the end.

Contributor:

I feel like the big downside to the Multi Data Structure approach is that it puts some maintenance burden on clients and wallets as well. This is OK if they only need to support a handful of versions, but it could be prohibitively complex if we introduce a dozen versions or something


> The scope of this RFC is to address the need for Tendermint to have better
> flexibility in improving itself whilst offering greater ease of use to the
> networks running on Tendermint. This is done by introducing a division in the
Contributor:

This is kind of a nit, but we're not really introducing a new "division" here - just striving to enable a wider range of changes as "soft upgrades."

@cmwaters (Author):

Was there a concept of soft upgrades before in tendermint?

I'm guessing before this, operators would perform minor upgrades to their nodes within a live network (especially if it was a security fix) in their own time - so this would constitute a soft upgrade.

In this case you're right. In the pie of possible upgrades we are trying to get a bigger slice of them to not require a hard reset.

> changing the binary and starting the node again.
>
> A patch release (the last number in x.x.x), is a backwards compatible bug-fix.
> Nodes should be able to perform this upgrade at any time without any affect to
Contributor:

s/affect/effect


> A minor release (the middle number in x.x.x), is also backwards compatible but
> it indicates changes in existing features or new features. One can think of
> performance optimizations as a good example. Nodes should also be able to
Contributor:

Other good examples include marking fields/params as deprecated, and (I believe) adding new fields to protocol buffers (but not changing or removing them).

> structures from a prior major version. Thus, an upgrade for a major release
> has so far meant having to create a new chain.
>
> Soft and hard upgrades both refer to major releases but have very different
Contributor:

I really like this appendix; it does a great job of getting everyone on the same page and clearing assumptions.

One thing I'd note is that I think we could technically include soft upgrades in minor releases, if we assume that the terms "major"/"minor"/"patch" refer to the Tendermint APIs (including RPC and Go APIs). For example, we could add a new optional field and increment the block protocol number, and as long as it's a soft change, we wouldn't need to do a major Tendermint release.

We may decide that we don't want to do that as a matter of policy, but I think it's technically possible.

@cmwaters (Author):

But I think all soft upgrades are major releases because external clients need to change their code in order to verify the new block format.

I guess my question here is whether an incompatible change concerns only readability or also verifiability. If it's just about reading the blocks, then maybe some soft upgrades could be released in a minor version.


> ### Protocol Versioning
>
> Tendermint currently has three protocol versions: Block, P2P and App.
Contributor:

Do you draw a distinction between "protocol versions" and other versions? Where does the ABCI version fit in here?

@cmwaters (Author):

Do you think it would be helpful to add more context on this? I didn't personally come up with the protocol versions, but the abstraction does make sense in terms of separate parts of the code.

Another way this could be done is to look at it from the stakeholders' angle. We kind of have three stakeholders: the application developers, the node operators, and the "end users", which could be wallets and block explorers. A nice way to version things would be to indicate whether the changes require involvement from that stakeholder. So the ABCI and App versions correlate to application developers, and the RPC version (the one I proposed) correlates with external clients. I'm not sure where the node operators fit, but they would need to make changes to their binaries to support different block versions (and potentially p2p versions).


> behavior).
>
> Another similar approach is to have the prior versions in a different repo and
> import it as a library. This may minimize the amount of code in the repo.
Contributor:

This sounds like it may lead to dependency hell, but if this were a priority, we could structure our codebase to support it.


> might offer a broader set of changes that can be soft upgraded.
>
> **Negative**

Contributor:

I'd add to the negatives that this is also the more complex solution to implement in the short term.

Contributor:

(This is sort of implied, but good to call out explicitly.)

Contributor:

Also: this requires building some kind of separate migration tool, which probably wouldn't get exercised that often and would be at risk of getting musty and outdated.

@cmwaters (Author):

I want to provide a quick update as to the current stage of this RFC and a quick outline of how I plan to move it forward.

It seems that there is a growing preference towards using multi data structure support (MDSS) as a means of enabling soft chain upgrades. With this in mind, I would like to give the holiday break as one last period for any opposing views/ideas towards MDSS to surface. If we are all convinced by this approach come the new year, then I will begin rewriting this RFC: narrowing down the scope by moving the migrations section into the alternatives section (with the bulk of it residing in the appendix) and focusing purely on MDSS, with a clearer distinction between which changes are soft and which are hard upgrades, and how exactly soft upgrades will be executed.

Following from this, I would then appreciate getting some final feedback and a thumbs up from everyone if they are happy with this and accept the proposal.

The steps following from this RFC will be:

  • To write a page on how Tendermint conducts versioning which will reside in the spec repo.
  • To write up an ADR for how to implement what has been decided in this RFC (Tendermint-Go)
  • To write a document (whether it be another RFC or something else) on a proposal for improving the UX / data availability of hard upgrades.

@tessr (Contributor) commented Dec 22, 2020

That sounds like a great plan, @cmwaters! I'd say the rewrite of this RFC should definitely include some discussion of the alternative to MDSS and perhaps a record of our decision-making, but there's no need to dig deeply into the details there. Also no need to rush into this before the holidays; I'll share this a little more broadly to solicit more feedback during that time.

@github-actions bot: This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Jan 22, 2021
@cmwaters cmwaters removed the Stale label Jan 27, 2021
@cmwaters cmwaters marked this pull request as draft February 17, 2021 11:23
@github-actions github-actions bot added the Stale label Mar 20, 2021
@cmwaters cmwaters removed the Stale label Mar 22, 2021
@github-actions github-actions bot added the Stale label Apr 22, 2021
@cmwaters cmwaters removed the Stale label Apr 23, 2021