
[fastx dist sys] Design and implement re-configuration #75

Closed
gdanezis opened this issue Dec 20, 2021 · 21 comments

@gdanezis
Collaborator

fastx operates in epochs. Within each epoch the committee and stake distribution are stable, but between epochs they may change. Between epochs a re-configuration protocol is run between validators and a consensus system to ensure safety and liveness. We need to implement this re-configuration mechanism, specifically:

  • The logic by which a validator closes an epoch and asks 2f+1 validators of the next epoch to provide a certificate over the set of certificates processed.
  • The logic by which an epoch starts: a validator accepts the first 2f+1 certificates (on the consensus core), initializes its database, and starts processing transactions.
  • Efficiency improvements to minimize downtime.
  • Make decisions and implement the smart contract on an L1/consensus core driving the governance logic and sequencing the certificates of state between epochs.
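The thresholds used throughout this discussion come from the standard BFT assumption of n = 3f + 1 total stake with at most f byzantine. A quick illustrative sketch (hypothetical helper names, not project code):

```python
# Illustrative only: quorum arithmetic for a BFT committee of n = 3f + 1
# validators, as assumed throughout this issue. Names are hypothetical.

def max_faulty(n: int) -> int:
    """Largest f tolerated, i.e. the largest f such that n >= 3f + 1."""
    return (n - 1) // 3

def quorum(n: int) -> int:
    """2f + 1 validators: any two such quorums intersect in an honest one."""
    return 2 * max_faulty(n) + 1

print(quorum(4))    # smallest committee: n=4, f=1, quorum=3
print(quorum(100))  # n=100, f=33, quorum=67
```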
@gdanezis gdanezis changed the title [fastx dist sys] Design and implement re-configuration (Master Task) [fastx dist sys] Design and implement re-configuration (Design Task) Jan 13, 2022
@lxfind lxfind changed the title [fastx dist sys] Design and implement re-configuration (Design Task) [fastx dist sys] Design and implement re-configuration Jan 21, 2022
@lxfind lxfind added this to the TBD milestone Jan 21, 2022
@gdanezis
Collaborator Author

gdanezis commented Feb 1, 2022

Capturing a key conversation with @sblackshear :

It would be nice if our reliance on an external L1 was minimal (to none), and we can go some way towards this by reusing fastx-native operations as much as possible to express the state used in reconfiguration:

  • Locking tokens to express stake can be done within fastx. This leads to a locked token object with a nominated delegate.
  • Authorities can express their wish to reconfigure, along with a snapshot hash of their state, within fastx, and then restrict the orders that can be processed to those from other authorities.
  • Authorities can express their approval of snapshots from other authorities as commands within fastx.

Given the above, authorities are able to (1) close an epoch and (2) gather 2f+1 (or more) authority states with 2f+1 votes each. Now comes the part that requires agreement: all authorities need to agree on the same set of 2f+1 certified states as the starting state for the next epoch. Then, using this authoritative decision, they can (3) decide the membership and stake distribution, and (4) start the next epoch.

The above outlines a design that only requires a validated one-shot agreement, rather than a more complex consensus core, and is an interesting option.

@oxade
Contributor

oxade commented Feb 9, 2022

@gdanezis
Can an authority with considerable stake force an epoch change if it leaves the network or dies?
Our notion of epochs seems like controlled transitions, but is that always the case?

@gdanezis
Collaborator Author

gdanezis commented Feb 9, 2022

Can an authority with considerable stake force an epoch change if it leaves the network or dies?
Our notion of epochs seems like controlled transitions, but is that always the case?

We do not have definite answers on this.

There is no conceptual issue (with respect to safety) with a strategy that gives authorities some discretion about when to end the epoch. So in such a system an authority could:

  • Vote to end the epoch after a time (end of day).
  • Vote to end the epoch if it sees the network having many crashed authorities.
  • Vote to end the epoch after a certain volume of certificates have been processed by it.

All that matters is that once f+1 out of 3f+1 of stake has voted to end the epoch (potentially each for its own reason), all others also vote to end it until 2f+1 votes are reached. Then we agree on the end-of-epoch stake, derive the stake distribution for the next epoch, and restart with that committee.

What I am trying to say is: the decision to end the epoch can be subjective for an authority (if we wish) -- and remember that in a distributed system time and timeouts are subjective.

So we can adopt any of these policies, or a combination -- the question is which of these choices we should take.
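The voting rule described above (subjective trigger, f+1 forces honest participation, 2f+1 closes the epoch) can be sketched as a small stake-weighted tracker. This is a hypothetical illustration, not the actual implementation:

```python
# Hypothetical sketch of the subjective epoch-close rule: each validator
# votes for its own reason; once f+1 of 3f+1 total stake has voted, at
# least one honest validator voted, so all honest validators join; at
# 2f+1 stake the epoch is considered ended.

class EpochCloseTracker:
    def __init__(self, stake_by_validator: dict):
        self.stake = stake_by_validator
        total = sum(stake_by_validator.values())
        self.f = (total - 1) // 3  # assumes total stake = 3f + 1
        self.voted = set()

    def vote(self, validator: str) -> None:
        self.voted.add(validator)

    def voted_stake(self) -> int:
        return sum(self.stake[v] for v in self.voted)

    def honest_must_join(self) -> bool:
        # f+1 voting stake guarantees at least one honest voter,
        # so every honest validator follows suit.
        return self.voted_stake() >= self.f + 1

    def epoch_ended(self) -> bool:
        return self.voted_stake() >= 2 * self.f + 1
```

With four equal-stake validators (f = 1), one vote forces nothing, two votes compel the honest validators, and three close the epoch.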

@gdanezis
Collaborator Author

gdanezis commented Feb 9, 2022

The above also points to the need for an emergency protocol to recover the network in case some delegates with collectively more than f+1 stake become unavailable, crash, or turn byzantine. This is obviously disaster recovery, so no one expects it to be pretty or efficient.

Some thoughts on this:

  • At this point we are operating in a synchronous setting; authority operators jump on Discord.
  • Priority should be: gather a state based on certificates; propose it for others to accept; integrate re-delegation orders to arrive at a set of delegates that are live; launch the new epoch with this set.
  • If many different coalitions do the above we end up with forks.
  • How are bridges / light clients affected by all this?

@huitseeker
Contributor

huitseeker commented Mar 1, 2022

FWIW, the two following papers claim to do reconfiguration without consensus:
https://www.microsoft.com/en-us/research/wp-content/uploads/2010/10/paper-1.pdf
https://arxiv.org/abs/1607.05344

They are interesting because they address storage systems, where the question of passing state on from one configuration to the next is bundled with agreeing on who the members of the new quorum are. So far I've seen a plethora of people doing the latter, but far fewer doing the former.

@asonnino
Contributor

asonnino commented Mar 2, 2022

We need consensus to support shared objects, but we are currently able to use it as a black box (it simply sequences bytes and is basically stateless). So if we end up using consensus for reconfiguration it would be great (if possible) to also use it in that way.

@asonnino
Contributor

asonnino commented Mar 2, 2022

Linking the sync story as it may be relevant for checkpointing before committee reconfiguration: #194

@LefKok LefKok self-assigned this Mar 4, 2022
@LefKok
Collaborator

LefKok commented Mar 9, 2022



The design below assumes an existing checkpoint functionality #194 :

We split the reconfiguration of an epoch into three steps:


1. Stake-recalculation step
2. New Committee Ready step
3. Epoch Handover step

The first step is triggered at a predetermined blockchain height. The first checkpoint after this height is considered the "static stake distribution" of the next epoch, out of which the new committee is well-defined. Care needs to be taken over "chain quality" during the delegation period and before the checkpoint is issued. An easy way is to not charge extra for duplicate delegation transactions, so that delegators can send their votes to 2f+1 honest parties. If the honest parties do not see their delegation transactions approved, they can stop processing checkpoints until their delegations have gone through (more on this in OmniLedger).

Once the checkpoint is issued, the new committee members load it locally at their own pace, and each delegate sends a "bootstrap ready" transaction to the old committee. When a quorum of stake is ready, the next checkpoint includes these messages, signalling that the epoch is about to end. At this point only consensus blocks are allowed; the fast path no longer finalises (read below how to still allow the fast path). Finally, the third step is triggered as the prior committee inserts an "epoch-handover" transaction at the end of their final checkpoint and then goes away (remaining available only for data availability). The next committee can take over right away and re-enable the fast path.

This design decouples all the essential steps needed for a secure reconfiguration to occur. The decoupling separates the logic of defining the new committee, and of giving the new committee sufficient time to bootstrap, from the actual hand-over. As a result, while steps 1 and 2 are happening there is no need to halt processing.

The time between steps 2 and 3 would need to stop finalisation of transactions that do not go through the consensus pipeline (single-owner objects).
This can be remedied either by routing these transactions through the consensus pipeline as well, or by double-validating them (which could be reduced to normal operation if committees have sufficient overlap). Double-validation simply requires a transaction issued after step 2 to get approval from the old committee (which is still authoritative over the state) and then approval (by relaying the proof) from the new committee (which acts as a promise not to forget it, and to apply it the moment they take over and before they accept any new transactions).
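The three steps above can be modelled as a small state machine over checkpoint events. This is an illustrative sketch under assumed names (`Reconfiguration`, `EpochPhase`, the trigger height and quorum are made-up parameters), not actual Sui code:

```python
from enum import Enum, auto

class EpochPhase(Enum):
    NORMAL = auto()           # fast path and consensus both finalise
    STAKE_FIXED = auto()      # step 1: stake distribution checkpointed
    COMMITTEE_READY = auto()  # step 2: quorum of "bootstrap ready" seen
    HANDED_OVER = auto()      # step 3: new committee has taken over

class Reconfiguration:
    def __init__(self, stake_fix_height: int, quorum_stake: int):
        self.stake_fix_height = stake_fix_height
        self.quorum_stake = quorum_stake
        self.ready_stake = 0
        self.phase = EpochPhase.NORMAL

    def on_checkpoint(self, height: int) -> None:
        if self.phase is EpochPhase.NORMAL and height >= self.stake_fix_height:
            # First checkpoint past the trigger height fixes the next
            # epoch's stake distribution and defines the new committee.
            self.phase = EpochPhase.STAKE_FIXED

    def on_bootstrap_ready(self, stake: int) -> None:
        if self.phase is EpochPhase.STAKE_FIXED:
            self.ready_stake += stake
            if self.ready_stake >= self.quorum_stake:
                # Quorum ready: only consensus blocks from here on.
                self.phase = EpochPhase.COMMITTEE_READY

    def on_epoch_handover(self) -> None:
        if self.phase is EpochPhase.COMMITTEE_READY:
            self.phase = EpochPhase.HANDED_OVER

    def fast_path_enabled(self) -> bool:
        # The fast path stops finalising only between steps 2 and 3
        # (unless double-validation is used).
        return self.phase is not EpochPhase.COMMITTEE_READY
```

The point the sketch makes explicit is that processing only pauses in the `COMMITTEE_READY` window, not during steps 1 and 2.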

@LefKok
Collaborator

LefKok commented Mar 22, 2022

Extending on the double-validation idea:

We can split the Sui single-owner-object workflow in two parts: 1) Transaction Validation (TV) and 2) Certificate-Inclusion Promise (CIP). In normal operation TV is the consistent broadcast of the transaction and CIP is the persistent broadcast of the certificate (notice that consistent and persistent broadcast are the same protocol serving different functionalities).

During the epoch change, the TV and CIP responsibilities of the committee are handed over to the next committee at different points in the process. When the second step (referred to as "bootstrap new epoch" above) appears in consensus, the new committee takes over the CIP responsibilities: any transaction TVed by the old committee needs to get a CIP from the new committee. The old committee stops replying to these messages once it initiates the "epoch-handover" checkpoint.

When the epoch-handover checkpoint is available, the new committee also bootstraps from it and takes over the TV duties. Before replying to TV messages, committee members replay all the CIP messages they have acknowledged onto their state, and only afterwards start processing TVs.
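A minimal sketch of the double-validation flow, with a hypothetical `Committee` stub standing in for the real quorum interactions (each call would really be a 2f+1 broadcast; all names here are illustrative):

```python
# Illustrative stub: in reality validate() is a consistent broadcast to
# 2f+1 validators and promise_inclusion() a persistent broadcast.
class Committee:
    def __init__(self, name: str):
        self.name = name

    def validate(self, tx):
        return ("cert", self.name, tx)

    def promise_inclusion(self, cert):
        return ("promise", self.name, cert)

def process_transaction(tx, old_committee, new_committee, after_step2: bool):
    # TV: the old committee stays authoritative over the state until the
    # epoch-handover checkpoint.
    cert = old_committee.validate(tx)
    if after_step2:
        # CIP from the new committee: a promise to apply the certificate
        # when it takes over, before accepting any new transactions.
        promise = new_committee.promise_inclusion(cert)
    else:
        # Normal operation: the same committee handles TV and CIP.
        promise = old_committee.promise_inclusion(cert)
    return cert, promise
```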

@asonnino
Contributor

asonnino commented Mar 22, 2022 via email

@LefKok
Collaborator

LefKok commented Mar 22, 2022

Yes, it is the same pattern as a consistent broadcast, but its goal is not to prevent equivocation; it is to guarantee inclusion in the next checkpoint (so it needs quorum intersection with the checkpoint mechanism).
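The quorum-intersection property relied on here is one line of arithmetic: any two quorums of 2f+1 out of n = 3f+1 members overlap in at least f+1 members, so the overlap always contains at least one honest validator. A quick check:

```python
# Minimum overlap of any two quorums of size q drawn from n members:
# at worst they cover n members between them, leaving 2q - n in common.
def min_intersection(n: int, q: int) -> int:
    return max(0, 2 * q - n)

# For n = 3f + 1 and q = 2f + 1 the overlap is always f + 1,
# which exceeds the f possibly-byzantine members.
for f in range(1, 10):
    assert min_intersection(3 * f + 1, 2 * f + 1) == f + 1
```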

@asonnino
Contributor

asonnino commented Mar 22, 2022 via email

@LefKok
Collaborator

LefKok commented Mar 22, 2022 via email

@asonnino
Contributor

I see, so we have the overhead of two extra round-trips for every transaction in order to support reconfiguration.

@LefKok
Collaborator

LefKok commented Mar 22, 2022 via email

@asonnino
Contributor

asonnino commented Mar 22, 2022 via email

@asonnino
Contributor

asonnino commented Mar 22, 2022 via email

@gdanezis
Collaborator Author

Just caught up on the double-validation idea. I think this is viable; can we work out the details and a more formal description plus an argument for safety? I get the intuition, but the devil is often in the details. On the side of checkpoints: we will have them, no worries, so steps 1 and 2 in the post from 15 days ago will be taken care of.

@gdanezis
Collaborator Author

I closed the issue about adding the epoch to transactions here: #74
We should work out whether we should do this, according to the protocols here.

@LefKok
Collaborator

LefKok commented Mar 31, 2022

Summed up the design and provided some proof sketches here: https://docs.google.com/document/d/1qFfT749PpNT20L8MhdNWt-9df90tIQE8M44W_zXzYA0/edit#

@bholc646 bholc646 removed this from the TBD milestone Apr 6, 2022
@gdanezis gdanezis added this to the TestNet milestone Apr 13, 2022
@gdanezis gdanezis added Type: Major Feature Major new functionality or integration, which has a significant impact on the network Priority: Critical This task must be completed or something bad will happen e.g. missing a major milestone, major bug Triaged Issue is associated with a milestone and has at least 1 label of each: status, type, & priority Status: In Progress Issue is actively being worked on labels Apr 13, 2022
@gdanezis gdanezis added sui-node and removed Type: Major Feature Major new functionality or integration, which has a significant impact on the network Priority: Critical This task must be completed or something bad will happen e.g. missing a major milestone, major bug Status: In Progress Issue is actively being worked on Triaged Issue is associated with a milestone and has at least 1 label of each: status, type, & priority labels Jun 17, 2022
@lxfind lxfind self-assigned this Jun 27, 2022
@lxfind lxfind modified the milestones: TestNet, [A] Private Testing Jul 2, 2022
@lxfind
Contributor

lxfind commented Jul 2, 2022

Closing this one as we have broken down the work into smaller issues.

@lxfind lxfind closed this as completed Jul 2, 2022
7 participants