[Research]: Signer Coordinator Selection and Failover #39
Comments
I don't think it's a good idea to try and match deposit/withdrawal with block IDs, because we'll have scenarios where we'll be processing multiple deposits/withdrawals from different blocks. For failover, my understanding is that we haven't implemented this in Nakamoto, partially because we moved to "miner as coordinator" for blocks. I don't believe we've implemented coordinator failover for DKG, but I'm also not sure.
Sure, we can revisit that approach, but my concern was that it was never fully resolved. I assumed sequential batching, but it makes sense if it's more realistic to batch/process requests from different blocks, especially after the withdrawal discussion, since we moved from processing 1 block to 6 blocks. As for failover behavior, we currently don't have it in Nakamoto for the DKG case, and it seems the backup is the same static coordinator selection if miner-as-coordinator can't be picked.
This relates to the lock time needed for deposit UTXOs. If we switch coordinators every N blocks, but we have a locktime of N-M, all that's needed to fail is for a single signer to be offline for N blocks. That's totally OK, I think it just means that we need to have coordinator selection / fallback faster than our lock time.
Nit: This is a signing round, not a DKG round. I believe we should handle coordinator selection separately for those two cases.
The "separate chain view" issue is a problem for DKG, but not for signing rounds. In signing rounds everyone agrees on which bitcoin block a request is in, and therefore has a consistent number to base the VRF on.
In sBTC v1, DKG is (probably) a manual process, so I am not too worried about that initially. Signing rounds need some automatic coordinator selection though.
Right yeah, that's essentially what Option 3 says
Possibly use the Deposit API to select the coordinator. It turns out this is incredibly complicated.
Okay, after our discussions in the meeting today I've been thinking more on this, and I believe we could achieve a feasible and sensible solution for coordinator failover.
Recap
Coordinator selection is solved by the proposed VRF in 3. This gives us a unique coordinator
While this is something we can probably find a workable solution for, doing so would likely entail implementing some complex protocol like Raft or Paxos, or building on top of a system that already implements one, like ZooKeeper or etcd. Any of these drags a lot of complexity into the application. In addition, these systems are vulnerable to Byzantine actors (might be acceptable for v1).
Another path is to side-step the issue by leveraging the fact that Bitcoin already provides a point of synchronization. At each block, every signer has a consistent view of the underlying blockchains. This is how we solve coordinator selection in the first place in 3.
Proposal
Let there be only a single coordinator per bitcoin block, but let the next coordinator be responsible for any missed deposit and withdrawal requests in previous blocks.
This means that if a coordinator is offline or failing to fulfill its duties, we will have a delay in processing any deposit requests that this coordinator should handle. If we accept this potential delay in our system, we get in return the ability for all signers to agree on a single coordinator at any point in time.
Note that it is still possible for the signers to have a divergent view of the underlying bitcoin blockchain. Two signers may simultaneously believe that they should coordinate a deposit request etc. However, all signers would unambiguously know which coordinator to ignore and which coordinator to respond to.
For example, say we have bitcoin blocks
Yeah, and unless the transaction was explicitly signed to be an RBF (which is not the default), it would get rejected as soon as it's broadcast. Overall this sounds doable! As long as things like signing rounds can happen without every signer participating, I think this would lead to eventual consistency.
Could we also use the current miner as a point of synchronization? This would allow us to piggyback off of whatever code is used to know who the current miner is.
Since we don't want to break consensus for v1 we can't enforce any miner behavior, so unfortunately I don't think such a solution would be viable.
Oh, I meant we just use the miner's ID, public key, or whatever unique identifier as the point of synchronization, but we do not involve the miner in the scheme in any other way. My thinking was that the signers must have some way of identifying that a Stacks block was from the right miner. Maybe that identifier is their public key or something, and we can just use that as input into our VRF selection process. This approach introduces some skew into the coordinator selection but it might be a little easier to implement.
Right, that's technically correct. The miner is selected using a VRF on every bitcoin block that has at least one leader block commit from a miner candidate. Anyone observing bitcoin will be able to run the same VRF to determine the miner, and could use the public key of the miner to run yet another VRF to decide the coordinator. However, comparing the two options - simply running a VRF on the bitcoin block hash is much less complex and would always guarantee that we have a selected coordinator per block - while in the other case we'd have to parse all leader block commits, run two VRFs and we'd risk not having a new coordinator selected if no one is mining.
Oh I didn't know that the signers were already observing bitcoin. With that in mind, using the miner's public key is more complicated than using the bitcoin block directly.
Yeah! Maybe more specifically, Stacks nodes monitor Bitcoin, and signers are connected to a Stacks node, so this information is available.
I believe the Signers will have to monitor Bitcoin directly themselves as well (or connect to a Bitcoin node or explorer). They need to:
We've agreed on the proposal of using a VRF based on the Bitcoin block ID. Every bitcoin block will have a designated coordinator. If a coordinator does not process any requests, the next coordinator takes over. If two coordinators are requesting signing rounds for the same requests, the coordinator with the higher block ID takes precedence.
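The failover and precedence rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: "higher block ID" is read here as higher bitcoin block height, and the names `SigningRequest`, `pending_for`, and `rounds_to_answer` are hypothetical, not taken from any signer implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SigningRequest:
    # Assumption: "block ID" is interpreted as the bitcoin block height
    # that designated the requesting coordinator.
    coordinator_block_height: int
    # Deposit/withdrawal request identifiers covered by this signing round.
    request_ids: frozenset


def pending_for(height, requests_by_height, processed):
    """Requests the coordinator at `height` is responsible for:
    those in its own block plus anything earlier coordinators missed."""
    pending = []
    for h in sorted(requests_by_height):
        if h <= height:
            pending.extend(r for r in requests_by_height[h] if r not in processed)
    return pending


def rounds_to_answer(proposals):
    """Drop every proposed round that overlaps with a round requested by a
    coordinator at a higher block height; signers answer what remains."""
    keep = []
    for p in proposals:
        superseded = any(
            q.coordinator_block_height > p.coordinator_block_height
            and q.request_ids & p.request_ids
            for q in proposals
        )
        if not superseded:
            keep.append(p)
    return keep


# Hypothetical data: the coordinator for block 100 handled dep-1 but missed dep-2.
requests_by_height = {100: ["dep-1", "dep-2"], 101: ["wd-1"]}
processed = {"dep-1"}
backlog = pending_for(101, requests_by_height, processed)

stale = SigningRequest(100, frozenset({"dep-2"}))
fresh = SigningRequest(101, frozenset(backlog))
answered = rounds_to_answer([stale, fresh])
```

In this example the coordinator for block 101 picks up `dep-2` along with its own `wd-1`, and a late signing request from the block 100 coordinator for `dep-2` is ignored because it overlaps with the newer round.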
Completing the issue description and arriving at a conclusion is the deliverable of this issue.
Research - Signer Coordinator Selection and Failover
This ticket holds the research relating to Signer Coordinator Selection and Failover and how it impacts sBTC V1.
1. Summary
Stemming from issue #37, we concluded that it's best to have a VRF coordinator selection executed within the signers. It seems that we can mostly leverage the current coordinator code in stacks_signer with a slight difference of using the Bitcoin ConsensusHash that the user's request is tied to instead of the signer's view of the Bitcoin ConsensusHash from burnchain tip.
2. Context & Relevance
The need for a coordinator in signers exists in both the Nakamoto and sBTC releases as the introduction of signers as pivotal entities to process and approve certain operations requires synchronization and collective consensus.
In the context of the Nakamoto release, the signers run a DKG round to collectively agree on signing or rejecting blocks (using the DKG aggregate key). A coordinator is needed to trigger and finalize these commands on behalf of all the signers.
In the context of sBTC, signers act as collective actors to govern the peg mechanism. Hence, they need to collectively agree on honoring or rejecting the peg request.
In Nakamoto, there were some iterations on how to calculate the coordinator:
/v2/info RPC endpoint to fetch the ConsensusHash for the latest stacks tip to hash with signer public keys and pick the smallest corresponding ID.
3. Research
Some options to consider are:
3.1.1 Use transaction ID for our VRF
This idea was brought up to prevent the issue in which signers have different views of the chain. However, it would trigger a coordinator selection for every transaction, which prevents transaction batching and can be costly.
3.1.2 Miner as Coordinator
I don't think this can be used in sBTC V1 by itself. In Nakamoto, the miner acts as coordinator, trying to get the signers to sign its block, but in sBTC V1 the peg mechanism and its processing will happen separately and before block production. (could be very wrong here)
3.1.3 Calculate VRF using burnchain data tied to transactions
This is partly related to the 3.1.1 suggestion, but instead of calculating the VRF per transaction ID, we can use the Burnchain block info that the batched transactions are linked to as our VRF parameter. This way, the coordinator will change every time a new Burnchain block is produced, which is less frequent, and batching can still be done.
We won't run into the issue of signers having different views of the burnchain because in this case they won't hit the RPC endpoint or rely on their own view of the burnchain; instead they get that info from the transaction itself.
Note: Here I am making a big assumption that somehow the Bitcoin info can come with transaction info. For the case of deposit, triggered either from sBTC contract or Deposit API, we have to confirm the BTC transaction is made and is materialized on Bitcoin so that info should be calculated at some point. In the case of withdrawals, the latest discussions are leaning towards allowing withdrawals on burnchain to avoid fork issues...so I assume the block info would be accessible in this case as well.
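As a minimal sketch of option 3.1.3, assuming the burnchain block hash tied to the batched requests is already known to every signer: hash it together with each signer public key and pick the key with the smallest digest, mirroring the "hash with signer public keys and pick the smallest" approach described for Nakamoto. The function name `select_coordinator`, the use of SHA-256, and the sample keys are assumptions for illustration only; a plain hash is deterministic but is not a true VRF, and this is not the stacks_signer code.

```python
import hashlib


def select_coordinator(burn_block_hash: bytes, signer_pubkeys: list) -> bytes:
    """Return the public key whose SHA-256(burn_block_hash || pubkey)
    digest is smallest. Every signer that sees the same block hash and
    the same signer set arrives at the same coordinator."""
    if not signer_pubkeys:
        raise ValueError("signer set is empty")
    return min(
        signer_pubkeys,
        key=lambda pk: hashlib.sha256(burn_block_hash + pk).digest(),
    )


# Hypothetical inputs: three placeholder 33-byte keys and one block hash.
signers = [bytes([i]) * 33 for i in (2, 3, 4)]
block_a = hashlib.sha256(b"burn block A").digest()

# All requests batched under block_a share this coordinator.
coordinator_a = select_coordinator(block_a, signers)
```

Because the input is the block the requests are linked to, rather than each signer's current chain tip, the result does not depend on how far each signer has synced, and it changes only once per burnchain block.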
3.1 Proposed Research Conclusions
Proposing option 3 (3.1.3) for the reasons stated above, but it relies heavily on the assumption that the Burnchain block data can be retrieved for both deposit and withdrawal transactions, ideally without extra overhead.
3.2 External Resources
3.3 Areas of Ambiguity
Closing Checklist