…disabled + stripped KeyServerSet contract ABI
ATTENTION TO POTENTIAL REVIEWERS: the contents of this PR were merged in #7323 (which was built on top of this PR). I guess, because of confusion with
Sorry :)
Response to 'unresolved1': assume that KSs that are 'leaving' are either offline or malicious => they should be ignored in the auto-migration session. Will do that in next PRs.
the pr looks good to me |
Just logging the solution to unresolved#2 here: ignore this situation in the auto-migration session and run it to the end. So some keys can become irrecoverable after the auto-migration session. Will fix this in another PR.
see #7271 (comment)
depends on https://github.com/paritytech/contracts/pull/102
What is this PR about?
The ServersSetChange session, which has been introduced recently, allows us to 'migrate' secret shares from servers_set1 to servers_set2, adding and removing shares where required. It was designed to be started manually, using the HTTP API, by the holder of some (administrator) key that all servers are configured with. So the scenario for running this session was (without details):
This PR is about minimizing the number of steps ^^^ for the case when the set of SecretStore key servers is configured using the special `KeyServerSetWithMigration` contract from here: https://github.com/paritytech/contracts/pull/102/files#diff-ebbb8d3338d9c365c0e8f5258894b5afR29 . When it is used, the scenario is narrowed to: the `MigrationCompleted()` event is fired [optional].

KeyServerSetWithMigration contract
Currently a single-owned version is presented in the linked PR. A multi-owned version (see notes on Kovan integration below) will likely be added later.
The contract holds 3 sets of nodes: current set, migration set && new set:
- current set - the current state of SecretStore
- new set - initially equal to current set; `addKeyServer`/`removeKeyServer` add/remove nodes to/from here
- migration set - empty, unless migration is active
Migration represents the process of migrating secret shares from 'current set' to 'new set' (i.e. the ServersSetChange session). It is started when `startMigration(id)` is called by one of the 'current set' + 'new set' nodes. At the time of this call, nodes from 'new set' are copied to 'migration set'. Migration ends when all nodes from 'migration set' have confirmed the migration with a `confirmMigration(id)` call. The migration id is just a random number to make sure that all nodes are running the same migration. After migration is completed, 'migration set' is cleared && everything starts over.

IMPORTANT: there's a special 'initialization' phase just after this contract is deployed. During that phase, all changes to 'new set' are simultaneously replicated to 'old set', thus making migration impossible. It is intended to resolve different configuration issues, like:
etc...
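The migration lifecycle described above can be sketched in Rust as follows. This is a minimal illustrative model of the contract's behavior, not the actual Solidity code; the struct and method names simply mirror the description above.

```rust
use std::collections::BTreeSet;

type NodeId = &'static str;

// Hypothetical model of the KeyServerSetWithMigration state transitions.
#[derive(Default)]
struct KeyServerSetWithMigration {
    current_set: BTreeSet<NodeId>,
    new_set: BTreeSet<NodeId>,
    migration_set: BTreeSet<NodeId>,
    migration_id: Option<u64>,
    confirmations: BTreeSet<NodeId>,
}

impl KeyServerSetWithMigration {
    // startMigration(id): nodes from 'new set' are copied to 'migration set'.
    fn start_migration(&mut self, id: u64) {
        assert!(self.migration_id.is_none(), "migration already active");
        self.migration_set = self.new_set.clone();
        self.migration_id = Some(id);
        self.confirmations.clear();
    }

    // confirmMigration(id): once every node from 'migration set' has confirmed,
    // the migration completes - 'migration set' becomes 'current set' and is cleared.
    fn confirm_migration(&mut self, id: u64, node: NodeId) {
        assert_eq!(self.migration_id, Some(id), "wrong migration id");
        assert!(self.migration_set.contains(node), "node not in migration set");
        self.confirmations.insert(node);
        if self.confirmations == self.migration_set {
            self.current_set = self.migration_set.clone();
            self.migration_set.clear();
            self.migration_id = None;
            self.confirmations.clear();
        }
    }
}
```

Note how the migration id threads through both calls, matching the contract's check that all nodes are running the same migration.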
The initialization phase ends when `completeInitialization()` is called.

KeyServerSetWithMigration listener
WARNING: the new version of the listener won't work with the previous type of `KeyServerSet` contract. You should upgrade your contract to the newer version (via registry).

The main issue with the listener was to synchronize three sets of KeyServers:
- the `ServersSetChange` session (if any) is running

This synchronization had to be re-entrant + survive unexpected node shutdowns/disconnects. It is achieved with the help of the `KeyServerSetWithMigration` contract + idempotence of the `ServersSetChange` session. So here's how it works:
1) on every `ServerAdded`/`ServerRemoved` event, the `KeyServerSet` component reads the state of the servers set contract (snapshot)
2) the `ConnectionTrigger` component checks the state of the snapshot. Without details:
2.1) if the state is Idle (`current_set == new_set && !migration`) => it just keeps us connecting to nodes from `current_set`
2.2) if the state is MigrationRequired (`current_set != new_set && !migration`) => it checks if it should be 'the master of migration' && if so, calls the `startMigration()` method of the contract
2.3) if the state is MigrationActive (`migration == true`) => it:
2.3.1) connects to all nodes from both `current_set` && `migration_set`
2.3.2) checks if it's the master of migration && starts the `ServersSetChange` session if required
3) when a node receives the `ServersSetChange` session initialization message, it checks:
session initialization message, it checks:3.1) that the sender is actually a migration master (read from the contract)
3.2) that migration has the same id (read from the contract)
3.3) that migration has the same migration set (read from the contract)
4) after the session is completed:
4.1) the node checks if it is a part of the new configuration (`migration_set`) && calls `confirmMigration()` if so
4.2) `ConnectionTrigger` is back to connecting to only nodes from `current_set`
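The snapshot classification in step 2 can be sketched as below. This is an illustrative Rust sketch; `Snapshot`, its field names, and `SnapshotState` are hypothetical stand-ins for whatever the listener actually reads from the contract.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
enum SnapshotState {
    Idle,              // 2.1: keep connecting to current_set
    MigrationRequired, // 2.2: the master calls startMigration()
    MigrationActive,   // 2.3: connect to current_set + migration_set
}

// Hypothetical snapshot of the servers set contract state.
struct Snapshot {
    current_set: BTreeSet<&'static str>,
    new_set: BTreeSet<&'static str>,
    migration_active: bool,
}

// Classify the snapshot exactly as described in steps 2.1-2.3.
fn snapshot_state(s: &Snapshot) -> SnapshotState {
    if s.migration_active {
        SnapshotState::MigrationActive
    } else if s.current_set == s.new_set {
        SnapshotState::Idle
    } else {
        SnapshotState::MigrationRequired
    }
}
```

Because the classification is a pure function of the snapshot, re-running it after a restart or reconnect yields the same decision, which is what makes the synchronization re-entrant.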
Known TODOs
- after submitting a `startMigration`/`confirmMigration` tx, we should wait at least for several blocks before resubmitting (see SecretStore: tx retry pattern #7323)

Unresolved important problems (any advice is appreciated)
1) every node from the migration set must stay online until `MigrationCompleted()` is raised. If it is shut down, the migration will never end. Possible solutions:
1.1) manual - count this node as lost && call something like `KeyServerSetWithMigration.forceRemoval()` manually
1.2) automatic - if during migration we cannot connect to the node for some interval T, call something like `KeyServerSetWithMigration.forceRemoval()` automatically. When the majority of nodes have called `forceRemoval` - remove it from the migration
1.3) ???
Both solutions ^^^ (1.1) && (1.2) lead to possible secret loss (if after removal there are less than t+1 shares of the secret left).

And the same is the essence of problem 2: the `ServersSetChange` session was designed to fail if there's at least 1 secret which will have less than t+1 shares in the new nodes set (i.e. the secret becomes unrecoverable). This is fine when we're running the session manually (we have time for a solution && restart), but when migration is started automatically, there's no one to make a wise decision. I do not have a solution for this, but here are some thoughts on it:
2.1) since the threshold is specified by the caller during secret creation, we can just ignore that error && leave the secret unrecoverable. This means that the caller has invested more trust in SecretStore authorities than needed. Eventually the same situation could repeat even without auto-migration, when some authorities just turn off their nodes.
2.2) there's a way (I believe so) to decrease the threshold after shares were generated. So we might want to run a threshold-decreasing session when this happens (see SecretStore: threshold decrease PoC #7562)
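The t+1 condition behind both problems can be stated as a one-line check. This is an illustrative sketch (the function and its parameters are hypothetical, not SecretStore API): a secret with threshold t is recoverable only while at least t+1 of its share holders remain in the target set.

```rust
use std::collections::BTreeSet;

// Returns true iff at least t+1 holders of the secret's shares
// survive into `new_set` - i.e. the secret stays recoverable.
fn is_recoverable(
    share_holders: &BTreeSet<&str>,
    new_set: &BTreeSet<&str>,
    t: usize,
) -> bool {
    share_holders.intersection(new_set).count() >= t + 1
}
```

For example, a secret with t = 1 held by {a, b, c} survives a migration to {a, b, d} (2 shares remain, 2 needed), while the same migration destroys a t = 2 secret (2 remain, 3 needed) - which is exactly the case the session was designed to fail on.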
Notes on integration with Kovan authorities list contract
Since the initial requirement was integrating SecretStore with Kovan, here are some words on this: