ICS33: Multi-hop Channel Spec #882
Conversation
Link to implementation branch: https://github.com/polymerdao/ibc-go/tree/ds/polymer-multihop. You can view one of the tests (multi-hop packet) here: https://github.com/polymerdao/ibc-go/blob/ds/polymer-multihop/modules/core/04-channel/keeper/multihop_packet_test.go
Initial review.
The overall concept is exciting! Though I'm concerned about a few areas. First, I think the handshake checks need to be stricter; I've commented on what needs to happen there.
Also, verifyMultiHopProof, which is the crux of this proposal, is not clearly specified. I think it needs to be defined as clearly as the generation functions so we can verify it for correctness.
I did not look at the generation functions in detail for this round of review since the verification function is more important.
Handshake Validation (TRY and ACK):
- Prove that all intermediate connections are OPEN
- Prove that the connection hops from A->B are the inverse IDs of the connection hops from B->A
This will be possible by unmarshalling the intermediate connectionEnds stored in the MultiHopProof
Proof Verification:
- Prove that the client stored the provided connection state and consensus state of the first hop
- Iteratively move through the hops and prove that they stored the next connection state and consensus state against the root of the previously proven consensus state
- Prove that the source chain stored the desired key and value against the last proven consensus state
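A minimal Go-style sketch of this iterative verification, assuming the proofs are ordered from the hop nearest the verifying chain toward the source chain; ConsensusState, ProofData, VerifyMembership, the unmarshal helpers, and the error values are placeholders for the eventual spec/ibc-go types, not the final API:

```go
// verifyMultihopProofSketch walks the hops described above: each proven consensus
// state becomes the trusted root for the next hop, and the final root proves the key.
func verifyMultihopProofSketch(
	trustedConsState ConsensusState, // consensus state of the first hop, already stored on the verifying chain
	consensusProofs, connectionProofs []ProofData,
	keyProof ProofData, // proof of the desired key/value on the source chain
	key string, value []byte,
) error {
	consState := trustedConsState
	for i := range consensusProofs {
		root := consState.GetRoot()
		// prove the next hop's consensus state and connection end against the
		// root of the previously proven consensus state
		if err := VerifyMembership(root, consensusProofs[i].Proof, consensusProofs[i].Key, consensusProofs[i].Value); err != nil {
			return err
		}
		if err := VerifyMembership(root, connectionProofs[i].Proof, connectionProofs[i].Key, connectionProofs[i].Value); err != nil {
			return err
		}
		// intermediate connections must be OPEN (handshake check discussed above)
		if conn := UnmarshalConnectionEnd(connectionProofs[i].Value); conn.State != OPEN {
			return ErrConnectionNotOpen // placeholder error
		}
		consState = UnmarshalConsensusState(consensusProofs[i].Value)
	}
	// finally, prove the desired key/value against the last proven consensus state
	return VerifyMembership(consState.GetRoot(), keyProof.Proof, key, value)
}
```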
There is one issue with the proof verification that I'm not sure how to resolve as yet. At the moment, we don't have a way to prove that the intermediate consensus states are fresh. How can we be sure that we aren't using a consensus state that is forged using a stale validator set that is past its unbonding period during the intermediate steps?
As a first step, could we verify that the consensus state timestamp is not too far past our current time using the original client's trusting period? That might be an initial direction to consider.
if (connectionHops.length > 1) {
    consensusState = provableStore.get(consensusStatePath(connection.ClientId, proofHeight)
    abortTransactionUnless(VerifyMultihopProof(
You have to pass in the key to the verification function here. If the proof is self-contained without passing in the key, this could be a valid multihop proof for some different key entirely.
Key and value must be passed in by caller to proof verification functions
This should be a multihop proof, proving that the ChanOpenInit was stored on the src chain
if (connectionHops.length > 1) {
    consensusState = provableStore.get(consensusStatePath(connection.ClientId, proofHeight)
    abortTransactionUnless(VerifyMultihopProof(
We also need a similar check on ChanOpenAck.
Given that we will have all the connection ends in the MultiHopProof, we should check, at least in these handshake messages, that the connection hops from A->B are the inverse of the connection hops from B->A. This is possible if we have the connection end structs between A and B.
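A small sketch of that inverse-hops check, assuming the ConnectionEnds unmarshalled from the MultiHopProof are ordered from the counterparty chain toward the verifying chain, one per hop; the function and field names are illustrative, not the final spec API:

```go
// hopsAreInverse checks that each proven counterparty ConnectionEnd points back at the
// corresponding entry of this chain's connectionHops, traversed in reverse order.
// Example (A (a) --- (b1) B (b2) --- (c) C): local hops on C are [c, b1], proven ends
// are for a and b2, and counterparty(a)=b1, counterparty(b2)=c.
func hopsAreInverse(localHops []string, provenEnds []ConnectionEnd) bool {
	n := len(localHops)
	if len(provenEnds) != n {
		return false
	}
	for i, end := range provenEnds {
		// hop i of the counterparty path must point back at hop n-1-i of our path
		if end.Counterparty.ConnectionID != localHops[n-1-i] {
			return false
		}
	}
	return true
}
```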
Thanks for the review @AdityaSripal! I pushed some updates that should address most of your initial comments, and we're discussing internally about your higher level concern regarding forged consensus states in intermediate hops. The verification logic should have more details for you to take a look at in the meanwhile. Cheers!
I have a few thoughts:
Implementation branch now up to date with ibc-go v7 here: https://github.com/polymerdao/ibc-go/tree/polymer/multihop-v7
Looks great! I think the only major change necessary is to add a time check for each consensus state so we can be sure they are fresh. This should also happen for the source consensus state, since it is no longer happening automatically now that you aren't using the client state verify method and are instead using the proof method directly.
Approving so it isn't blocked on a re-review
3. Starting with known `ConsensusState[N-1]` at the given `proofHeight` on `ChainN` prove the prior chain's consensus and connection state.
4. Repeat step 3, proving `ConsensusState[i-1]` and `Conn[i-1,i]` until `ConsensusState0` on the source chain is proven.
   - Start with i = N-1
   - ConsensusState <= ConsensusState[i-1] on Chain[i]
Since we have access to all the intermediate consensus states, we should definitely check that the timestamp in the consensus state is still within the trusting period of the final client. This way we can ensure we are using fresh state
// VerifyMultiHopConsensusStateProof verifies the consensus state of each consecutive consensus/connection state starting
// from chain[N-1] on the destination (chain[N]) and finally proving the source chain consensus/connection state.
func VerifyMultiHopConsensusStateProof(
We should name this better since it is proving connection state and consensus state at each hop
abortTransactionUnless(len(proofs.ConsensusProofs) >= 1)
abortTransactionUnless(len(proofs.ConnectionProofs) == len(proofs.ConsensusProofs))

abortTransactionUnless(VerifyMultiHopConsensusStateProof(consensusState, proofs.ConsensusProofs, proofs.ConnectionProofs))
Took me a while to understand this was correct.
There are two consensus states here. consensusState is the consensus state stored on the destination chain to prove the first hop in the multiHop chain. consState is the consensus state stored on the first chain in the multiHop link used to prove the final value.
These should be better named so it's easier to understand the difference and where they are used.
for i := len(consensusProofs) - 1; i >= 0; i-- {
    consStateProof := consensusProofs[i]
    connectionProof := connectionProofs[i]
    conState := abortTransactionUnless(UnmarshalInterface(consStateProof.Value))
Move this down to line 334 for understandability
Ignore the time check comment above, after discussion realized it wasn't necessary. VerifyMembership doesn't do a time check (outside of delayPeriod), only UpdateClient does
previousChain := chains[i]   // previous chains state root is on currentChain and is the source chain for i==0.
currentChain := chains[i+1]  // currentChain is where the proof is queried and generated
nextChain := chains[i+2]     // nextChain holds the state root of the currentChain

currentHeight := GetClientStateHeight(currentChain, sourceChain) // height of previous chain state on current chain
nextHeight := GetClientStateHeight(nextChain, currentChain)      // height of current chain state on next chain

// consensus state of previous chain on current chain at currentHeight which is the height of A's client state on B
consensusState := GetConsensusState(currentChain, prevChain.ClientID, currentHeight)

// prefixed key for consensus state of previous chain
consensusKey := GetPrefixedConsensusStateKey(currentChain, currentHeight)

// proof of previous chain's consensus state at currentHeight on currentChain at nextHeight
consensusProof := GetConsensusStateProof(currentChain, nextHeight, currentHeight, currentChain.ClientID)

proofs = append(consStateProofs, &ProofData{
    Key:   &consensusKey,
    Value: consensusState,
    Proof: consensusProof,
})
}
Just started to look at this and this part is confusing to me. Not sure we can use currentHeight and nextHeight as specified. Maybe I am missing something, but iiuc the design, the relayer needs to perform client updates on all chains and collect the proofs with the exact update heights as the previous chain proof height.
In other words, in the diagram above should it be ConsensusState[i-1](@Height == H[i-1])? Otherwise the proofs will not verify (??)
Let's take the A--B--C--D example where a new packet is sent on A:

- the relayer gets the packet event at height Ha
- queries chain A for the commitment proof (proof height is Ha), let's call it A: KeyProof(commitment, Ha)
- updates the client of A on B with the header at height Ha
  - the update is stored on chain B at Hb and consensus[A](@Ha) is stored on B
  - the relayer can query chain B for the proof at height Hb and obtains: B: ConsensusProof(consensus[A](@Ha), Hb)
- updates the client of B on C with the header at height Hb
  - the update is stored on chain C at Hc and consensus[B](@Hb) is stored on C
  - the relayer can query chain C for the proof at height Hc and obtains: C: ConsensusProof(consensus[B](@Hb), Hc)
So the relayer has three proofs claiming:

- A: KeyProof(commitment, Ha) - a packet with commitment exists on A
- B: ConsensusProof(consensus[A](@Ha), Hb) - there is a consensus state on B (stored at Hb) for the same height as the commitment proof height (Ha)
- C: ConsensusProof(consensus[B](@Hb), Hc) - there is a consensus state on C (stored at Hc) for the same height as the previous chain consensus proof height (Hb)

At this point the relayer can build a Tx with MsgUpdateClient(header(@Hc)) for the client of C on D + MsgRecvPacket with all the above proofs (plus the connection proofs below).
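A rough Go-style sketch of the relaying sequence in this example; every helper (QueryProof, UpdateClient, HeaderOf, SubmitRecvPacket) and the chain/client/key names are illustrative placeholders, not hermes or ibc-go APIs:

```go
// Illustrative only: mirrors the A--B--C--D walkthrough above.
func relayMultihopPacket(chainA, chainB, chainC, chainD Chain, Ha, Hb, Hc Height) {
	// 1. proof of the packet commitment on A at the packet height Ha
	keyProof := QueryProof(chainA, packetCommitmentKey, Ha)

	// 2. update the client of A on B, then prove consensus[A](@Ha) stored on B at Hb
	UpdateClient(chainB, clientAonB, HeaderOf(chainA, Ha))
	consProofB := QueryProof(chainB, consensusStateKey(clientAonB, Ha), Hb)

	// 3. update the client of B on C, then prove consensus[B](@Hb) stored on C at Hc
	UpdateClient(chainC, clientBonC, HeaderOf(chainB, Hb))
	consProofC := QueryProof(chainC, consensusStateKey(clientBonC, Hb), Hc)

	// 4. update the client of C on D with header(@Hc) and submit MsgRecvPacket
	//    with the key proof plus the consensus (and connection) proofs
	UpdateClient(chainD, clientConD, HeaderOf(chainC, Hc))
	SubmitRecvPacket(chainD, keyProof, consProofB, consProofC /*, connection proofs */)
}
```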
I proposed some changes above, currentHeight -> consensusHeight and nextHeight -> proofHeight
prefix = counterpartyConnectionEnd.GetCounterparty().GetPrefix()
abortTransactionUnless(VerifyMultihopProof(k.cdc, consensusState, channel.ConnectionHops, proof, prefix, key, commitment))
} else {
    abortTransactionUnless(connection.verifyPacketData(
Was looking at what we're missing for multi-hop by not calling this (not using the client security model):
- clients on intermediate chains are not active
- not checking delay for packet delay feature
Same checks are skipped in other packet handlers.
Same for channel handlers.
Yes the active client is a great point. If an intermediate client gets frozen for misbehaviour, we won't detect it here. This could cause a proof to succeed against the forked consensus state. We should definitely be checking that all intermediate clients are active.
I mentioned the packet delay feature to the team. I think it makes sense to only allow packet_delay_time to be specified in this case, and for the intermediate consensus states to have their timestamps checked against the local clock of the final destination chain, rather than checking against all the intermediate BFT clocks.
I mentioned the packet delay feature to the team. I think it makes sense to only allow packet_delay_time to be specified in this case, and for the intermediate consensus states to have their timestamps checked against the local clock of the final destination chain. Rather than checking against all the intermediate BFT clocks
Ok just to clarify, with multihop we only take into account the last connection delay and the pseudocode above will change to perform some version of verifyDelayPeriodPassed, in addition to the checks you mention.
So a relayer would send all client updates, collect the proofs and, if the last connection delay is non-zero, wait for the delay of the last hop before sending the packet.
Regarding the checks you mention, we need to allow for clock drift and block time variations. There is also an accumulated delay for the client update on the destination chain since for every hop the relayer needs to wait for the client update to be sent and the consensus state to be stored on chain before we do the next chain update. So the timestamp of the consensus state on chain i may be blockTime(i) older than the consensus state on chain i+1. This may be even higher if the update is not included in the "next" block.
Note: even with zero delay, packets will be effectively relayed with a delay of sum(blockTime(i), i = 1..N-1)
Not sure if all this is accurate, just some thoughts and you may have a different view or I may miss something obvious. To clarify any confusion, could we add more English explanations to the specs to make things clear?
Ok just to clarify, with multihop we only take into account the last connection delay and the pseudocode above will change to perform some version of verifyDelayPeriodPassed, in addition to the checks you mention.
Or do we delay the packet/ client updates with the sum of the delays of all previous connections? Each hop delay is available to the destination chain so it could verify that consensus state timestamps of adjacent chains respect the connection delay.
Agreed that more English explanations in addition to the pseudocode would be very useful.
Ok just to clarify, with multihop we only take into account the last connection delay and the pseudocode above will change to perform some version of verifyDelayPeriodPassed, in addition to the checks you mention.
So a relayer would send all client updates, collect the proofs and, if the last connection delay is non-zero, wait for the delay of the last hop before sending the packet.
It would wait such that the client updates of all intermediate hops have at least passed the delay period specified at the final hop. Yes, I think in practice this would mean waiting for the last client update to pass the delay period most of the time. But the point is the delay period parameter specified on the connection end will be enforced against every intermediate consensus state using the BFT clock of the final proving chain.
i.e. it should fail if the last consensus state is past the delayPeriod but the first consensus state is not. This can happen if the later hops' clients were more recently updated by other relayers and only the first couple of hops needed to have updates filled in to complete the route.
Regarding the checks you mention, we need to allow for clock drift and block time variations. There is also an accumulated delay for the client update on the destination chain since for every hop the relayer needs to wait for the client update to be sent and consensus state to be stored on chain before we do the next chain update. So the timestamp of consensus state on chain i may be blockTime(i) older than the consensus state on chain i+1. This may be even higher if the update is not included in the "next" block.
I don't believe this is as relevant. Perhaps in order to route a packet, a relayer may have to submit client updates on some intermediate hops. But if all the hops are up-to-date, a relayer can simply query proofs from each chain and submit them. So the minimum delay is negligible, and the maximum delay as you said is sum(blockTime(i))
The purpose of the block delay period is to ensure that there is some grace period for detecting misbehaviour before we use the consensus state for proof verification.
The clock drift is not cumulative across the path, instead the effective clock drift on this connection will be the maximum drift between any two chains on this route.
So supposing we wanted to allow for a grace period of time n for relayers to submit misbehaviour against a consensus state, we would want the delay period to be something like:
delayPeriodTime = n + sumBlock(i) + max_clock_drift
In practice, I think in order for n to be a useful grace period at all, n >> sumBlock(i) + maxClockDrift.
So rather than explicitly accounting for this in the protocol, I think if a user wants delayPeriod protection they should just set the value to be large enough to account for any of the above factors. This should be noted briefly in the spec
Ahh wait, I just realized the delayPeriod is implemented to be enforced against time of addition not the actual timestamp of the consensus state. So it is not easy to do this in the multihop case...
Will have to think about it some more
Especially since we do not include time of addition inside the consensus state itself
This can happen if the later hops clients were more recently updated by other relayers and only the first couple hops needed to have updates filled in to complete the route.
This statement and other comments in the spec suggest that I am missing something very basic. How can this be? If you update a client at chain i before you update the client at chain i-1, the recursive proof at chain i will fail, no? Since the commitment root stored is for an app state on i-1 that did not include the expected commitment (consensus state for i-2).
I thought that the consensus state on chain i must be created after the one in i-1, otherwise the proof chain is broken. Is this not true? Also, the commitment proof on chain i-1 must be taken at the exact height as the height of the consensus state (that provides the commitment root) on chain i.
Ahh wait, I just realized the delayPeriod is implemented to be enforced against time of addition not the actual timestamp of the consensus state. So it is not easy to do this in the multihop case...
Will have to think about it some more
Especially since we do not include time of addition inside the consensus state itself
Ok but if:
- the destination chain N verifies the delay against the time it stored the consensus state for N-1, and
- it is true that consensusState[i] must have been stored after consensusState[i-1] for proofs to verify (see my last questions in ICS33: Multi-hop Channel Spec #882 (comment))

then the delayPeriod enforcement is also proven
commitmenttypes.GetSDKSpecs(),
consensusState.GetRoot(),
*connectionProof.PrefixedKey,
connectionProof.Value, // this should be from connectionHops
Not clear to me, do we check anywhere that the consensus states are from clients associated with the actual connections?
So, we check each connection is OPEN and matches the connections specified in the channel connectionHops field here.
We also check that each of those provided connectionEnds is proven within the provided consensus states here.
At a high level we prove:
- The connectionHops used to establish the channel match the connections provided in the multihop proof and are OPEN
- the connectionEnds are proven within the provided consensus states
- the consensus states are proven recursively starting from the trusted consensus state on the receiving chain.
Does that suffice?
Two clients may have the same consensus state. So I was looking for an explicit check that connectionEnd.ClientId is the same as the clientId for which the consensus proof was collected, i.e. same as the ID in its key path. The check would be something like:
consensusProofClientId = strings.Split(consensusProof.PrefixedKey.GetKeyPath()[len(consensusProof.PrefixedKey.KeyPath)-1], "/")..
abortTransactionUnless(connectionEnd.ClientId == consensusProofClientId)
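A small sketch of what that check could look like, assuming the standard ICS-24 consensus-state key layout (`clients/{clientID}/consensusStates/{height}`); the function name is illustrative:

```go
import "strings"

// consensusProofMatchesConnectionClient reports whether the client ID embedded in the
// consensus-state key path (e.g. "clients/07-tendermint-5/consensusStates/1-100")
// matches the ClientId recorded in the proven ConnectionEnd.
func consensusProofMatchesConnectionClient(consensusKeyPath, connectionClientID string) bool {
	parts := strings.Split(consensusKeyPath, "/")
	return len(parts) >= 2 && parts[0] == "clients" && parts[1] == connectionClientID
}
```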
Yeah, I think we will likely need to bundle clientStates with the multihop proof as well per offline discussion. In this case we should definitely verify the clientid of the clientstate matches the consensus/connection state.
@ancazamfir also, FYI, I'm not sure if you saw this. Here is the offline doc summarizing the review comments: https://docs.google.com/document/d/1lFxGZwapWANirgCO3NthsDGHgmBdSo_71alj14rAhnU/edit#
Feel free to drop comments there and we can enshrine the decision in GH.
thanks, i will take a look and provide comments there
To summarize the offline discussion and google doc collaboration:
- delayPeriod will be enforced by computing the maximum delayPeriod across all connections in the channel path and enforcing that delayPeriod on the final consensusState on the verifier. This should suffice because by construction all recursively embedded consensus states must have existed for at least as long as the final consensus state.
- Client states are NOT required to be bundled with the multi-hop proof. Instead a new chanCloseFrozen() method will be introduced to allow relayers to post a multi-hop proof of a frozen client state starting from the misbehaving chain in the channel path. The proof is not symmetric, but relayers can send a unique proof (representing the hops from each end to the misbehaving chain) to actively freeze the channel.
- Frozen channels can not process packets and a new FROZEN state could be introduced to help distinguish this situation.
- If a client is frozen in the channel path then it would not be possible to send newly committed packets as the consensus state associated with the frozen client would not be updatable making it impossible to create new multi-hop proofs.
Hopefully I represented that correctly and didn't miss anything.
I did a first pass for some files. We need to cleanup a bit and fix some consistency issues. I also think we need more clarity on the channel freeze section, left a couple of questions there.
abortTransactionUnless(height01 >= proofHeight) // ensure that chain0's client state is up to date

// query the key/value proof on the source chain at the proof height
keyProof, _ := chain0.QueryProofAtHeight([]byte(key), int64(proofHeight.GetRevisionHeight()))
is QueryProofAtHeight defined in any ICS?
It is part of the ibctesting framework (chain.go) though I'm not sure if it is defined in any ICS. Do you suggest explicitly defining it in this case?
Not really, I was curious if there is one. Maybe just declare the primitives in a prior section. This could also be simplified to pseudocode, something like chain0.QueryProofAtHeight(key, proofHeight) or QueryProofAtHeight(chain0, key, proofHeight)
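If the primitives were declared up front, a sketch might look like the following; the interface, names, and signatures are purely illustrative, not an existing ICS or ibctesting API:

```go
// Illustrative declarations for the query primitives the proof-generation pseudocode
// relies on; Height and ConsensusState stand in for the real spec types.
type ProofQuerier interface {
	// QueryProofAtHeight returns the value stored under key at proofHeight together
	// with a membership proof against this chain's state root at that height.
	QueryProofAtHeight(key []byte, proofHeight Height) (value []byte, proof []byte, err error)
	// GetClientStateHeight returns the latest height of the given client stored on this chain.
	GetClientStateHeight(clientID string) (Height, error)
	// GetConsensusState returns the consensus state stored under clientID at height.
	GetConsensusState(clientID string, height Height) (ConsensusState, error)
}
```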
function verifyMultihopMembership(
  connection: ConnectionEnd,
  height: Height,
  proof: MultihopProof,
  connectionHops: String[],
  key: String,
  value: ArrayBuffer) {
    multihopConnectionEnd = abortTransactionUnless(getMultihopConnectionEnd(proof))
Could we add some comments here and clarify in particular what connection is vs the multihopConnectionEnd.
done
channel.state = CLOSED
provableStore.set(channelPath(portIdentifier, channelIdentifier), channel)
}
``` | ||
The `chanCloseFrozen` function is called by a relayer to force close a multi-hop channel if any client state in the |
Can we have a separate pseudocode on how the proofs are built? I think they will look different depending on which channel end they are sent to. In the example A--B--xC--D (where x is the frozen client for B hosted on C), A cannot receive but could send while D cannot send but could receive.
I would expect then some logic below to determine if the client is for one of the connection hops or the counterparty of a connection hop. And basically find the C-B-A connection hops from the A--D channel connection hops when the close is received by A.
If I understand your example, the frozen client proof would look like this:
1. Multi-hop consensus/connection proof from C-->A. This is the same as a regular multi-hop proof without the key/value proof.
2. key/value proof on chain B. For a frozen client state we need to know the key of the frozen client state on chain B, but this is actually tricky since we don't know the client ID. This is why we need the multi-hop proof in step 1 to include the connection state on the misbehaving chain C. The connectionEnd counterparty contains the client id on chain B we can use to derive the key for the client state on chain B. The client state value is complex and is provided with the proof.

I realize we never talked about the implementation details for frozen client proofs, but do you see any issues with this approach? It is efficient in some sense because it allows re-using the same multi-hop proof struct with an additional index parameter for frozen clients and allows use of a single multi-hop proof. The index is used to generalize the multi-hop proof so that the key/value can be proven on any consensus state in the multi-hop channel path (in this case the client state proven on chain B instead of chain C).
Although similar to standard multi-hop proving, it does have some distinct differences. I'll add pseudo code for this.
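As a sketch of the key derivation described in step 2 (the field names follow ibc-go's ConnectionEnd and the ICS-24 client state path; the function itself is illustrative, not the spec's final pseudocode):

```go
// frozenClientStateKey derives the key of the (frozen) client state on chain B from the
// ConnectionEnd proven on the misbehaving chain C: the counterparty of that end names
// the client ID that chain B uses for chain C.
func frozenClientStateKey(connOnMisbehavingChain ConnectionEnd) string {
	clientIDOnB := connOnMisbehavingChain.Counterparty.ClientId // e.g. "07-tendermint-12"
	return "clients/" + clientIDOnB + "/clientState"            // ICS-24 client state path
}
```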
Not sure I understand 2. If we look at the chanCloseFrozen sent to A:
So for the case A--B--xC--D: The connection proof from 1 should have the B-end of the connection B--C. Once verified, we can get the client ID on C (for the B--C connection) from the counterparty of the B-end. That should match the client for the key/value proof.
In contrast let's take the case A--Bx--C--D: In this case the key/value proof should be for the client of the B-end of the connection and not the counterparty.
Whoops, sorry, my example didn't follow your original case - I was assuming that chainC was the misbehaving chain with the frozen client on chainB.
Yes, the way you explained it is exactly how I was thinking. I'll push the pseudo code shortly.
done
do we have a pseudocode on how the proofs are constructed?
Looks like FrozenProofGeneration is not specified.
Mostly reviewed and left final suggestions on the state machine code.
Haven't reviewed the proof generation code in great detail yet.
Great work!
}

// Return the maximum delay period across all connections in the channel path.
function getMaximumDelayPeriod(proof: MultihopProof, lastConnection: ConnectionEnd): number {
This should return two numbers: the delayPeriodBlock and the delayPeriodTime.
no problem, the block delay is computed below, but I'll refactor this to return timeDelay and blockDelay
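A hedged sketch of what that refactor could look like: the maximum time delay is taken across the final connection and every ConnectionEnd carried in the multihop proof, and the block delay is derived from an assumed expected block time (mirroring how ibc-go derives block delays); all types, field names, and helpers are placeholders:

```go
// getMaximumDelayPeriod returns the maximum time delay across all connections in the
// channel path, plus a block delay derived from expectedTimePerBlock (illustrative only).
func getMaximumDelayPeriod(proof MultihopProof, lastConnection ConnectionEnd, expectedTimePerBlock uint64) (timeDelay, blockDelay uint64) {
	timeDelay = lastConnection.DelayPeriod
	for _, connProof := range proof.ConnectionProofs {
		connEnd := UnmarshalConnectionEnd(connProof.Value) // placeholder unmarshal helper
		if connEnd.DelayPeriod > timeDelay {
			timeDelay = connEnd.DelayPeriod
		}
	}
	if expectedTimePerBlock != 0 {
		// round up so that any non-zero time delay implies at least one block
		blockDelay = (timeDelay + expectedTimePerBlock - 1) / expectedTimePerBlock
	}
	return timeDelay, blockDelay
}
```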
done
was this done?
))
"", counterpartyHops, counterpartyVersion}

if (connectionHops.length > 1) {
It would be really nice to abstract this away from the channel layer. Ideally ICS4 just calls the VerifyChannelState function and then the connection looks at the connectionHops and decides whether it needs to verify a MultiHopProof or a single proof.
Think we'll want that for sure in the implementation anyway.
agreed, there are a few annoying challenges to doing this in the actual implementation, but I'll see if I can simplify it.
I think the preferred way to do this might be to first standardize the calling interface for all the single-hop verification calls to collapse them to VerifyMembership/VerifyNonMembership and then add a switch statement in the implementation to handle multi-hop separate from single-hop, but this seems like non-trivial work.
height: Height,
proof: MultihopProof,
connectionHops: String[],
key: String) {
ditto
abortTransactionUnless(connection.state === OPEN)

// connectionEnd on misbehaving chain representing the counterparty connection on the next chain
let counterpartyConnectionEnd = proofFrozen.ConnectionProofs[0].Value
We also need to check that this connectionEnd is indeed in our connectionHops for this channel
Agree, it is checked in connection.verifyMultihopMembership which eventually checks the connection hops under the hood.
@@ -84,7 +84,7 @@ interface ChannelEnd {
- The `nextSequenceSend`, stored separately, tracks the sequence number for the next packet to be sent.
- The `nextSequenceRecv`, stored separately, tracks the sequence number for the next packet to be received.
- The `nextSequenceAck`, stored separately, tracks the sequence number for the next packet to be acknowledged.
- The `connectionHops` stores the list of connection identifiers, in order, along which packets sent on this channel will travel. At the moment this list must be of length 1. In the future multi-hop channels may be supported.
- The `connectionHops` stores the list of connection identifiers, in order, along which packets sent on this channel will travel. A list length greater than 1 indicates a multi-hop channel.
connectionHops appears in ChanOpenInit and ChanOpenTry. I think we should clarify that the order is starting from the receiving chain towards the other end, i.e. connectionHops[0] is the end on the chain receiving this message.
In this example:
A (a) --- (b1) B (b2) --- (c) C
for OpenInit we will have [a, b2]
for OpenTry we will have [c, b1]
See also this line that appears in ChanOpenInit and ChanOpenTry:
connection = provableStore.get(connectionPath(connectionHops[0]))
I would also suggest doing the same for proofs, i.e. reverting the order in the proof collection pseudocode (prepend instead of append) so they appear in the same order as the connectionHops.
At the moment in VerifyConnectionHops it seems that the connectionProofs are in the reverse order of the connectionHops.
Alternatively we can change the ChanOpenTry to have the connectionHops in this order in the example: [c, b1] and get the connection like this:
connection = provableStore.get(connectionPath(connectionHops[n-1]))
(lower preference for me)
No problem, added some info on the connectionHops and updated the proof ordering per your suggestion.
connection,
proofHeight,
proofInit,
connectionHops,
do we need this? it's been computed above as:
counterpartyHops = getCounterPartyHops(proofInit)
and we pass proofInit above.
Same question for chanOpenAck
Do you mean do we need the connectionHops argument if we already compute counterpartyHops? The connectionHops passed into ChanOpenTry is the connectionHops from Receiver --> Sender while the counterpartyHops are the computed hops from Sender --> Receiver used to create the expected channel for the proof.
oh sorry, i misread the code, all good!
// connectionEnd on misbehaving chain representing the counterparty connection on the next chain
counterpartyConnectionEnd := abortTransactionUnless(Unmarshal(proof.ConnectionProofs[0].Value))
At first look this doesn't seem to work imho. Could we have a section going through an example? I think it would be useful as the proof generation in general is complex and for this case in particular.
So again let's take this example with connection IDs showing. Assume the client used by bc is frozen:
A (a) --- (ba) B (bc) xx --- (cb) C (cd) --- (dc) D
So we call:
1. GenerateFrozenClientProof([B, A]) -> proofs will be sent to A
   - In this case A will get connection proofs [bc, a] and will take the client of bc.
2. GenerateFrozenClientProof([B, C, D]) -> proofs will be sent to D
   - D will get proofs for [cb, dc] but should use the client of the counterparty of cb.
So 1. and 2. don't look the same wrt proof collection logic. Same for verification in chanCloseFrozen.
Intuitively I was expecting some kind of indication of the "direction" being frozen, not sure we can generalize for both 1 and 2. But I might be proven wrong.
Yes, your assessment is correct. We made a decision to only allow the proof to work on the side with the frozen client (proof logic to A in step 1 from your example). We came to the same conclusion that it would be hard to generalize though this is certainly open for consideration.
The consequence of this choice is that in your example above only the channel on chain A could be forcefully frozen. In order for the channel on chain D to be frozen, dc would need to be frozen (although in this example it would not be necessary since chain D would already see the frozen client state).
So in order to freeze both ends of the channel, evidence must be submitted to freeze the client state on both sides of the misbehaving chain. This way the proof logic is always the same, and the tradeoff is that the channel can only be frozen on the side of the channel path containing the frozen client.
Thoughts?
(A clarification that in my example those IDs were for connections but yes, we have similar for clientIDs)
When we discussed this offline, I was asking the same question and feedback from @AdityaSripal was that we should close both ends. I agree with that, we should close both ends.
We cannot freeze a client on the other "side" as that client/connection may be valid and used by other single or multi-hop channels.
I think we can make this function work if we add an explicit flag, e.g. i (incoming), to the last proof. By default proofs are o, i.e. outgoing.
(Note that I flipped the proofs below as well, i.e. they can be verified from first to last)
So for 2. the proof could be something like [dc, cb, bc(i)] vs for 1. [a, bc]
If the last proof is marked i then the client on the incoming/counterparty connection end is frozen, i.e. bc must be for the counterparty of cb.
I have one question with this approach. If ClientStateC on ChainB (represented as bc) is frozen, would that be provable on ConsensusStateB on ChainC? Once the client is frozen there will not be any more consensus updates, so would ConsensusStateB be frozen just prior to the freeze or would that state be included in the state?
That said, I went back and looked at the implementation for the ChanCloseFrozen proof, and I actually think the existing logic can support channel freezing on both ends of the channel with very small changes by using the new KeyProofIndex field of the MultihopProof struct to also indicate the directionality of the proof. Going back to your example above:
A (a) --- (ba) B (bc) xx --- (cb) C (cd) --- (dc) D
- For A, its multihopConnectionEnd is bc. The ClientID for ClientBC should be multihopConnectionEnd.Counterparty.ClientID and we can use this logic to determine the client state key if the KeyProofIndex != 0
- For D, its multihopConnectionEnd is cb. The ClientID for ClientBC should be multihopConnectionEnd.ClientID and we can use this to determine the key for the client state if KeyProofIndex == 0
KeyProofIndex is a new field introduced to keep the multi-hop verification logic the same while allowing verification of keys within intermediate consensus states in the proof chain.
I have one question with this approach. If ClientStateC on ChainB (represented as bc) is frozen, would that be provable on ConsensusStateB on ChainC? Once the client is frozen there will not be any more consensus updates, so would ConsensusStateB be frozen just prior to the freeze or would that state be included in the state?

Not sure I understand. If the client for bc is frozen at height hb we can still update the client for cb on C with a consensus state of chain B at a height hb', hb' > hb. Then we can query B's client for bc with proof at height hb'.

That said, I went back and looked at the implementation for the ChanCloseFrozen proof, and I actually think the existing logic can support channel freezing on both ends of the channel with very small changes by using the new KeyProofIndex field of the MultihopProof struct to also indicate the directionality of the proof. Going back to your example above:
A (a) --- (ba) B (bc) xx --- (cb) C (cd) --- (dc) D

- For A, its multihopConnectionEnd is bc. The ClientID for ClientBC should be multihopConnectionEnd.Counterparty.ClientID and we can use this logic to determine the client state key if the KeyProofIndex != 0

shouldn't it be multihopConnectionEnd.ClientID in this case? and KeyProofIndex == 0

- For D, its multihopConnectionEnd is cb. The ClientID for ClientBC should be multihopConnectionEnd.ClientID and we can use this to determine the key for the client state if KeyProofIndex == 0

shouldn't it be multihopConnectionEnd.Counterparty.ClientID and KeyProofIndex != 0

KeyProofIndex is a new field introduced to keep the multi-hop verification logic the same while allowing verification of keys within intermediate consensus states in the proof chain.

So iiuc, KeyProofIndex is non-zero in case 2 of proofs for closing channels due to an on-path client being frozen.
I think both sides should be frozen using their respective clients. So in the example you have above, C is malicious.
So you use the frozen client (bc) to freeze the channel on A. And as you already mentioned, D will be automatically disabled when (dc) is frozen.
If C is malicious, both (bc) and (dc) can be frozen with the same misbehaviour.
Let's look at a more complicated example.
A (ab) --- (ba) B (bc) xx --- (cb) C (cd) --- (dc) D (de) --- (ed) E
Now suppose again that chain C has become malicious.
Here, I would say that a relayer should first freeze the clients of C on the direct counterparties. Namely, (bc) and (dc) should be frozen.
Then the channel should be closed on both ends using a MultiHop Proof
We need a proof for A that (bc) is frozen and a proof for E that (dc) is frozen.
I don't think it makes sense to prove the same client for both sides.
We want to close both sides, but for each end we should use the client that is relevant to that end
I was thinking about this when reviewing and got stuck exactly here...

Here, I would say that a relayer should first freeze the clients of C on the direct counterparties. Namely, (bc) and (dc) should be frozen.

hermes currently submits evidence only once a client is updated. If the bc client is updated (by some other relayer) hermes compares the header used in the update with the header it gets from the C full node it connects to. Then it submits MsgMisbehaviour but to bc only.
The relayer could try to freeze all clients of C as you point out. But there is no guarantee that this works; in the example, the header that updated bc might not be verifiable on the dc client because the clients may have different security parameters (specifically the trusting level). So in theory we have a small probability that the channel on E will not be closed because we cannot freeze dc.
Can we live with this in practice? It would simplify the spec quite a bit, we could eliminate KeyProofIndex, etc.
Per the later discussion, I will keep the logic as is. Channel freezing will require a relayer to freeze the client on each side of the channel and generate a separate frozen client proof for each channel end.
5. Verify the intermediate state proofs. Starting with known `ConsensusState[N-1]` at the given `proofHeight` on `ChainN` prove the prior chain's consensus and connection state.
6. Verify that the client id in each consensus state proof key matches the client id in the ConnectionEnd in the previous connection state proof.
7. Repeat step 3, proving `ConsensusState[i-1]`, and `Conn[i-1,i]` until `ConsensusState[0]` on the source chain is proven.
   - Start with i = N-1
   - ConsensusState <- ConsensusProof[i-1].Value on Chain[i]
   - ConnectionEnd <- ConnectionProof[i-1].Value on Chain[i]
   - Verify ParseClientID(ConsensusProofs[i].Key) == ConnectionEnd.ClientID
   - ConsensusProofs[i].Proof.VerifyMembership(ConsensusState.GetRoot(), ConsensusProofs[i].Key, ConsensusProofs[i].Value)
   - ConnectionProofs[i].Proof.VerifyMembership(ConsensusState.GetRoot(), ConnectionProofs[i].Key, ConnectionProofs[i].Value)
   - Set ConsensusState from unmarshalled ConsensusProofs[i].Value
8. Finally, prove the expected channel or packet commitment in `ConsensusState[0]`
These comments should be updated to reflect that first proof is for the receiving chain and last for the source (or misbehaving) chain.
Amazing job! A lot of thought, tedious work, and patience went into this complex spec, really appreciate it!
Excellent work to everyone involved!! 🎉 🎉 🎉
Adds a spec for multi-hop channels. Viewable at https://github.com/polymerdao/ibc/tree/bo/ics-32/spec/core/ics-033-multi-hop.
Opening for discussion.
TODOs: