ibc-transfer packet acknowledgements are broken #423

Closed
michaelfig opened this issue Feb 11, 2021 · 10 comments

@michaelfig
Contributor

@colin-axner writes:

cc @michaelfig when merged, this should fix any trusted headers errors (since it will now query from the connected light client)

I tested #416 between my cosmos-sdk v0.41.0 chains (with --pruning=nothing), waiting until the genesis.app_state.staking.params.historical_entries value of 100 was long past, and I got a lot of consensus state failures:

$ rly tx transfer ibc0 ibc1 100n0token $(rly chains address ibc1)
[...]
I[2021-02-11|14:50:08.839] • [ibc1]@{492} - actions(0:update_client,1:recv_packet) hash(6AC12E40A1D0DA4A92685A102525B6A090D6B521A6CE502D1120FD7C6F868747) 
I[2021-02-11|14:50:09.356] - [ibc1]@{491} - try(1/5) query packet acknowledgement: portID (transfer), channelID (channel-0), sequence (1): invalid acknowledgement 
I[2021-02-11|14:50:10.672] - [ibc1]@{492} - try(2/5) query packet acknowledgement: portID (transfer), channelID (channel-0), sequence (1): invalid acknowledgement 
failed to execute message; message index: 1: acknowledge packet verification failed: packet acknowledgement verification failed: failed packet acknowledgement verification for client (07-tendermint-0): consensus state does not exist for height 0-493: consensus state not found: invalid request
failed to execute message; message index: 1: acknowledge packet verification failed: packet acknowledgement verification failed: failed packet acknowledgement verification for client (07-tendermint-0): consensus state does not exist for height 0-493: consensus state not found: invalid request
failed to execute message; message index: 1: acknowledge packet verification failed: packet acknowledgement verification failed: failed packet acknowledgement verification for client (07-tendermint-0): consensus state does not exist for height 0-493: consensus state not found: invalid request
failed to execute message; message index: 1: acknowledge packet verification failed: packet acknowledgement verification failed: failed packet acknowledgement verification for client (07-tendermint-0): consensus state does not exist for height 0-493: consensus state not found: invalid request
failed to execute message; message index: 1: acknowledge packet verification failed: packet acknowledgement verification failed: failed packet acknowledgement verification for client (07-tendermint-0): consensus state does not exist for height 0-493: consensus state not found: invalid request
E[2021-02-11|14:50:21.076] ibc0: err(failed to send packets, see above logs for details) 

IOW, the packet was relayed correctly but the ack was not returned. This is between two cosmos-sdk application/ibc-transfer instances, listening on port transfer. This phenomenon replicates what I found in my testing of dynamic IBC.

BTW, after this is sorted out, should we also encourage Gaia to default to the (old) 100 value for historical_entries instead of 10000?
https://github.com/cosmos/gaia/blob/7756f45641f6ee8cfabf83bfac85b688bae3713f/app/migrate.go#L164
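
For anyone wanting to check which value a chain was started with, here is a minimal sketch that reads the parameter from an already-parsed genesis document. The key path is the one quoted in this post; the function name `historical_entries` and the inline sample are illustrative assumptions, not part of any SDK or relayer tooling.

```python
import json


def historical_entries(genesis):
    """Return staking.params.historical_entries from a parsed genesis dict."""
    value = genesis["app_state"]["staking"]["params"]["historical_entries"]
    # Accept either a JSON number or a numeric string.
    return int(value)


if __name__ == "__main__":
    # Inline sample instead of a real file (assumption: same key layout as
    # the path quoted in the post); 100 is the value mentioned there.
    sample = json.loads(
        '{"app_state": {"staking": {"params": {"historical_entries": 100}}}}'
    )
    print(historical_entries(sample))
```

To use it on a real chain, load the node's genesis file with `json.load` and pass the resulting dict in.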

@colin-axner
Contributor

@michaelfig sorry for the confusion, but #416 had a bug, and #421 fixed it. Do you still see this error on master? (It's the same as the reported bug, so I suspect it is fixed.)

@colin-axner
Contributor

BTW, after this is sorted out, should we also encourage Gaia to default to the (old) 100 value for historical_entries instead of 10000?

We could reduce the number to 1000. This would give an hour for the connection handshake to complete. 100 might still be a little low: connection handshakes would have to be completed within 10 minutes of being initiated.

@colin-axner
Contributor

I'm noticing some indeterminacy in the tests (though it has been like this for a while). I suspect there is some very subtle bug. Let me know if you continue to run into issues using master. I will cut a v0.8.0 when I think the code has stabilized a little.

@colin-axner
Contributor

colin-axner commented Feb 12, 2021

I think I know the issue and will try to post a fix. This is also something that can be remedied using auto updates (which have now been merged).

See #424; there may be remaining bugs, but this should help.

Edit: #425 is needed to stabilize the relayer. This was an existing problem; the recent refactors just helped reveal it to me.

@michaelfig
Contributor Author

BTW, after this is sorted out, should we also encourage Gaia to default to the (old) 100 value for historical_entries instead of 10000?

We could reduce the number to 1000. This would give an hour for the connection handshake to complete. 100 might still be a little low: connection handshakes would have to be completed within 10 minutes of being initiated.

I thought historical_entries was no longer used. Can you explain the dependency? And do you mean "channel handshakes" or "connection handshakes"?

@michaelfig
Contributor Author

michaelfig commented Feb 12, 2021

I'm noticing some indeterminacy in the tests (though it has been like this for a while). I suspect there is some very subtle bug. Let me know if you continue to run into issues using master.

Please tag me when you'd like me to test again.

Current master (81dd43c):

I[2021-02-12|09:48:48.885] • [ibc0]@{817} - actions(0:transfer) hash(BF2063C7192DE89195B929AE088AA754885AE851B3ADA1624625A176571FF173) 
I[2021-02-12|09:48:49.888] - [ibc0]@{816} - try(1/5) query packet commitment: portID (transfer), channelID (channel-0), sequence (1): packet commitment not found 
failed to execute message; message index: 1: receive packet verification failed: couldn't verify counterparty packet commitment: failed packet commitment verification for client (07-tendermint-0): consensus state does not exist for height 0-818: consensus state not found: invalid request
failed to execute message; message index: 1: receive packet verification failed: couldn't verify counterparty packet commitment: failed packet commitment verification for client (07-tendermint-0): consensus state does not exist for height 0-818: consensus state not found: invalid request
E[2021-02-12|09:48:53.726] ibc0: err(light client: can't open light client database: resource temporarily unavailable) 
I[2021-02-12|09:48:53.726] • [ibc0]@{817} - actions(0:transfer) hash(BF2063C7192DE89195B929AE088AA754885AE851B3ADA1624625A176571FF173) 
failed to execute message; message index: 1: receive packet verification failed: couldn't verify counterparty packet commitment: failed packet commitment verification for client (07-tendermint-0): consensus state does not exist for height 0-818: consensus state not found: invalid request
rpc error: code = NotFound desc = account agoric12lr9fggsnu2ysqzjxscwlumumn065dpgdwpq0t not found: key not found
rpc error: code = NotFound desc = account agoric12lr9fggsnu2ysqzjxscwlumumn065dpgdwpq0t not found: key not found
failed to execute message; message index: 1: receive packet verification failed: couldn't verify counterparty packet commitment: failed packet commitment verification for client (07-tendermint-0): consensus state does not exist for height 0-818: consensus state not found: invalid request
rpc error: code = NotFound desc = account agoric12lr9fggsnu2ysqzjxscwlumumn065dpgdwpq0t not found: key not found
E[2021-02-12|09:49:00.040] ibc0: err(light client: can't open light client database: resource temporarily unavailable) 
rpc error: code = NotFound desc = account agoric12lr9fggsnu2ysqzjxscwlumumn065dpgdwpq0t not found: key not found
failed to execute message; message index: 1: receive packet verification failed: couldn't verify counterparty packet commitment: failed packet commitment verification for client (07-tendermint-0): consensus state does not exist for height 0-818: consensus state not found: invalid request
E[2021-02-12|09:49:01.264] ibc1: err(failed to send packets, see above logs for details) 
rpc error: code = NotFound desc = account agoric12lr9fggsnu2ysqzjxscwlumumn065dpgdwpq0t not found: key not found
E[2021-02-12|09:49:04.474] ibc1: err(failed to send packets, see above logs for details) 

@colin-axner
Contributor

I thought historical_entries was no longer used. Can you explain the dependency? And do you mean "channel handshakes" or "connection handshakes"?

Consensus states are objects stored on chain. Historical entries are not used, but each time we need to prove something on chain, we need to do so against a specific consensus state. Much of the code in this codebase separates construction of an update message from construction of the proof. This is problematic, since the proof needs to be constructed against the update message. #425 would fix this, but #424 might fix some issues in the short term.
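
The mismatch described above can be illustrated with a toy model. This is only a sketch, not the relayer's code: `ToyClient`, `update`, `verify`, and the two `relay_*` helpers are hypothetical names, and real IBC verification checks Merkle proofs rather than doing a dictionary lookup. The heights merely echo the numbers in the logs earlier in this thread. The point is just that a proof built at a height other than the one the accompanying client update stored has no consensus state to be checked against.

```python
class ConsensusStateNotFound(Exception):
    """Raised when no consensus state is stored for the proof height."""


class ToyClient:
    """Hypothetical stand-in for an on-chain light client (not a real API)."""

    def __init__(self):
        # height -> consensus state (here just a placeholder string)
        self.consensus_states = {}

    def update(self, height):
        # An update_client message stores a consensus state at one height.
        self.consensus_states[height] = f"state@{height}"

    def verify(self, proof_height):
        # A proof can only be checked against a stored consensus state.
        if proof_height not in self.consensus_states:
            raise ConsensusStateNotFound(
                f"consensus state does not exist for height {proof_height}"
            )
        return True


def relay_separately(client, update_height, proof_height):
    # Update and proof are built independently, so their heights can drift.
    client.update(update_height)
    return client.verify(proof_height)


def relay_together(client, height):
    # The proof is built against the same height the update commits to.
    client.update(height)
    return client.verify(height)


if __name__ == "__main__":
    try:
        relay_separately(ToyClient(), update_height=492, proof_height=493)
    except ConsensusStateNotFound as err:
        print("separate construction failed:", err)
    print("joint construction ok:", relay_together(ToyClient(), 493))
```

In this model the fix is structural: derive the proof height from the update that is about to be sent, rather than querying each independently.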

@colin-axner
Contributor

A word of warning: retries for packet relaying will likely fail. I fixed this for handshakes, but relaying retries will require a deeper refactor. Simply rerunning the command should work.
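
One way to read this advice, as a sketch only: a retry that resends an already-built message keeps whatever proof height it was built with, whereas rerunning the command rebuilds everything from the current chain state. The helper below assumes that a stale message is the cause of the failed retries, which this thread suggests but does not state outright. `run_with_fresh_build`, `build`, and `send` are hypothetical names, not relayer functions.

```python
def run_with_fresh_build(build, send, attempts=3):
    """Call build() before every send() so no attempt reuses a stale message.

    build and send are caller-supplied callables (hypothetical here);
    send(msg) should raise an exception on failure.
    """
    last_error = None
    for _ in range(attempts):
        msg = build()  # rebuilt from current state on every attempt
        try:
            return send(msg)
        except Exception as err:  # broad on purpose: this is only a sketch
            last_error = err
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


if __name__ == "__main__":
    chain_height = [492]

    def build():
        # Pretend the chain advances one block between attempts.
        chain_height[0] += 1
        return {"proof_height": chain_height[0]}

    def send(msg):
        # Pretend only messages built at height 494 or later are accepted.
        if msg["proof_height"] < 494:
            raise ValueError("consensus state not found")
        return msg["proof_height"]

    print(run_with_fresh_build(build, send))
```

The design choice is that the message is never cached across attempts; each attempt behaves like rerunning the command from scratch.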

@michaelfig
Contributor Author

See #424; there may be remaining bugs, but this should help.

I tried 067f477, and it worked! Thank you so much. Now I'll go on and test out dIBC.

@colin-axner
Contributor

Awesome! @michaelfig that's great to hear. Closing this for now, since #425 should hopefully tackle the major issues remaining.
