RPC nodes crash on query txs #11157

Closed
4 tasks
Jedi2002 opened this issue Feb 10, 2022 · 5 comments

Comments

@Jedi2002

Summary of Bug

Querying txs crashes RPC nodes. The same behaviour can be seen on full nodes of different blockchains.

Version

v0.43

Steps to Reproduce

terrad query txs --events message.action='/ibc.core.client.v1.MsgUpdateClient' --node http://terra.sifchain.finance:26657 --output json --page 1

sifnoded query txs --events message.action='/ibc.core.client.v1.MsgUpdateClient' --node https://rpc.sifchain.finance --page 1


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@tac0turtle
Member

What is the error? Also, do you encounter this only on IBC queries, or on all of them?

@alexanderbez
Contributor

Please post logs 🙏

@gzukel

gzukel commented Feb 11, 2022

@alexanderbez There are no logs that I can see from the output.

terrad query txs --events message.action='/ibc.core.client.v1.MsgUpdateClient' --node http://terra.sifchain.finance:26657/ --output json --page 1

terrad query txs --events message.action='/ibc.core.client.v1.MsgUpdateClient' --node http://terra.sifchain.finance:26657/ --output json --limit 1

terrad query txs --events message.action='/ibc.core.client.v1.MsgUpdateClient' --node http://terra.sifchain.finance:26657 --output json --height 3948579

I've tried similar commands with Sifnode, Osmosis, and Juno.

The only one to work so far has been Juno, and only intermittently. What is certain is that the query kills the RPC endpoint. The node itself stays running, but the RPC crashes when this query is run.

I believe that is why there are no logs for it. I get the same errors from public URLs as well.

Below are the logs from the Terra node before it was killed by K8s for failing to respond on its RPC endpoint after 30 minutes of being down. I had extended the liveness check timeout to account for what might just be a long-running HTTP pull.

5:01PM ERR dialing failed (attempts: 1): dial tcp 3.14.82.67:26656: i/o timeout addr={"id":"4aee053268a227d97856129b6abe1970dab02ab6","ip":"3.14.82.67","port":26656} module=pex
5:01PM ERR dialing failed (attempts: 1): dial tcp 52.209.106.136:26656: i/o timeout addr={"id":"2e7c7bc133aa9f7642051466887cce1dde354aa0","ip":"52.209.106.136","port":26656} module=pex
5:02PM ERR dialing failed (attempts: 3): auth failure: secret conn failed: read tcp 10.0.57.103:52630->3.70.86.72:26656: i/o timeout addr={"id":"267962fadd05576299a4b9752caf8ada073d77bf","ip":"3.70.86.72","port":26656} module=pex
5:02PM ERR dialing failed (attempts: 1): dial tcp 15.165.158.42:26656: i/o timeout addr={"id":"d7cffa96a19d5e83fc694dec6ec6c19adddf7dcc","ip":"15.165.158.42","port":26656} module=pex
5:03PM ERR dialing failed (attempts: 2): auth failure: secret conn failed: read tcp 10.0.57.103:60454->54.176.230.251:26656: i/o timeout addr={"id":"0e3bbc050974ac14808a7d870f5a262954ff2cd4","ip":"54.176.230.251","port":26656} module=pex

When I run this against sifnode, the RPC doesn't die, but the node stops processing blocks and just hangs. You can call the RPC endpoint and it returns the status object, but the blocks aren't being updated.
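
The hang described above (status still answers, block height no longer advances) could be detected with something like the following. This is only a minimal sketch, assuming the node exposes the standard Tendermint RPC `/status` endpoint with a `latest_block_height` field in its JSON response. The function names `extract_height`, `is_stalled`, and `check_node`, and the 30-second default interval, are hypothetical and not taken from this thread.

```shell
#!/bin/sh
# Sketch only: helper names and the default interval are assumptions.

# Read a /status JSON response on stdin and print latest_block_height.
# The value is matched with or without surrounding quotes.
extract_height() {
    sed -n 's/.*"latest_block_height"[[:space:]]*:[[:space:]]*"\{0,1\}\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed (exit 0) when the second height sample has not moved past the first.
is_stalled() {
    [ "$2" -le "$1" ]
}

# Sample the height twice, $2 seconds apart (default 30), and report.
# Returns 0 if advancing, 1 if stalled, 2 if the RPC could not be read.
check_node() {
    rpc_url=$1
    interval=${2:-30}
    first=$(curl -s --max-time 10 "$rpc_url/status" | extract_height)
    sleep "$interval"
    second=$(curl -s --max-time 10 "$rpc_url/status" | extract_height)
    if [ -z "$first" ] || [ -z "$second" ]; then
        echo "RPC unreachable or response not understood"
        return 2
    fi
    if is_stalled "$first" "$second"; then
        echo "stalled at height $second"
        return 1
    fi
    echo "advancing: $first -> $second"
    return 0
}
```

After sourcing the file, `check_node http://localhost:26657 30` would sample a local node (the URL is an example). Unlike a plain liveness probe, this separates "RPC is down" (return code 2) from "RPC answers but blocks are not advancing" (return code 1), which are the two different failure modes reported for Terra and Sifnode.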

This was all working before. I used this same script and method to query the stuck IBC packets and trace transactions to the packet sequence ID on the channels being used. That way we could understand the money at risk for any stuck transaction.

terrad query txs --events message.action='/ibc.core.client.v1.MsgUpdateClient' --node http://terra.sifchain.finance:26657 --output json --height 3948579
Error: post failed: Post "http://terra.sifchain.finance:26657": EOF

No logs, just a locked node until it gets restarted because the RPC is down.

This only started happening after the IBCV2 upgrade.

@alexanderbez
Contributor

Interesting. So maybe there is contention or a deadlock somewhere? @marbar3778 do you have any ideas?

@amaury1093
Contributor

@Jedi2002 Can you explain why you are closing this issue (for posterity)? Did you find out where the panic came from?
