Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Demanding RPC request fails #17016

Closed
andrewheadricke opened this issue Jun 18, 2018 · 9 comments
Closed

Demanding RPC request fails #17016

andrewheadricke opened this issue Jun 18, 2018 · 9 comments

Comments

@andrewheadricke
Copy link

andrewheadricke commented Jun 18, 2018

Geth version: 1.8.11-stable
OS & Version: Linux

Expected behaviour

Return a list of events

Actual behaviour

Fails with error:

ERROR[06-18|21:06:33] write tcp 127.0.0.1:8545->127.0.0.1:46102: i/o timeout

I have a js/web3 script which does an event.get() on a token address which returns potentially hundreds of thousands of events.. It used to work fine in both Geth and Parity.. however I just tried with geth 1.8.11 and geth is giving an error

ERROR[06-18|21:06:33] write tcp 127.0.0.1:8545->127.0.0.1:46102: i/o timeout

and causing the script to fail due to returning zero events.

The script still works fine using Parity.

Has some sort of timeout on long running queries been implemented?

@Beasta
Copy link

Beasta commented Jun 18, 2018

I am having the same problem when running a script. I get the i/o timeout in the geth logs and the script returns this:

Error: Invalid JSON RPC response: "" 

@ryanschneider
Copy link
Contributor

Possibly related: #16880

What sort of timeouts are you seeing? After 5~10 seconds, or after 2 minutes?

Can you include the output of time curl ... of the same query on a working pre-.11 node?

@andrewheadricke
Copy link
Author

My query takes about 35-40 seconds, Geth is timing out at 20 seconds.

@Beasta
Copy link

Beasta commented Jun 20, 2018

The error for mine is very erratic and occurs sometimes quickly ( a few seconds), sometimes after a while, and sometimes not at all (takes ~30 seconds to complete).

I am running a large machine with >100 peers and have LightServ setting set to 90. Perhaps the inconsistency is due to varying amounts of LightServ activity?

@RaghavSood
Copy link
Contributor

We are seeing similar behaviour on 1.8.11 with personal_sendTransaction over RPC (same machine). It randomly fails with an EOF, and is noticeably slower than older versions.

It's a full, non-archival node, no LES.

@mikeshultz
Copy link

Sep 12 21:34:54 ip-w-x-y-z geth[16288]: ERROR[09-12|21:34:54.329] write tcp w.x.y.z:8080->a.b.c.d:52628: i/o timeout

Seen in 1.8.15 with eth_getLogs calls. Size of expected response varies. Really small block ranges will return results quickly but the range ends up being so small as to make this method almost unusable.

If I hammer it repeatedly, it does actually seem to return data eventually, however. It's hard to pin down. The same request sent repeatedly in succession will sometimes return and sometimes not without any noticeable pattern.

As requested from @ryanschneider, here's some timed calls. The response time doesn't always go down, but varies. Generally they timeouts are between 40-50s.

[mike@megan ~]$ time curl -w "%{http_code}" --data-binary '{"jsonrpc":"2.0","id":"666","method":"eth_getLogs","params": [{"address":"0x3818dC501b641905C83ebfF2ACd9f78fA6BD1341","fromBlock":"0x53ec60","toBlock":"latest"}]}' -H 'content-type:application/json' "https://mynode.net/";
<html>
<head><title>502 Bad Gateway</title></head>
<body bgcolor="white">
<center><h1>502 Bad Gateway</h1></center>
</body>
</html>
502
real    0m53.244s
user    0m0.015s
sys 0m0.012s
[mike@megan ~]$ time curl -w "%{http_code}" --data-binary '{"jsonrpc":"2.0","id":"666","method":"eth_getLogs","params": [{"address":"0x3818dC501b641905C83ebfF2ACd9f78fA6BD1341","fromBlock":"0x53ec60","toBlock":"latest"}]}' -H 'content-type:application/json' "https://mynode.net/";
<html>
<head><title>502 Bad Gateway</title></head>
<body bgcolor="white">
<center><h1>502 Bad Gateway</h1></center>
</body>
</html>
502
real    0m46.457s
user    0m0.022s
sys 0m0.004s
[mike@megan ~]$ time curl -s -w "%{http_code}" --data-binary '{"jsonrpc":"2.0","id":"666","method":"eth_getLogs","params": [{"address":"0x3818dC501b641905C83ebfF2ACd9f78fA6BD1341","fromBlock":"0x53ec60","toBlock":"latest"}]}' -H 'content-type:application/json' "https://mynode.net/";
<html>
<head><title>502 Bad Gateway</title></head>
<body bgcolor="white">
<center><h1>502 Bad Gateway</h1></center>
</body>
</html>
502
real    0m38.434s
user    0m0.020s
sys 0m0.007s

@mikeshultz
Copy link

mikeshultz commented Oct 9, 2018

Saw Peter pointed out a potential solution on Reddit today that you guys might be interested in. I can't say why this started in a specific version(perhaps a performance issue?) but adjusting the write timeout allowed these requests to go through for me with 1.8.17(PR was merged for 1.8.13+).

Even with the adjusted timeouts, the requests generally respond before the default timeout. Though maybe that's just subjective observation with caching.

Here's the relevant section of my TOML:

[Node.HTTPTimeouts]
ReadTimeout = 30000000000
WriteTimeout = 120000000000
IdleTimeout = 120000000000

More details in this PR: #17240

@fjl
Copy link
Contributor

fjl commented Oct 10, 2018

BTW, you can use duration units in the config file :).

@stale
Copy link

stale bot commented Oct 11, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot closed this as completed Nov 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants