-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Client waiting endlessly for info from unresponsive server #634
Comments
@soulrebel What node.js and ioredis versions? |
@soulrebel Is it possible you're seeing #633 ? |
The default |
@brettkiefer Node v 6.0.11 and ioredis 3.2.2. Not the same bug. @luin Retrying forever is ok, but I have a log on the error event and it never fires in this case. It simply waits for the server to respond, but that never happens and there are no retries. I think something like this could happen anytime a server goes offline while processing a command. The reply will never come, the socket will not timeout since the outgoing data is sent and ioredis will wait indefinitely. |
I came across the same problem, and I agree that the client can retry to connect the server on its own. But should this kind of retrying be separated from every individual client's request? Every request can has its own timeout and throw a timeout error: try {
await redis.keys('user*')
} catch (err) {
// err: redis request timeout for 3000ms
} Otherwise the program will be forever pending, and we have no way to kill this kind of pending. |
@livoras I don't believe that for 99.999% of the cases you would need different timeouts for different commands. I think that a possible solution is to use two different timeouts for connections:
By timing out on data traffic at the connection level, you don't require the entire command to finish before the timeout, but just the server to start sending data, so it should not affect big resultsets. |
@luin I could have the time to work on this. I just would like to know what would be an acceptable design so that it can be merged. |
See also #61 which seems related. |
…er command #634, #61 BREAKING CHANGE: The maxRetriesPerRequest is set to 20 instead of null (same behavior as ioredis v3) by default. So when a redis server is down, pending commands won't wait forever until the connection become alive, instead, they only wait about 10s (depends on the retryStrategy option)
@elyobo Not really, this scenario does not trigger retries |
You're right, I misread the issue. |
|
The failure happens when the socket is connected but the server is not responding. Probably the OS will timeout at some point, but it appears that it will take 15 minutes by default on Linux to give up on that (https://pracucci.com/linux-tcp-rto-min-max-and-tcp-retries2.html) |
@soulrebel What about implementing a heartbeat strategy that sending |
I have a naive idea, what about simply using |
@luin I am not sure that would be enough. You can't send pings while waiting for responses to other commands (AFAIK) and you keep the connection busy and damage performance and you don't fix the issue if it happens during a non-ping command. @livoras That could work, although it is not that easy to clean up afterwards. I am intrigued by the possibility to setup a socket timeout during commands: in this case you should get a network error and the socket should be cleaned up automatically. In case of success, you just set a new timeout to 0 to disable. Since it's all handled by Node or the OS, based on TCP progress, this would also work well for cases where the response is coming, but slowly. I just do not know if this is a legit way to use sockets. |
I think part of the problem is that the RESP (Redis Protocol) is really basic and most of the commands are blocking. I don't believe that there is any kind of sequence/transaction/request ID used. So if the client timed out and issued another command, then the response to the previous command could first be received. I'm not sure how that would be handled - probably not well if there is no transaction ID. It might be safer that if a time out occurs to simply terminate the connection and reconnect. It would be cleaner if ioredis set a timeout on every command issued and if the timeout expired, return an error (if callback used) or reject (if promise used), terminate the connection and automatically attempt to reconnect. |
Added a pull request trying to address this issue: #658. What's your thoughts? |
Seems like the referenced PR was abandoned. What's the current status on handling unresponsive servers? |
Would be great to get this in. As it stands, when we terminated an instance running redis, ECS still tries to connect to it. We are using Redis Sentinel though. |
Still having the same issue. Is there any news? It's almost been a year since the last post here |
Adding timeout features to ioredis seems to be a difficult (or unwanted) task, although it is needed in some cases. I solved this by using |
isn't this already resolved by #1320 |
It appears that in some instances the connection to the server is established but the server will not respond (we are running AWS lambdas and elasticache, which is not really a good marriage).
In this scenario the client is stuck waiting for a response and queuing commands but never making progress or giving up and retrying the connection. Things just halt forever, since no new traffic is sent to detect that the connection is lost.
It can be reproduced by opening a port with netcat and just ignoring incoming traffic (
nc -l <port>
).Could a socket timeOut be set up to handle this while waiting for responses?
The text was updated successfully, but these errors were encountered: