hiredis client hang at redisBufferRead, with timeout option and keepalive option set #593

Closed
Gwan-He opened this issue May 7, 2018 · 5 comments

Comments

@Gwan-He

Gwan-He commented May 7, 2018

Here is how the client program uses hiredis to connect to the Redis servers:
[image]
The client program calls redisSetTimeout and redisEnableKeepAlive to set the timeout option and activate TCP keepalive.

Here is the stack where the program hangs:
[image]

My Redis servers run in cluster mode behind twemproxy. When the problem occurred, one Redis server had shut down because of a hardware fault, and the MGET command had just been sent to that broken server.
Unlike the other hang problems reported, my client program sets the timeout and keepalive options when it connects to the Redis servers. The program is multi-threaded, but each thread uses its own exclusive hiredis context. So could this be a new problem?

Gwan-He closed this as completed May 7, 2018
Gwan-He changed the title from "hiredis client blocked and hang at redisBufferRead, with timeout option set and keepalive" to "hiredis client blocked and hang at redisBufferRead, with timeout option and keepalive option set" May 7, 2018
Gwan-He changed the title from "hiredis client blocked and hang at redisBufferRead, with timeout option and keepalive option set" to "hiredis client hang at redisBufferRead, with timeout option and keepalive option set" May 7, 2018
Gwan-He reopened this May 7, 2018
@michael-grunder
Collaborator

This might be a bit tough to test but I'll see if I can replicate the scenario you're describing. As for the threading, it should work as long as you're not attempting to share contexts across multiple threads.

@Gwan-He
Author

Gwan-He commented May 15, 2018

Thanks for the reply. I have read the hiredis code and I think the problem might be caused by reconnecting.
When my client program first connects to the proxy, it calls redisSetTimeout and redisEnableKeepAlive to set the timeout and enable TCP keepalive. After some errors occur, the client calls redisReconnect to reconnect to the proxy, and that is where the problem appears: redisReconnect does not set the timeout or keepalive again.
According to the comment on redisReconnect, the API should rebuild the connection with exactly the same configuration as the old one, but it does not seem to be implemented the way the comment describes. So is this a bug, or is hiredis meant to work this way?
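
One way to see why a reconnect could lose these settings: timeouts and keepalive are options on the socket itself, so a freshly opened socket starts without them. Below is a minimal POSIX sketch of that behaviour. It does not use hiredis, and it assumes (from my reading of net.c, not confirmed in this thread) that redisSetTimeout maps to SO_RCVTIMEO/SO_SNDTIMEO and redisEnableKeepAlive to SO_KEEPALIVE. All helper names are hypothetical.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: set a receive timeout (whole seconds) and enable
 * TCP keepalive on fd. Returns 0 on success, -1 on failure. */
int apply_socket_options(int fd, long timeout_sec) {
    struct timeval tv;
    int on = 1;

    assert(timeout_sec >= 0);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) != 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
        return -1;
    return 0;
}

/* Receive timeout currently set on fd, in seconds (0 means "block
 * forever"), or -1 on error. */
long get_recv_timeout_sec(int fd) {
    struct timeval tv;
    socklen_t len = sizeof(tv);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, &len) != 0)
        return -1;
    return (long)tv.tv_sec;
}

/* 1 if keepalive is enabled on fd, 0 if not, -1 on error. */
int get_keepalive(int fd) {
    int on = 0;
    socklen_t len = sizeof(on);

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, &len) != 0)
        return -1;
    return on != 0;
}

/* Stand-in for a reconnect: close the old socket and open a new one.
 * Nothing is copied over from the old descriptor. */
int reopen_socket(int old_fd) {
    close(old_fd);
    return socket(AF_INET, SOCK_STREAM, 0);
}
```

Calling apply_socket_options again on the descriptor returned by reopen_socket restores both settings, which mirrors the workaround discussed in this thread of setting the options again after redisReconnect.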

@hippo-dalaoshe

Have you solved this problem? I hit the same problem when I reset the IPsec tunnel between my hiredis client and the Redis server: my client program hangs until IPsec reconnects, and then the hiredis context->err is REDIS_ERR_IO. I want to detect the disconnect as soon as it happens rather than only when IPsec reconnects.

@catterer
Contributor

catterer commented Oct 1, 2021

I believe I reproduced the same issue. It seems that after a call to redisReconnect you need to set the timeout again using redisSetTimeout; otherwise the client hangs in redisBufferRead. I reproduced this on current master (2d9d775) as well as on an older version (0.13.3).

Here is what I did:
I listen for a connection but never accept it. For example, you can do it with this script:

#!/usr/bin/perl -w

use strict;
use Socket;

my $port = 5198;
my $proto = getprotobyname('tcp');

socket(SOCKET, PF_INET, SOCK_STREAM, $proto) or die "Can't open socket $!\n";
setsockopt(SOCKET, SOL_SOCKET, SO_REUSEADDR, 1) or die "Can't set socket option to SO_REUSEADDR $!\n";

# pack_sockaddr_in and INADDR_ANY are exported by the Socket module
bind(SOCKET, pack_sockaddr_in($port, INADDR_ANY))
       or die "Can't bind to port $port: $!\n";
listen(SOCKET, 5) or die "listen: $!";
sleep;

Patch to example.c to reproduce the issue:

--- a/examples/example.c
+++ b/examples/example.c
@@ -38,9 +38,24 @@ int main(int argc, char **argv) {
         }
         exit(1);
     }
+    redisSetTimeout(c, timeout);

     /* PING server */
     reply = redisCommand(c,"PING");
+    if (!reply) {
+      printf("ERROR\n");
+
+      redisReconnect(c);
+//      redisSetTimeout(c, timeout); // Uncommenting this line fixes the issue
+      reply = redisCommand(c,"PING");
+
+      // never reaches here
+      if (!reply) {
+        printf("ERROR\n");
+        exit(1);
+      }
+    }
+
     printf("PING: %s\n", reply->str);
     freeReplyObject(reply);

When you run it, it prints ERROR only once and then hangs:

> ./examples/hiredis-example localhost 5198
ERROR

backtrace:

(gdb) bt
#0  0x00007ffff7b0c84d in recv () from /lib64/libc.so.6
#1  0x000000000040b056 in redisNetRead (c=0x611010, buf=<optimized out>, bufcap=<optimized out>) at net.c:61
#2  0x00000000004037c3 in redisBufferRead (c=0x611010) at hiredis.c:951
#3  0x0000000000403b68 in redisGetReply (c=c@entry=0x611010, reply=reply@entry=0x7fffffffd5d8) at hiredis.c:1051
#4  0x0000000000403df9 in __redisBlockForReply (c=0x611010) at hiredis.c:1164
#5  redisvCommand (c=0x611010, format=<optimized out>, ap=ap@entry=0x7fffffffd5f8) at hiredis.c:1174
#6  0x0000000000403ea7 in redisCommand (c=c@entry=0x611010, format=format@entry=0x40c430 "PING") at hiredis.c:1180
#7  0x00000000004014c4 in main (argc=<optimized out>, argv=<optimized out>) at examples/example.c:50

(and yes, redisReconnect returns 0).
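
The backtrace ends in recv(), which blocks indefinitely on a socket that has no receive timeout. Here is a rough POSIX-only sketch of the same setup (a listener that never accepts), this time with SO_RCVTIMEO set, so recv() gives up instead of hanging. It does not use hiredis. The function name is hypothetical, and it assumes the loopback interface is available.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Connect to a local listener that never calls accept(), then recv() with
 * a receive timeout of timeout_ms. Returns the errno left by recv(), 0 if
 * recv() did not fail, or -1 if the setup itself failed. */
int recv_errno_from_silent_peer(long timeout_ms) {
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct timeval tv;
    char buf[16];
    ssize_t n;
    int result = -1;
    int lfd, cfd;

    assert(timeout_ms > 0);
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        goto done;

    /* Listen on 127.0.0.1 with a kernel-chosen port, but never accept. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(lfd, 5) != 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) != 0)
        goto done;

    /* The kernel completes the handshake even though nobody accepts. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        goto done;

    /* Without this option the recv() below would block forever. */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (setsockopt(cfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) != 0)
        goto done;

    n = recv(cfd, buf, sizeof(buf), 0);
    result = (n < 0) ? errno : 0;

done:
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return result;
}
```

With a timeout in place, recv() should fail with EAGAIN or EWOULDBLOCK once the interval elapses, which is the kind of error a caller can act on instead of hanging.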

@michael-grunder
Collaborator

Should be resolved via #1093
