Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redisCommand not blocking when expected #507

Closed
Southclaws opened this issue Feb 14, 2017 · 8 comments
Closed

redisCommand not blocking when expected #507

Southclaws opened this issue Feb 14, 2017 · 8 comments
Assignees

Comments

@Southclaws
Copy link

Southclaws commented Feb 14, 2017

I have a thread (thread owns its own redisContext*, can't use event libs because reasons) and that thread is in a while loop with a redisCommand calling BLPOP.

Now I expected this thread to just wait there for a LPUSH, grab the data, do my thing and go back to waiting. It pops one value then context->errstr complains about a "bad file descriptor".

So I'm guessing somehow the context has lost the fd at some point while blocking but I can't figure out if this is a bug with hiredis or something I'm missing. I've looked at pretty much every example available. Does the context need a keepalive mechanism or something along those lines? I had a quick look at #168 but my connected_clients doesn't appear to be going down so it seems my threaded context is still connected...

Here's the wait loop code https://github.com/Southclaws/samp-redis/blob/master/src/impl.cpp#L432-L451

Southclaws added a commit to Southclaws/pawn-redis that referenced this issue Feb 14, 2017
@badboy
Copy link
Contributor

badboy commented Feb 14, 2017

Did you try with enabling keepalive? It might very well that the server closes the connection.
You should also check the server's log and the server's tcp-keepalive setting.

@Southclaws
Copy link
Author

It's the default tcp-keepalive 300 I should add that this problem I'm having happens within seconds of connecting (during unit tests). The redisCommand call blocks at first, pops the first element and then on the next iteration, returns a NULL reply.

@badboy
Copy link
Contributor

badboy commented Feb 14, 2017

That sounds weird indeed. I don't have time this week to look into it, but I put it on my list for the following week

@badboy badboy self-assigned this Feb 14, 2017
@Southclaws
Copy link
Author

Southclaws commented Feb 14, 2017

Alright, thanks! I'm going to do some investigation too - I'm going to implement an auto-reconnect into my application anyway so with some more logging and pushing it onto my test server I should have some more information on why this is happening. Strangely I can't reproduce it consistently either, which is annoying...

@michael-grunder
Copy link
Collaborator

Hi,

I haven't done any real C++ work in years, but this is almost certainly a memory corruption bug due to threading. I wrote a very simple C program to run BLPOP in a worker thread and hiredis is just fine with it:
https://gist.github.com/michael-grunder/edb0a438a6e923c3cf375038be73c7c9

If you place a continue statement above this line here, does the bug go away?:
https://github.com/Southclaws/samp-redis/blob/master/src/impl.cpp#L460

The fact that it works sometimes, and other times fails makes me almost certain that this is a threading issue.

Cheers!
Mike

@badboy
Copy link
Contributor

badboy commented Feb 14, 2017

While I am always the first one pointing out issues in multi-threaded programs, I don't see a problem with the current usage.
The context is created and used in a single function and thus not across threads. processMessages iterates over the returned elements, processMessage copies out the string.
None of the hiredis values is moved across thread boundaries as far as I can see.

It might still be possible for further memory corruption, but a closed socket would probably cause similar problems.

@Southclaws
Copy link
Author

Southclaws commented Feb 19, 2017

Just in case you were still thinking about this issue, I've solved it now. You'll be pleased to know that Hiredis is fine! It was some memory corruption on the host app caused by some dodgy SQLite library (the main reason for my implementing Hiredis was to get the SQL code out of the host and into a Go app to do it concurrently via a message queue anyway).

Thanks for the replies on the topic anyway, apologies for wasting an issue :)

@badboy
Copy link
Contributor

badboy commented Feb 19, 2017

:) Thanks for reporting back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants