Deadlock between two receive threads when Netconf server crashes #200
Comments
Hi, thanks for looking into the problem. I tested with the latest master code. There is still a deadlock between the notification thread and the send/receive thread because of two locks, mut_ntf and mut_session.
The notification thread holds mut_ntf and is waiting for mut_session:
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
The send/receive thread holds mut_session and is waiting for mut_ntf:
(gdb) bt
Regards,
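The situation described in this comment, where one thread takes mut_ntf and then mut_session while the other takes them in the opposite order, is a lock-order inversion. A common remedy is to make every code path acquire the two locks in the same order. The following is only a minimal sketch of that idea: the mutexes reuse the names from this thread as stand-ins, the helper names (`lock_both`, `unlock_both`, `run_lock_order_demo`) are hypothetical, and none of this is libnetconf's actual code or the fix that was applied.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks named in this thread;
 * this is NOT libnetconf's actual code. */
static pthread_mutex_t mut_session = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mut_ntf = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

#define ITERATIONS 100000

/* Every path takes mut_session first and mut_ntf second, so no thread
 * can hold one lock while waiting on a thread that holds the other. */
static void lock_both(void)
{
    pthread_mutex_lock(&mut_session);
    pthread_mutex_lock(&mut_ntf);
}

/* Release in the reverse order of acquisition. */
static void unlock_both(void)
{
    pthread_mutex_unlock(&mut_ntf);
    pthread_mutex_unlock(&mut_session);
}

/* Stands in for both the notification and the send/receive thread. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        lock_both();
        shared_counter++;
        unlock_both();
    }
    return NULL;
}

/* Runs two threads over the same pair of locks and returns the final
 * counter value once both have finished. */
long run_lock_order_demo(void)
{
    pthread_t a, b;
    int rc;

    shared_counter = 0;
    rc = pthread_create(&a, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

If one of the workers instead locked mut_ntf before mut_session, the two threads could each end up holding one lock and waiting for the other, which matches the hang reported here.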
Hi, there is now a separate branch called
Hi, I tested the code from the latest deadlockfix branch and the issue is resolved. Regards,
OK, I'll wait for a response in #199, and if the fix doesn't break it, I'll merge it into master.
Thanks. Will there be a new release from master any time soon after the deadlock fix is merged? Regards,
What do you mean by "Release"?
Thanks. By "Release" I meant a release branch like 0.9.0, 0.10.0, etc.
By that meaning, the master branch is actually
Thank you for the information.
Hi,
I have three threads in my Netconf client program. Two threads send and receive Netconf requests. The third is a notification thread that receives notifications.
When the Netconf server crashes, the notification thread exits as expected (because of the fix for issue #193, "Notification thread never exits on netconf server crash").
However, one of the receive threads detects the server failure, attempts to call nc_session_close, and gets blocked in ncntf_dispatch_stop.
(gdb) bt
#0 __lll_lock_wait ()
#1 0x00007fddde4174d4 in _L_lock_952 ()
from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007fddde417336 in __GI___pthread_mutex_lock (mutex=0x12f4798)
#3 0x00007fddde8511e8 in ncntf_dispatch_stop () from /usr/lib/libnetconf.so.0
#4 0x00007fddde847598 in nc_session_close () from /usr/lib/libnetconf.so.0
#5 0x00007fddde84792e in nc_session_send.isra.4.part ()
from /usr/lib/libnetconf.so.0
#6 0x00007fddde84651b in nc_session_send_reply ()
from /usr/lib/libnetconf.so.0
#7 0x00007fddde846fb1 in nc_session_recv_reply ()
from /usr/lib/libnetconf.so.0
#8 0x00007fddde849cc3 in nc_session_send_recv () from /usr/lib/libnetconf.so.0
The other thread is also blocked waiting for a lock:
(gdb) bt
#0 __lll_lock_wait ()
#1 0x00007fddde4174d4 in _L_lock_952 ()
from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007fddde417336 in __GI___pthread_mutex_lock (mutex=0x12f46f8)
#3 0x00007fddde847eef in nc_session_send_rpc () from /usr/lib/libnetconf.so.0
#4 0x00007fddde849c2b in nc_session_send_recv () from /usr/lib/libnetconf.so.0
Based on the code flow, if one of the other two threads (rather than the notification thread) detects the failure and initiates nc_session_close, all three threads would deadlock, because that thread would have acquired the lock and then blocked in ncntf_dispatch_stop.
I guess we may have to set session->ntf_active to 0 (maybe in nc_session_close) to avoid this issue.
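To illustrate the idea of clearing an "active" flag so that a dispatch loop can exit before the closing thread waits for it, here is a rough sketch. Everything in it is hypothetical: `struct demo_session`, `demo_dispatch_start` and `demo_session_close` are made-up names for illustration, only the field name ntf_active is borrowed from the suggestion above, and this is not how libnetconf implements nc_session_close or ncntf_dispatch_stop.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical session object; NOT libnetconf's struct nc_session. */
struct demo_session {
    atomic_int ntf_active;   /* 1 while the dispatch loop should keep running */
    atomic_long polls;       /* number of loop iterations, for illustration */
    pthread_t dispatcher;    /* the notification dispatch thread */
};

/* Dispatch loop: re-checks the flag on every iteration and never holds
 * a lock while doing so, so a closing thread can always stop it. */
static void *dispatch_loop(void *arg)
{
    struct demo_session *s = arg;

    while (atomic_load(&s->ntf_active)) {
        atomic_fetch_add(&s->polls, 1);
        sched_yield(); /* placeholder for "wait for the next notification" */
    }
    return NULL;
}

/* Marks the session as active and starts the dispatch thread.
 * Returns 0 on success, like pthread_create. */
int demo_dispatch_start(struct demo_session *s)
{
    assert(s != NULL);
    atomic_init(&s->ntf_active, 1);
    atomic_init(&s->polls, 0);
    return pthread_create(&s->dispatcher, NULL, dispatch_loop, s);
}

/* Clears the flag first, then waits for the dispatch thread to leave
 * its loop. Returns 0 on success, like pthread_join. */
int demo_session_close(struct demo_session *s)
{
    assert(s != NULL);
    atomic_store(&s->ntf_active, 0);
    return pthread_join(s->dispatcher, NULL);
}
```

The point of the sketch is the ordering inside `demo_session_close`: the flag is cleared before the join, and the closing thread does not hold a lock the dispatch thread needs while it waits.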
Can you please look into this problem and provide a solution?
Regards,
Parameswaran