Exception boost::lock_error thrown from shutdown method #838
Comments
Is there a sequence of steps which demonstrates the issue reliably? And what ROS version are you having the problem on? The topic manager is being shut down at this location: ros_comm/clients/roscpp/src/libros/init.cpp Line 596 in 8741260
Nominally we're on Indigo, but with ros_comm 1.12.2 (Kinetic) and an older rqt (from prior to the change that broke rqt_rviz). Our crash reporter shows these primarily coming from rqt instances, but we also see them from gazebo (running various ROS-connected plugins), and even from rosserial_server (e.g., when embedded in a process with other ROS code). Here is another rqt example:
What kind of stack trace would be more helpful to see? I might be able to get a reproduction together for this, but I suspect it would be as simple as instantiating NodeHandle instances on a bunch of different threads and then exiting the process. EDIT: Okay, it's not quite that easy. I've built the proposed trivial example and it seems to exit gracefully each time, even with many threads. I shall continue to experiment and see if I can get an MWE together. Looks like this may be related to #318, where a user reports similar issues with lock_error instances thrown during cleanup.
Thanks for looking into it. If we have a case which reproducibly crashes (even if only with a certain probability), I am happy to help.
Hi @dirk-thomas and @mikepurvis, I've run into a similar boost::lock_error as reported here and in #318. I encountered this error when moving from Indigo to Kinetic and have done some debugging on it. My minimal reproducible example can be cloned from here. I have a singleton class that has static storage but stores a reference to a node handle. In that singleton class there is a Subscriber and/or ServiceServer member that, when advertised, will cause the exception to be thrown at exit. I believe that the reason why this happens is that the *Manager singletons (I've looked at Topic and Service but likely all are the same) are held as static In my case the user-declared Singleton will then be destructed, as it has static storage, causing the I don't know what the best fix for this is. I think that the problem could be avoided by having every node handle have a private
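The destruction-order problem described in this comment can be illustrated without ROS. The following is only a minimal sketch under stated assumptions: `Manager`, `Client`, and `run_teardown` are hypothetical stand-ins (not roscpp classes), and shared ownership is shown as one common way to keep a dependency alive until its last user is gone, not necessarily the fix proposed in this thread.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a *Manager singleton; not the roscpp class.
class Manager {
public:
  explicit Manager(std::vector<std::string>& log) : log_(log) {}
  ~Manager() { log_.push_back("manager destroyed"); }

  // Takes the manager's mutex, as a real unsubscribe/shutdown path would.
  void unsubscribe() {
    std::lock_guard<std::mutex> lock(mutex_);
    log_.push_back("unsubscribe");
  }

private:
  std::mutex mutex_;
  std::vector<std::string>& log_;
};

// Hypothetical stand-in for a user object that owns a Subscriber.
// It co-owns the manager, so the manager cannot be destroyed before it.
class Client {
public:
  explicit Client(std::shared_ptr<Manager> manager)
      : manager_(std::move(manager)) {}
  ~Client() { manager_->unsubscribe(); }

private:
  std::shared_ptr<Manager> manager_;
};

// Simulates teardown where the "global" manager reference is dropped
// before the client is destroyed, and returns the order of events.
std::vector<std::string> run_teardown() {
  std::vector<std::string> log;
  std::shared_ptr<Manager> manager = std::make_shared<Manager>(log);
  std::unique_ptr<Client> client(new Client(manager));
  manager.reset();  // the global reference goes away first
  client.reset();   // the client still finds a live manager and mutex
  return log;
}
```

Because `Client` holds its own `std::shared_ptr`, the manager and its mutex outlive the last call made on them, whichever reference is released first.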
Wow, thanks for digging into this! Your proposed fix sounds reasonable to me.
Is there a workaround for this? I am coming across this issue when
The solution that @scott-eddy proposed works, but it introduces an ABI break. The change in #1630 avoids this.
There is a similar problem with a Timer in topic_tools/relay. This is why relay_stealth.test crashes with an exception.
We see crashes like this as multi-threaded processes like rqt_gui shut down:
This is the code:
It looks like the expected behaviour is to block until the mutex can be acquired, so this may be an issue with one thread tearing down the TopicManager and then another thread finding a mutex which is no longer valid. Maybe there's a better solution here than just wrapping a try block around the whole thing?
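As one possible alternative to a blanket try block, a caller can check whether the manager is still alive before touching its mutex. This is only a sketch of that idea using standard C++: `Manager` and `shutdown_if_alive` are hypothetical names, not roscpp API, and it assumes the manager is owned through a `std::shared_ptr` that other threads observe via `std::weak_ptr`.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for TopicManager; not the roscpp class.
class Manager {
public:
  void shutdown() {
    std::lock_guard<std::mutex> lock(mutex_);
    ++shutdown_calls_;
  }
  int shutdownCalls() const { return shutdown_calls_; }

private:
  std::mutex mutex_;
  int shutdown_calls_ = 0;
};

// Calls shutdown() only if the manager still exists.
// Returns true if the call was made, false if the manager was already gone.
bool shutdown_if_alive(const std::weak_ptr<Manager>& weak) {
  // lock() yields a temporary owning pointer, so the manager (and its
  // mutex) cannot be destroyed while shutdown() is running.
  if (std::shared_ptr<Manager> strong = weak.lock()) {
    strong->shutdown();
    return true;
  }
  return false;
}
```

With this shape, a thread that arrives after teardown skips the call instead of locking a mutex that no longer exists.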