Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exception handling for mutex or condition variable failures #268

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

meyerj
Copy link
Member

@meyerj meyerj commented Jul 21, 2018

RTT - or more specifically the RTT::os::Mutex, RTT::os::MutexRecursive and RTT::os::Condition implementations - were not checking for errors returned by the underlying OS system calls, which I think is a serious issue. As a consequence errors in using those primitives within RTT or in user code remain undetected which may lead to unpredictable and hard to debug behavior of Orocos applications.

This pull request introduces exception handling in the Mutex, MutexRecursive and Condition implementations. In order to avoid ambiguity if the return value of a OS abstraction method (rtos_*()) represents an error or not they now return negative values in the error case consistently and the absolute value holds the error code.

I added a unit test (as part of the core_test) for the single thread behavior of Mutex and MutexRecursive. The test currently fails for the xenomai target, which might reveal a problem in the implementation of rtos_mutex_unlock():

Entering test suite "OsTestSuite"
Entering test case "testMutex"
/opt/orocos/xenomai2/src/orocos_toolchain/rtt/tests/os_test.cpp(43): info: check m.trylock() passed
/opt/orocos/xenomai2/src/orocos_toolchain/rtt/tests/os_test.cpp(47): info: check lock.isSuccessful() passed
/opt/orocos/xenomai2/src/orocos_toolchain/rtt/tests/os_test.cpp(53): info: check lock2.isSuccessful() passed
unknown location(0): fatal error in "testMutex": std::runtime_error: Failed to unlock an instance of RTT::os::Mutex: Operation not permitted
/opt/orocos/xenomai2/src/orocos_toolchain/rtt/tests/os_test.cpp(53): last checkpoint
Leaving test case "testMutex"; testing time: 167mks
Entering test case "testMutexRecursive"
/opt/orocos/xenomai2/src/orocos_toolchain/rtt/tests/os_test.cpp(97): info: check m.trylock() passed
/opt/orocos/xenomai2/src/orocos_toolchain/rtt/tests/os_test.cpp(98): info: check m.trylock() passed
unknown location(0): fatal error in "testMutexRecursive": std::runtime_error: Failed to unlock an instance of RTT::os::MutexRecursive: Operation not permitted
/opt/orocos/xenomai2/src/orocos_toolchain/rtt/tests/os_test.cpp(98): last checkpoint
Leaving test case "testMutexRecursive"; testing time: 107mks
Leaving test suite "OsTestSuite"

Futhermore the IPC tests fail with this patch because the CORBA transport implementations tries to lock RTT mutexes from ORB threads, which is not allowed in Xenomai and triggers an exception now.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
…ations

There are several things that could go wrong when trying to initialize, lock,
unlock, signal or wait on pthread mutexes and condition variables.

RTT should check the return value of low-level OS abstraction functions and
throw an exception (or at least assert) in case of an unexpected error.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
Xenomai mutexes are always recursive.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
pthread mutexes in normal mode do not provide deadlock detection. It would be possible to the mutex
type to PTHREAD_MUTEX_ERRORCHECK, which would trigger the EDEADLK exception, but it comes with a
performance penalty.

Might be a candidate for a compile-time option.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
…y us

If a Mutex or MutexResursive instance is still locked by another thread upon destruction,
the handle becomes dangling.

Signed-off-by: Johannes Meyer <johannes@intermodalics.eu>
@francisco-miguel-almeida francisco-miguel-almeida added this to the 2.10 milestone Apr 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants