
Deadlock in win_iocp_io_context::shutdown() when owned by thread_pool #431

Closed
glebov-andrey opened this issue Oct 11, 2019 · 3 comments

Comments

@glebov-andrey

glebov-andrey commented Oct 11, 2019

Hi.

There appears to be a bug when shutting down a win_iocp_io_context that is owned by a thread_pool, causing a deadlock.

Here is the reproducer (the std::optionals are for explicit destruction and are totally optional):

#include <optional>
#include <asio/steady_timer.hpp>
#include <asio/thread_pool.hpp>

int main() {
    auto thread_pool = std::optional<asio::thread_pool>{std::in_place, 1};
    auto timer = std::optional<asio::steady_timer>{std::in_place, *thread_pool}; // creates a win_iocp_io_context
    thread_pool->join();
    timer.reset();
    thread_pool.reset(); // DEADLOCK!
}

When win_iocp_io_context::shutdown() calls thread_->join(), it deadlocks because the worker thread is blocked in win_iocp_io_context::do_one() on a call to GetQueuedCompletionStatus:

BOOL ok = ::GetQueuedCompletionStatus(iocp_.handle,
    &bytes_transferred, &completion_key, &overlapped,
    msec < gqcs_timeout_ ? msec : gqcs_timeout_);

The only way to make GetQueuedCompletionStatus return via the public interface seems to be win_iocp_io_context::stop(). Calling it directly works around the deadlock:

// ...
asio::use_service<asio::detail::win_iocp_io_context>(*thread_pool).stop();
thread_pool.reset(); // FINE

The same bug arises with anything else that uses a win_iocp_io_context, e.g. a socket.
Perhaps shutdown() should call stop() (or something equivalent) before joining the thread?

This is reproducible in at least asio 1.13.0 and 1.14.1 (0a52abc).
The OS is Windows 10 1903, compiled with VS 2019 and clang-cl 9.0.

@helmesjo

This is sort of a showstopper. Anyone looking into it?

@ihsandemir

I am having the same problem:

Reproducer:

// register a timer
// cancel the timer
pool.stop();
pool.join();
// the pool goes out of scope and is destructed

The destructor hangs at:

clientTest_STATIC_32.exe!boost::asio::detail::win_thread::join() Line 46
	at C:\Boost\include\boost-1_72\boost\asio\detail\impl\win_thread.ipp(46)
clientTest_STATIC_32.exe!boost::asio::detail::win_iocp_io_context::shutdown() Line 136
	at C:\Boost\include\boost-1_72\boost\asio\detail\impl\win_iocp_io_context.ipp(136)
clientTest_STATIC_32.exe!boost::asio::detail::service_registry::shutdown_services() Line 44
	at C:\Boost\include\boost-1_72\boost\asio\detail\impl\service_registry.ipp(44)
clientTest_STATIC_32.exe!boost::asio::execution_context::shutdown() Line 42
	at C:\Boost\include\boost-1_72\boost\asio\impl\execution_context.ipp(42)
clientTest_STATIC_32.exe!boost::asio::execution_context::~execution_context() Line 35
	at C:\Boost\include\boost-1_72\boost\asio\impl\execution_context.ipp(35)
clientTest_STATIC_32.exe!boost::asio::thread_pool::~thread_pool() Line 61
	at C:\Boost\include\boost-1_72\boost\asio\impl\thread_pool.ipp(61)

The worker thread:
clientTest_STATIC_32.exe!boost::asio::detail::win_iocp_io_context::do_one(unsigned long msec, boost::system::error_code & ec) Line 420
	at C:\Boost\include\boost-1_72\boost\asio\detail\impl\win_iocp_io_context.ipp(420)
clientTest_STATIC_32.exe!boost::asio::detail::win_iocp_io_context::run(boost::system::error_code & ec) Line 202
	at C:\Boost\include\boost-1_72\boost\asio\detail\impl\win_iocp_io_context.ipp(202)
clientTest_STATIC_32.exe!boost::asio::detail::win_iocp_io_context::thread_function::operator()() Line 48
	at C:\Boost\include\boost-1_72\boost\asio\detail\impl\win_iocp_io_context.ipp(48)
clientTest_STATIC_32.exe!boost::asio::detail::win_thread::func<boost::asio::detail::win_iocp_io_context::thread_function>::run() Line 123
	at C:\Boost\include\boost-1_72\boost\asio\detail\win_thread.hpp(123)
clientTest_STATIC_32.exe!boost::asio::detail::win_thread_function(void * arg) Line 119
	at C:\Boost\include\boost-1_72\boost\asio\detail\impl\win_thread.ipp(119)

ihsandemir added a commit to ihsandemir/hazelcast-cpp-client that referenced this issue Apr 17, 2020

Corrected the time string print utility.

Replaced some of the time usages with std::chrono related ones.

Made the timer cancel operations non-exception throwing.

Replaced the system_clock usages with steady_clock usages.

Warning fixes for windows for unused exceptions.
ihsandemir added a commit to hazelcast/hazelcast-cpp-client that referenced this issue Apr 20, 2020
…romise, std::nested_exception, connection local invocation and event handler maps (#584)

* Changed the event processing executor usage to use asio::thread_pool.

Changed the ResponseThread to ResponseProcessor and it now processes the response message directly by default and will use a thread_pool if the newly introduced parameter `hazelcast.client.response.executor.thread.count` is set to a positive value.

Removed custom Thread, Future implementations and used std::thread, boost::asio::thread_pool and boost::future/boost::promise.

boost::future supports continuation (`.then` support).

Changed the invocation and event handler map design. An `unordered_map` is now only accessed from the connection's own `io` thread, which eliminates the need for synchronized maps for invocations and events. If any other thread needs to access the invocations, it passes the request to the io thread; this is only needed when closing the socket to notify the outstanding invocations. If an invocation cannot be queued into the connection (i.e. no connection exists for the invocation and it is being retried), then it is registered in a synchronized map at the invocation service only for the first retry and cleared from the service when a response is finally received. This ensures that the user thread is always notified.

The exception architecture is completely changed to use std::exception_ptr capabilities. Also integrated std::nested_exception for exceptions that have a cause; the cause is obtained via `std::rethrow_if_nested`.

The project now depends on boost::thread for boost::future. Changed cmake so that installing Boost is the user's responsibility. Also using the latest Boost.Thread library version, which is version 5.

* Linux build fixes.

* Fixed the examples per changes in the APIs.

* linux test fix.

* Removed custom implementations of `CountDownLatch, Mutex, LockGuard, ConditionVariable`. Replaced them with standard and boost based implementations.

Changed required boost version to 1.72.

* Fix to `IExecutorService`.

* Fix to code examples.

Removed unneeded include for windows (`WinSock2.h`) which causes error in compilation.

* Test fix (count down latch count was incorrect).

Windows WinSock2.h usage for easylogging to work with boost.

* test fixes.

* Eliminated the extra invocations registration at the invocation service.

* Removed all invocation and event handler registrations in listener and invocation service.

Changed the CallIdSequence to use std::array of atomic integers.

* Added the `boost::` namespace qualifier to all implicit boost usages, which was needed to compile successfully on Windows (otherwise an `ambiguous symbol` error occurs for names that exist in both the std and boost namespaces).

* Fixed the `boost::latch` usages in the examples.

* test fix for wait time.

* Removed unneeded AtomicArray class.

* fix for issue chriskohlhoff/asio#431

Corrected the time string print utility.

Replaced some of the time usages with std::chrono related ones.

Made the timer cancel operations non-exception throwing.

Replaced the system_clock usages with steady_clock usages.

Warning fixes for windows for unused exceptions.

* Added `hz_thread_pool` for graceful shutdown of thread pools.

Some fixes so that exceptions do not abort the io threads. Also, the timers and pools are destroyed upon shutdown so that no associated resources are leaked (which may happen on Windows).
@chriskohlhoff
Owner

Addressed in asio version 1.16.1.

ihsandemir added a commit to ihsandemir/hazelcast-cpp-client that referenced this issue Jan 8, 2021
…ual client destruction. Failing to do this causes illegal memory access during destruction of strands in the listener_service_impl.

Removed the work around of bug chriskohlhoff/asio#431.

Added correct handling of the invocation retry timer. If it is not assigned to a timer variable in the invocation then it will actually be destroyed immediately, which causes the timer to be cancelled immediately on destruction, and this is not what we want.
ihsandemir added a commit to hazelcast/hazelcast-cpp-client that referenced this issue Jan 8, 2021
…or issue chriskohlhoff/asio#431 (#751)

* Removed destruction of the thread_pool on shutdown and left it to actual client destruction. Failing to do this causes illegal memory access during destruction of strands in the listener_service_impl.

Removed the work around of bug chriskohlhoff/asio#431.

Added correct handling of the invocation retry timer. If it is not assigned to a timer variable in the invocation then it will actually be destroyed immediately, which causes the timer to be cancelled immediately on destruction, and this is not what we want.

* Double check client shutdown on connection opening.