[C++] AckGroupingTrackerEnabled may cause segmentation fault #8914

Closed
BewareMyPower opened this issue Dec 11, 2020 · 1 comment
Labels: lifecycle/stale, type/bug

@BewareMyPower
Contributor

Describe the bug
The program sometimes crashes in AckGroupingTrackerEnabled#scheduleTimer. #8519 tried to solve the problem by extending the lifetime of AckGroupingTrackerEnabled so that the timer callback would not access a dangling this pointer, but the segmentation fault still happens.
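
To make the lifetime problem concrete, here is a minimal sketch that uses only the standard library. It is not the Pulsar client code: FakeExecutor, Tracker, and runDemo are hypothetical names, and the queue of std::function objects merely stands in for an asio io_context. Note also that #8519 is described above as extending the object's lifetime; the std::weak_ptr guard shown here is a related but different technique, in which a callback that fires after its owner is gone becomes a no-op instead of dereferencing freed memory.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for an event loop: callbacks are queued and run later.
struct FakeExecutor {
    std::queue<std::function<void()>> pending;

    void post(std::function<void()> f) { pending.push(std::move(f)); }

    void runAll() {
        while (!pending.empty()) {
            auto f = std::move(pending.front());
            pending.pop();
            f();
        }
    }
};

// Hypothetical tracker that schedules a deferred flush on the executor.
class Tracker : public std::enable_shared_from_this<Tracker> {
   public:
    Tracker(FakeExecutor& executor, int& flushCount)
        : executor_(executor), flushCount_(flushCount) {}

    void scheduleFlush() {
        // Capture a weak_ptr instead of a raw `this`.
        std::weak_ptr<Tracker> weakSelf = shared_from_this();
        executor_.post([weakSelf] {
            auto self = weakSelf.lock();
            if (!self) {
                return;  // Owner already destroyed: skip instead of crashing.
            }
            self->flush();
        });
    }

   private:
    void flush() { ++flushCount_; }

    FakeExecutor& executor_;
    int& flushCount_;
};

// Returns the number of flushes that actually ran.
int runDemo() {
    FakeExecutor executor;
    int flushCount = 0;

    auto tracker = std::make_shared<Tracker>(executor, flushCount);
    tracker->scheduleFlush();
    executor.runAll();  // Tracker is alive, so the flush runs.

    tracker->scheduleFlush();
    tracker.reset();    // Tracker is destroyed before the callback fires.
    executor.runAll();  // The callback detects the expired weak_ptr and returns.

    return flushCount;
}
```

Calling runDemo() exercises both paths: one callback that runs while the tracker is alive and one that fires after the tracker has been destroyed. Capturing shared_from_this() directly instead would keep the tracker alive until the callback runs, which is closer to the lifetime-extension approach.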

A typical stack trace is:

 #6 <signal handler called>
 #7 0x00007f5aad920b60 in ?? ()
 #8 0x00007f6e9ee7d1bb in boost::asio::detail::wait_handler<pulsar::AckGroupingTrackerEnabled::scheduleTimer()::{lambda(boost::system::error_code const&)#1}>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) ()
 from /opt/vertica/verticadb/v_verticadb_node0003_catalog/Libraries/0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c/PulsarSourceLib_0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c.so
 #9 0x00007f6e9edd78d3 in boost::asio::detail::scheduler::run(boost::system::error_code&) ()
 from /opt/vertica/verticadb/v_verticadb_node0003_catalog/Libraries/0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c/PulsarSourceLib_0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c.so
 #10 0x00007f6e9edd4aa6 in pulsar::ExecutorService::startWorker(std::shared_ptr<boost::asio::io_context>) ()
 from /opt/vertica/verticadb/v_verticadb_node0003_catalog/Libraries/0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c/PulsarSourceLib_0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c.so
 #11 0x00007f6e9edd9c82 in std::thread::_Impl<std::_Bind_simple<std::_Bind<std::_Mem_fn<void (pulsar::ExecutorService::*)(std::shared_ptr<boost::asio::io_context>)> (pulsar::ExecutorService*, std::shared_ptr<boost::asio::io_context>)> ()> >::_M_run() ()
 from /opt/vertica/verticadb/v_verticadb_node0003_catalog/Libraries/0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c/PulsarSourceLib_0213a49612da8c2ad8b19aa3bd77ddec00a000000002090c.so
 #12 0x00007f6fcb5d2070 in ?? () from /lib64/libstdc++.so.6
 #13 0x00007f6fcb006dd5 in start_thread () from /lib64/libpthread.so.0
 #14 0x00007f6fca923ead in clone () from /lib64/libc.so.6

To Reproduce
It cannot be reproduced easily. In the affected environment, a single Client is long-lived, and many Readers are periodically created and used to read some messages.

Expected behavior
The segmentation fault should not happen.

Additional context
One solution that may work is to refactor the timer design. Currently, the deadline timer is recreated in the callback each time it fires, and there is no state check like the one used with PartitionedConsumerImpl::partitionsUpdateTimer_:

void PartitionedConsumerImpl::runPartitionUpdateTask() {
    partitionsUpdateTimer_->expires_from_now(partitionsUpdateInterval_);
    partitionsUpdateTimer_->async_wait(
        std::bind(&PartitionedConsumerImpl::getPartitionMetadata, shared_from_this()));
}

void PartitionedConsumerImpl::getPartitionMetadata() {
    using namespace std::placeholders;
    lookupServicePtr_->getPartitionMetadataAsync(topicName_)
        .addListener(std::bind(&PartitionedConsumerImpl::handleGetPartitions, shared_from_this(), _1, _2));
}

void PartitionedConsumerImpl::handleGetPartitions(Result result,
                                                  const LookupDataResultPtr& lookupDataResult) {
    Lock stateLock(mutex_);
    if (state_ != Ready) {
        // NOTE: when consumer is not ready, the runPartitionUpdateTask won't be scheduled
        return;
    }
    /* do the real work... */
    runPartitionUpdateTask();
}
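
The state-check idea in the code above can be isolated from asio in a small, standard-library-only sketch. Everything here is hypothetical and for illustration only: ManualLoop stands in for the timer and event loop, PeriodicTask for a class such as AckGroupingTrackerEnabled, and runsBeforeClose is a demo helper. The point is that the callback re-arms itself only while the state is Ready, so closing the object stops the cycle after at most one more callback.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical stand-in for a timer: each runOne() call "fires" one pending callback.
class ManualLoop {
   public:
    void post(std::function<void()> f) { pending_.push(std::move(f)); }

    bool runOne() {
        if (pending_.empty()) {
            return false;
        }
        auto f = std::move(pending_.front());
        pending_.pop();
        f();
        return true;
    }

   private:
    std::queue<std::function<void()>> pending_;
};

// Hypothetical periodic task that re-arms itself only while it is Ready.
class PeriodicTask : public std::enable_shared_from_this<PeriodicTask> {
   public:
    explicit PeriodicTask(ManualLoop& loop) : loop_(loop) {}

    void start() { schedule(); }

    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        state_ = State::Closed;
    }

    int runs() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return runs_;
    }

   private:
    enum class State { Ready, Closed };

    void schedule() {
        // The callback holds a shared_ptr, so the task outlives the pending callback.
        auto self = shared_from_this();
        loop_.post([self] { self->onTimer(); });
    }

    void onTimer() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (state_ != State::Ready) {
                return;  // Closed: do the state check first and do not re-arm.
            }
            ++runs_;  // Placeholder for the real periodic work.
        }
        schedule();
    }

    ManualLoop& loop_;
    mutable std::mutex mutex_;
    State state_ = State::Ready;
    int runs_ = 0;
};

// Fires the timer `ticks` times, closes the task, then drains the loop.
int runsBeforeClose(int ticks) {
    ManualLoop loop;
    auto task = std::make_shared<PeriodicTask>(loop);
    task->start();
    for (int i = 0; i < ticks; ++i) {
        loop.runOne();
    }
    task->close();
    // The one remaining callback sees Closed and does not re-arm, so this terminates.
    while (loop.runOne()) {
    }
    return task->runs();
}
```

For example, runsBeforeClose(3) performs the periodic work three times; the fourth callback, which fires after close(), returns without doing work or scheduling another run.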

However, we still need a detailed explanation for the stack trace shown above.

@tisonkun
Member

tisonkun commented Dec 9, 2022

Closed as stale. The development of the C++ client has been permanently moved to http://github.com/apache/pulsar-client-cpp. Please open an issue there if it's still relevant.

@tisonkun closed this as not planned on Dec 9, 2022