Transport segfault on shutdown #2

Closed
osrf-migration opened this issue Oct 8, 2018 · 7 comments
Labels
bug Something isn't working

Comments

@osrf-migration

Original report (archived issue) by Louise Poubel (Bitbucket: chapulina, GitHub: chapulina).


I observed that at times ign-gazebo crashes on shutdown. It doesn't happen every time. The backtrace points to ign-transport's WorkerPool:

#0  0x00007ffff7bb7d34 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555555be1de0) at /usr/include/c++/8/ext/atomicity.h:69
#1  0x00007ffff4c06546 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffc406e3f8, __in_chrg=<optimized out>)
    at /usr/include/c++/8/bits/shared_ptr_base.h:1151
#2  std::__shared_ptr<ignition::transport::v6::ISubscriptionHandler, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffc406e3f0, __in_chrg=<optimized out>)
    at /usr/include/c++/8/bits/shared_ptr_base.h:1151
#3  std::shared_ptr<ignition::transport::v6::ISubscriptionHandler>::~shared_ptr (
    this=0x7fffc406e3f0, __in_chrg=<optimized out>)
    at /usr/include/c++/8/bits/shared_ptr.h:103
#4  ignition::transport::v6::Node::Publisher::<lambda()>::~<lambda> (
    this=0x7fffc406e3f0, __in_chrg=<optimized out>)
    at /home/developer/ign-transport/src/Node.cc:337
#5  std::_Function_base::_Base_manager<ignition::transport::v6::Node::Publisher::Publish(const ProtoMsg&)::<lambda()> >::_M_destroy (__victim=...)
    at /usr/include/c++/8/bits/std_function.h:188
#6  std::_Function_base::_Base_manager<ignition::transport::v6::Node::Publisher::Publish(const ProtoMsg&)::<lambda()> >::_M_manager(std::_Any_data &, const std::_Any_data &, std::_Manager_operation) (__dest=..., __source=..., __op=<optimized out>)
    at /usr/include/c++/8/bits/std_function.h:212
#7  0x00007ffff4c3660a in std::_Function_base::~_Function_base (
    this=0x7fffe05b4970, __in_chrg=<optimized out>)
    at /usr/include/c++/8/bits/std_function.h:257
#8  std::function<void ()>::~function() (this=0x7fffe05b4970, 
    __in_chrg=<optimized out>) at /usr/include/c++/8/bits/std_function.h:370
#9  std::function<void ()>::operator=(std::function<void ()>&&) (__x=..., 
    this=0x7fffe05b4990) at /usr/include/c++/8/bits/std_function.h:481
#10 ignition::transport::v6::WorkOrder::operator= (this=0x7fffe05b4990)
    at /home/developer/ign-transport/src/WorkerPool.cc:35
#11 ignition::transport::v6::WorkerPoolPrivate::Worker() ()
    at /home/developer/ign-transport/src/WorkerPool.cc:98
#12 0x00007ffff551a733 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#13 0x00007ffff3fc16db in start_thread (arg=0x7fffe05b5700) at pthread_create.c:463
#14 0x00007ffff4f7488f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


I can only make this happen when the GUI is running. Do you see the same?

@osrf-migration

Original comment by Louise Poubel (Bitbucket: chapulina, GitHub: chapulina).


Yes, I can't reproduce it without the GUI. I also tried running ign topic -e -t /world/default/stats in a separate terminal as I shut down ign-gazebo and couldn't get a transport segfault.

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


  • set assignee_account_id to "557058:4ded1ddf-947e-4154-bbd1-3dba24f1bdbd"
  • set assignee to "caguero (Bitbucket: caguero, GitHub: caguero)"

@caguero, do you think this could happen if a subscription callback is deleted while an incoming message is being processed by ign-transport?
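To make the scenario in this question concrete, here is a minimal, single-threaded sketch of a queued job whose target is removed before the job runs. Everything in it (`Subscriber`, `DemoWeakCapture`, the plain vector used as a queue) is hypothetical and is not ign-transport code. It captures a `std::weak_ptr` so the job can detect that the handler is gone. This differs from the lambda in Node.cc, which captures a `std::shared_ptr` by value.

```cpp
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for a subscription handler (not an ign-transport type).
struct Subscriber
{
  int calls = 0;
  void Run() { ++calls; }
};

// Queues one job, drops the handler before the job runs, then runs the job.
// Returns how many jobs were skipped because their handler no longer existed.
int DemoWeakCapture()
{
  auto handler = std::make_shared<Subscriber>();
  std::weak_ptr<Subscriber> weak = handler;

  std::vector<std::function<void()>> queue;
  int skipped = 0;

  // The job holds only a weak reference, so it does not keep the handler alive.
  queue.push_back([weak, &skipped]()
  {
    if (auto locked = weak.lock())
      locked->Run();
    else
      ++skipped;
  });

  // Simulate the callback being removed while work is still queued.
  handler.reset();

  for (auto &job : queue)
    job();

  return skipped;
}
```

With a weak capture, a job that outlives its handler becomes a no-op instead of touching a destroyed object. Whether that trade-off suits ign-transport is a separate question that this sketch does not answer.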

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


So far, I have found that the segfault happens with the WorldControl and WorldStats plugins but not with the Scene3D plugin.

@osrf-migration

Original comment by Carlos Agüero (Bitbucket: caguero, GitHub: caguero).


I spent some time looking into this, and the problem seems to be in this lambda at Node.cc:336:

this->dataPtr->shared->dataPtr->workerPool.AddWork(
    [localHandler, msgCopy, info] ()
    {
      try
      {
        localHandler->RunLocalCallback(*msgCopy, *info);
      }
      catch (...)
      {
        std::cerr << "Exception occurred in a local callback "
          << "on topic [" << info->Topic() << "] with message ["
          << msgCopy->DebugString() << "]" << std::endl;
      }
    });

In particular, the localHandler shared pointer looks suspect. My hypothesis is that while the NodeShared destructor runs, the thread pool still has pending jobs containing this lambda, and those jobs cause the segfault.
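One way to picture the hypothesis, and one possible way to guard against it, is a pool whose destructor stops accepting work, takes the pending jobs out of the queue, and joins every worker before any captured state is released. The sketch below is a hypothetical illustration only: `SketchPool`, `SketchHandler` and `DemoUseCountAfterShutdown` are made-up names, and this is not the ign-transport WorkerPool or the change made in the pull request.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for a subscription handler (not an ign-transport type).
struct SketchHandler
{
  std::atomic<int> calls{0};
  void Run() { ++calls; }
};

// Hypothetical minimal pool (not the ign-transport WorkerPool).
class SketchPool
{
  public: explicit SketchPool(unsigned int _threads)
  {
    for (unsigned int i = 0; i < _threads; ++i)
      this->workers.emplace_back([this]() { this->Work(); });
  }

  public: ~SketchPool()
  {
    // Declared first so it is destroyed last, after the joins below.
    std::queue<std::function<void()>> discarded;
    {
      std::lock_guard<std::mutex> lock(this->mutex);
      this->done = true;
      discarded.swap(this->jobs);  // pending jobs will never run
    }
    this->cv.notify_all();
    for (auto &worker : this->workers)
      worker.join();
    // Captures held by discarded jobs are released here, on this thread.
  }

  public: void AddWork(std::function<void()> _job)
  {
    {
      std::lock_guard<std::mutex> lock(this->mutex);
      if (this->done)
        return;
      this->jobs.push(std::move(_job));
    }
    this->cv.notify_one();
  }

  private: void Work()
  {
    for (;;)
    {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(this->mutex);
        this->cv.wait(lock,
            [this]() { return this->done || !this->jobs.empty(); });
        if (this->done)
          return;
        job = std::move(this->jobs.front());
        this->jobs.pop();
      }
      job();
    }
  }

  private: std::mutex mutex;
  private: std::condition_variable cv;
  private: std::queue<std::function<void()>> jobs;
  private: bool done = false;
  private: std::vector<std::thread> workers;
};

// Queues jobs that capture a shared_ptr, destroys the pool, and reports how
// many owners of the handler remain afterwards.
long DemoUseCountAfterShutdown()
{
  auto handler = std::make_shared<SketchHandler>();
  {
    SketchPool pool(2);
    for (int i = 0; i < 100; ++i)
      pool.AddWork([handler]() { handler->Run(); });
  }  // pool destroyed: workers joined, leftover jobs discarded
  return handler.use_count();
}
```

After the pool goes out of scope, only the caller's own `handler` should still own the object, so no job can release the shared pointer later from a worker thread. Discarding pending work rather than draining it is a design choice made for this sketch; a real pool might prefer to finish queued jobs first.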

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


Potential solution: https://osrf-migration.github.io/ignition-gh-pages/#!/ignitionrobotics/ign-transport/pull-requests/352

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


  • changed state from "new" to "resolved"

Resolved for now in the aforementioned pull request.

osrf-migration added the major and bug labels on Apr 15, 2020
mabelzhang pushed a commit that referenced this issue Dec 12, 2020
ahcorde added a commit that referenced this issue Jul 2, 2021