This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

fix: fix the usage of service_engine::get_all_nodes() #730

Merged 2 commits on Jan 22, 2021

zhangyifan27 commented Jan 18, 2021

In recent tests I found that we try to destroy the same service_node repeatedly when the pegasus process exits.
Core dump stack (see frames 54 and 26):

#0  0x00007f48ef7221d7 in raise () from /lib64/libc.so.6
#1  0x00007f48ef7238c8 in abort () from /lib64/libc.so.6
#2  0x00007f48f1588ceb in tcmalloc::Log (mode=mode@entry=tcmalloc::kCrash, filename=filename@entry=0x7f48f15a0336 "src/tcmalloc.cc", line=line@entry=332, a=…, b=…, c=…, d=…)
    at src/internal_logging.cc:118
#3  0x00007f48f157a797 in (anonymous namespace)::InvalidFree (ptr=<optimized out>) at src/tcmalloc.cc:332
#4  0x00007f48f42f4674 in deallocate<boost::asio::detail::thread_info_base::default_tag> (size=<optimized out>, pointer=<optimized out>, this_thread=<optimized out>)
    at /home/zhangyifan8/work/pegasus/rdsn/thirdparty/output/include/boost/asio/detail/thread_info_base.hpp:108
#5  deallocate (size=<optimized out>, pointer=<optimized out>, this_thread=<optimized out>)
    at /home/zhangyifan8/work/pegasus/rdsn/thirdparty/output/include/boost/asio/detail/thread_info_base.hpp:63
#6  asio_handler_deallocate (size=120, pointer=0x26f8d80) at /home/zhangyifan8/work/pegasus/rdsn/thirdparty/output/include/boost/asio/impl/handler_alloc_hook.ipp:42
#7  deallocate<dsn::tools::asio_udp_provider::do_receive()::<lambda(const boost::system::error_code&, std::size_t)> > (h=…, s=120, p=0x26f8d80)
    at /home/zhangyifan8/work/pegasus/rdsn/thirdparty/output/include/boost/asio/detail/handler_alloc_helpers.hpp:50
#8  deallocate (this=<synthetic pointer>, n=1, p=0x26f8d80) at /home/zhangyifan8/work/pegasus/rdsn/thirdparty/output/include/boost/asio/detail/handler_alloc_helpers.hpp:91
#9  reset (this=0x7f486edf1600, this=0x7f486edf1600) at /home/zhangyifan8/work/pegasus/rdsn/thirdparty/output/include/boost/asio/detail/reactive_socket_recvfrom_op.hpp:84
#10 boost::asio::detail::reactive_socket_recvfrom_op<boost::asio::mutable_buffers_1, boost::asio::ip::basic_endpoint<boost::asio::ip::udp>, dsn::tools::asio_udp_provider::do_receive()::<lambda(const boost::system::error_code&, std::size_t)> >::do_complete(void *, boost::asio::detail::operation *, const boost::system::error_code &, std::size_t) (owner=0x2ecdb00, 
    base=0x26f8d80) at /home/zhangyifan8/work/pegasus/rdsn/thirdparty/output/include/boost/asio/detail/reactive_socket_recvfrom_op.hpp:118
#11 0x00007f48f4229afe in operator() (this=<optimized out>, __ptr=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/unique_ptr.h:76
#12 ~unique_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/unique_ptr.h:236
#13 _Destroy<std::unique_ptr<dsn::network> > (__pointer=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_construct.h:93
#14 __destroy<std::unique_ptr<dsn::network>> (__last=<optimized out>, __first=0x2ecdb80) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_construct.h:103
#15 _Destroy<std::unique_ptr<dsn::network>> (__last=<optimized out>, __first=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_construct.h:126
#16 _Destroy<std::unique_ptr<dsn::network>, std::unique_ptr<dsn::network> > (__last=0x0, __first=<optimized out>)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_construct.h:151
#17 ~vector (this=0x2ecda00, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_vector.h:424
#18 _Destroy<std::vector<std::unique_ptr<dsn::network> > > (__pointer=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_construct.h:93
#19 __destroy<std::vector<std::unique_ptr<dsn::network> >> (__last=<optimized out>, __first=0x2ecda00) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_construct.h:103
#20 _Destroy<std::vector<std::unique_ptr<dsn::network> >> (__last=<optimized out>, __first=<optimized out>)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_construct.h:126
#21 _Destroy<std::vector<std::unique_ptr<dsn::network> >, std::vector<std::unique_ptr<dsn::network> > > (__last=0x2ecda78, __first=<optimized out>)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_construct.h:151
#22 ~vector (this=0x33cc008, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_vector.h:424
#23 ~rpc_engine (this=0x33cc000, __in_chrg=<optimized out>) at /home/zhangyifan8/work/pegasus/rdsn/src/runtime/rpc/rpc_engine.h:130
#24 operator() (this=<optimized out>, __ptr=0x33cc000) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/unique_ptr.h:76
#25 ~unique_ptr (this=0x2c368d0, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/unique_ptr.h:236
#26 dsn::service_node::~service_node (this=0x2c366d0, __in_chrg=<optimized out>) at /home/zhangyifan8/work/pegasus/rdsn/src/runtime/service_engine.h:61
#27 0x00007f48f4223b9a in _M_release (this=0x2c366c0) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/shared_ptr_base.h:150
#28 ~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/shared_ptr_base.h:659
#29 ~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/shared_ptr_base.h:925
#30 ~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/shared_ptr.h:93
#31 ~pair (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_pair.h:96
#32 destroy<std::pair<int const, std::shared_ptr<dsn::service_node> > > (this=<optimized out>, __p=<optimized out>)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/ext/new_allocator.h:124
#33 destroy<std::pair<int const, std::shared_ptr<dsn::service_node> > > (__a=…, __p=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/alloc_traits.h:542
#34 _M_destroy_node (this=0x7f486edf1770, __p=0x335b200) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_tree.h:553
#35 _M_drop_node (this=0x7f486edf1770, __p=0x335b200) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_tree.h:561
#36 std::_Rb_tree<int, std::pair<int const, std::shared_ptr<dsn::service_node> >, std::_Select1st<std::pair<int const, std::shared_ptr<dsn::service_node> > >, std::less<int>, std::allocator<std::pair<int const, std::shared_ptr<dsn::service_node> > > >::_M_erase (this=this@entry=0x7f486edf1770, __x=0x335b200)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_tree.h:1614
#37 0x00007f48f421f99f in ~_Rb_tree (this=0x7f486edf1770, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_tree.h:858
#38 ~map (this=0x7f486edf1770, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_map.h:96
#39 dsn_mimic_app (app_role=0x7f48f43e65b7 "mimic", index=1) at /home/zhangyifan8/work/pegasus/rdsn/src/runtime/service_api_c.cpp:211
#40 0x00007f48f42747df in check_tls_dsn () at /home/zhangyifan8/work/pegasus/rdsn/include/dsn/tool-api/task.h:540
#41 get_current_task () at /home/zhangyifan8/work/pegasus/rdsn/include/dsn/tool-api/task.h:546
#42 dsn::task::cancel (this=0x3588ba8, wait_until_finished=wait_until_finished@entry=true, finished=finished@entry=0x0)
    at /home/zhangyifan8/work/pegasus/rdsn/src/runtime/task/task.cpp:313
#43 0x00007f48f428d300 in dsn::task_tracker::cancel_outstanding_tasks (this=this@entry=0x2d28980) at /home/zhangyifan8/work/pegasus/rdsn/src/runtime/task/task_tracker.cpp:144
#44 0x00007f48f4069070 in dsn::replication::replica_stub::close (this=this@entry=0x2d28300) at /home/zhangyifan8/work/pegasus/rdsn/src/replica/replica_stub.cpp:2410
_tracker.cancel_outstanding_tasks();
#45 0x00007f48f4071c69 in dsn::replication::replica_stub::~replica_stub (this=0x2d28300, __in_chrg=<optimized out>)
    at /home/zhangyifan8/work/pegasus/rdsn/src/replica/replica_stub.cpp:111
#46 0x00007f48f4072211 in dsn::replication::replica_stub::~replica_stub (this=0x2d28300, __in_chrg=<optimized out>)
    at /home/zhangyifan8/work/pegasus/rdsn/src/replica/replica_stub.cpp:111
#47 0x00007f48f40bb1e3 in release_ref (this=<optimized out>) at /home/zhangyifan8/work/pegasus/rdsn/include/dsn/utility/autoref_ptr.h:84
#48 ~ref_ptr (this=0x3399d90, __in_chrg=<optimized out>) at /home/zhangyifan8/work/pegasus/rdsn/include/dsn/utility/autoref_ptr.h:139
#49 dsn::replication::replication_service_app::~replication_service_app (this=0x3399d70, __in_chrg=<optimized out>)
    at /home/zhangyifan8/work/pegasus/rdsn/src/replica/replication_service_app.cpp:51
#50 0x00000000005046c8 in ~pegasus_replication_service_app (this=0x3399d70, __in_chrg=<optimized out>) at /home/zhangyifan8/work/pegasus/src/server/pegasus_service_app.h:31
#51 pegasus::server::pegasus_replication_service_app::~pegasus_replication_service_app (this=0x3399d70, __in_chrg=<optimized out>)
    at /home/zhangyifan8/work/pegasus/src/server/pegasus_service_app.h:31
#52 0x00007f48f4229d0f in operator() (this=<optimized out>, __ptr=0x3399d70) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/unique_ptr.h:76
#53 ~unique_ptr (this=0x2c36758, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/unique_ptr.h:236
#54 dsn::service_node::~service_node (this=0x2c366d0, __in_chrg=<optimized out>) at /home/zhangyifan8/work/pegasus/rdsn/src/runtime/service_engine.h:61
#55 0x00007f48f4223b9a in _M_release (this=0x2c366c0) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/shared_ptr_base.h:150
#56 ~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/shared_ptr_base.h:659
#57 ~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/shared_ptr_base.h:925
#58 ~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/shared_ptr.h:93
#59 ~pair (this=<optimized out>, __in_chrg=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_pair.h:96
#60 destroy<std::pair<int const, std::shared_ptr<dsn::service_node> > > (this=<optimized out>, __p=<optimized out>)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/ext/new_allocator.h:124
#61 destroy<std::pair<int const, std::shared_ptr<dsn::service_node> > > (__a=…, __p=<optimized out>) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/alloc_traits.h:542
#62 _M_destroy_node (this=0x7f48f46c3d78 <dsn::utils::singleton<dsn::service_engine>::instance()::_instance+536>, __p=0x3460fc0)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_tree.h:553
#63 _M_drop_node (this=0x7f48f46c3d78 <dsn::utils::singleton<dsn::service_engine>::instance()::_instance+536>, __p=0x3460fc0)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_tree.h:561
#64 std::_Rb_tree<int, std::pair<int const, std::shared_ptr<dsn::service_node> >, std::_Select1st<std::pair<int const, std::shared_ptr<dsn::service_node> > >, std::less<int>, std::allocator<std::pair<int const, std::shared_ptr<dsn::service_node> > > >::_M_erase (this=this@entry=0x7f48f46c3d78 <dsn::utils::singleton<dsn::service_engine>::instance()::_instance+536>, 
    __x=0x3460fc0) at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_tree.h:1614
#65 0x00007f48f4227e07 in ~_Rb_tree (this=0x7f48f46c3d78 <dsn::utils::singleton<dsn::service_engine>::instance()::_instance+536>, __in_chrg=<optimized out>)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_tree.h:858
#66 ~map (this=0x7f48f46c3d78 <dsn::utils::singleton<dsn::service_engine>::instance()::_instance+536>, __in_chrg=<optimized out>)
    at /home/zhangyifan8/local/gcc-5.4.0/include/c++/5.4.0/bits/stl_map.h:96
#67 dsn::service_engine::~service_engine (this=0x7f48f46c3b60 <dsn::utils::singleton<dsn::service_engine>::instance()::_instance>, __in_chrg=<optimized out>)
    at /home/zhangyifan8/work/pegasus/rdsn/src/runtime/service_engine.h:108
#68 0x00007f48ef725a49 in __run_exit_handlers () from /lib64/libc.so.6
#69 0x00007f48ef725a95 in exit () from /lib64/libc.so.6
#70 0x00007f48f4d04747 in vm_direct_exit(int) () from /opt/soft/jdk/jre/lib/amd64/server/libjvm.so
#71 0x00007f48f50d0ab7 in VM_Operation::evaluate() () from /opt/soft/jdk/jre/lib/amd64/server/libjvm.so
#72 0x00007f48f50cf4b8 in VMThread::evaluate_operation(VM_Operation) () from /opt/soft/jdk/jre/lib/amd64/server/libjvm.so
#73 0x00007f48f50cf919 in VMThread::loop() () from /opt/soft/jdk/jre/lib/amd64/server/libjvm.so
#74 0x00007f48f50cfd62 in VMThread::run() () from /opt/soft/jdk/jre/lib/amd64/server/libjvm.so
#75 0x00007f48f4f2f422 in java_start(Thread) () from /opt/soft/jdk/jre/lib/amd64/server/libjvm.so
#76 0x00007f48f134edc5 in start_thread () from /lib64/libpthread.so.0
#77 0x00007f48ef7e473d in clone () from /lib64/libc.so.6

service_engine::get_all_nodes() returns a reference to a map, so its result should be bound to a reference rather than copied into a local variable.
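
A minimal sketch of the difference, using hypothetical stand-ins (`node`, `engine`) rather than the real dsn::service_node and dsn::service_engine. With plain `auto` the reference is dropped and the whole map is copied, so every shared_ptr in it gains an extra owner; `const auto &` binds to the engine's own map and adds no owners.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-in for dsn::service_node.
struct node
{
    int id = 0;
};

// Hypothetical stand-in for dsn::service_engine.
class engine
{
public:
    engine() { _nodes.emplace(1, std::make_shared<node>()); }

    // Returns a reference to the internal map (assumed shape of get_all_nodes()).
    const std::map<int, std::shared_ptr<node>> &get_all_nodes() const { return _nodes; }

private:
    std::map<int, std::shared_ptr<node>> _nodes;
};

// Plain `auto` deduces a value type: the map and its shared_ptrs are copied.
long use_count_with_copy(const engine &e)
{
    auto nodes = e.get_all_nodes();
    return nodes.begin()->second.use_count();
}

// `const auto &` binds to the engine's map: no copy, no extra owner.
long use_count_with_reference(const engine &e)
{
    const auto &nodes = e.get_all_nodes();
    return nodes.begin()->second.use_count();
}
```

In this sketch the copy reports a use count of 2 and the reference reports 1. In the backtrace above, such a copy appears to be destroyed in dsn_mimic_app (frames 38 and 39) while ~service_node is already running (frame 54), which would explain the second destructor call at frame 26.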

neverchanje merged commit cddaec7 into XiaoMi:master Jan 22, 2021
zhangyifan27 added a commit to zhangyifan27/rdsn that referenced this pull request Jan 26, 2021
neverchanje added the type/sanitize (Fixes on errors reported by sanitizers.) label Feb 3, 2021