Crash on subsequent API calls with permission filters #6874

Closed
jottekop opened this issue Jan 4, 2019 · 53 comments
Labels: area/api (REST API), blocker (Blocks a release or needs immediate attention), core/crash (Shouldn't happen, requires attention), ref/IP, ref/NC

@jottekop

jottekop commented Jan 4, 2019

Expected Behavior

We expect icinga2 to keep running when we make roughly 40 API requests per second.

Current Behavior

Currently we run the icinga2 client without problems. But as soon as we start sending GET requests to the master API to fetch host and service status, using 9 different users against 2 different endpoints (18 connections in total), the service fails with a segfault, shown below.
There is no crash report or anything else in the logs. If I turn on the debug log, it shows "segmentation fault".
Please let me know if you need more information.

#0  0x00005555557afdd3 in icinga::intrusive_ptr_release(icinga::Object*) (object=<optimized out>,
    object=<optimized out>) at ./lib/base/object.cpp:284
#1  0x0000555555825882 in boost::intrusive_ptr<icinga::Object>::~intrusive_ptr() ()
    at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:98
#2  boost::intrusive_ptr<icinga::Object>::operator=(boost::intrusive_ptr<icinga::Object> const&) (
    this=0x7fffa806c360, rhs=...) at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:154
#3  0x00005555557ee6b5 in boost::detail::variant::assign_storage::internal_visit<boost::intrusive_ptr<icinga::Object> >(boost::intrusive_ptr<icinga::Object>&, int) const () at /usr/include/boost/variant/variant.hpp:564
#4  boost::detail::variant::visitation_impl_invoke_impl<boost::detail::variant::assign_storage, void*, boost::intrusive_ptr<icinga::Object> >(int, boost::detail::variant::assign_storage&, void*, boost::intrusive_ptr<icinga::Object>*, mpl_::bool_<true>) () at /usr/include/boost/variant/detail/visitation_impl.hpp:114
#5  boost::detail::variant::visitation_impl_invoke<boost::detail::variant::assign_storage, void*, boost::intrusive_ptr<icinga::Object>, boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::has_fallback_type_>(int, boost::detail::variant::assign_storage&, void*, boost::intrusive_ptr<icinga::Object>*, boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::has_fallback_type_, int) () at /usr/include/boost/variant/detail/visitation_impl.hpp:154
#6  visitation_impl (internal_which=<optimized out>, no_backup_flag=..., storage=0x7fffa806c360,
    visitor=<synthetic pointer>..., logical_which=<optimized out>)
    at /usr/include/boost/variant/detail/visitation_impl.hpp:238
#7  boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::internal_apply_visitor_impl<boost::detail::variant::assign_storage, void*>(int, int, boost::detail::variant::assign_storage&, void*) () at /usr/include/boost/variant/variant.hpp:2392
#8  internal_apply_visitor (visitor=<synthetic pointer>..., this=0x7fffa806c358)
    at /usr/include/boost/variant/variant.hpp:2406
#9  boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::variant_assign(boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> > const&) (
    this=0x7fffa806c358, rhs=...) at /usr/include/boost/variant/variant.hpp:2115
#10 0x00005555557b23c3 in icinga::Namespace::SetFieldByName(icinga::String const&, icinga::Value const&, bool, icinga::DebugInfo const&) (this=0x5555560c95d0, field=..., value=..., overrideFrozen=<optimized out>, debugInfo=...)
    at ./lib/base/namespace.cpp:121
#11 0x00005555557b2993 in icinga::Namespace::Set(icinga::String const&, icinga::Value const&, bool) (
    this=<optimized out>, field=..., value=..., overrideFrozen=<optimized out>) at ./lib/base/namespace.cpp:59
#12 0x0000555555880a46 in icinga::FilterUtility::EvaluateFilter(icinga::ScriptFrame&, icinga::Expression*, boost::intrusive_ptr<icinga::Object> const&, icinga::String const&) (frame=..., filter=0x7ffee40019f0, target=...,
    variableName=...) at ./lib/remote/filterutility.cpp:123
#13 0x0000555555880dc4 in FilteredAddTarget (permissionFrame=..., permissionFilter=<optimized out>, frame=...,
    ufilter=<optimized out>, result=..., variableName=..., target=..., target=..., variableName=..., result=...,
    ufilter=<optimized out>, frame=...) at ./lib/remote/filterutility.cpp:134
#14 0x000055555590c9cf in _ZSt13__invoke_implIvRPFvRN6icinga11ScriptFrameEPNS0_10ExpressionES2_S4_RSt6vectorINS0_5ValueESaIS6_EERKNS0_6StringERKN5boost13intrusive_ptrINS0_6ObjectEEEEJS2_RS4_S2_RDnS9_RSA_RKS6_EET_St14__invoke_otherOT0_DpOT1_.isra.2945 (__args#6=..., __args#5=..., __args#4=..., __args#2=...,
    __args#1=@0x7ffee40015b0: 0x7ffee40019f0, __args#0=...) at /usr/include/c++/7/bits/invoke.h:60
#15 std::__invoke<void (*&)(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&), icinga::ScriptFrame&, icinga::Expression*&, icinga::ScriptFrame&, decltype(nullptr)&, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String&, icinga::Value const&>(void (*&)(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&), icinga::ScriptFrame&, icinga::Expression*&, icinga::ScriptFrame&, decltype(nullptr)&, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String&, icinga::Value const&) () at /usr/include/c++/7/bits/invoke.h:95
#16 _ZNSt5_BindIFPFvRN6icinga11ScriptFrameEPNS0_10ExpressionES2_S4_RSt6vectorINS0_5ValueESaIS6_EERKNS0_6StringERKN5boost13intrusive_ptrINS0_6ObjectEEEESt17reference_wrapperIS1_ES4_SM_DnSL_IS8_ESA_St12_PlaceholderILi1EEEE6__callIvJRKS6_EJLm0ELm1ELm2ELm3ELm4ELm5ELm6EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE.isra.2947 (__args=...,
    this=0x7ffee4001570) at /usr/include/c++/7/functional:467
#17 std::_Bind<void (*(std::reference_wrapper<icinga::ScriptFrame>, icinga::Expression*, std::reference_wrapper<icinga::ScriptFrame>, decltype(nullptr), std::reference_wrapper<std::vector<icinga::Value, std::allocator<icinga::Value> > >, icinga::String, std::_Placeholder<1>))(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&)>::operator()<icinga::Value const&, void>(icinga::Value const&) ()
    at /usr/include/c++/7/functional:551
#18 std::_Function_handler<void (icinga::Value const&), std::_Bind<void (*(std::reference_wrapper<icinga::ScriptFrame>, icinga::Expression*, std::reference_wrapper<icinga::ScriptFrame>, decltype(nullptr), std::reference_wrapper<std::vector<icinga::Value, std::allocator<icinga::Value> > >, icinga::String, std::_Placeholder<1>))(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&)> >::_M_invoke(std::_Any_data const&, icinga::Value const&) (__functor=..., __args#0=...) at /usr/include/c++/7/bits/std_function.h:316
#19 0x00005555558891f0 in std::function<void (icinga::Value const&)>::operator()(icinga::Value const&) const ()
    at /usr/include/c++/7/bits/std_function.h:706
#20 icinga::ConfigObjectTargetProvider::FindTargets(icinga::String const&, std::function<void (icinga::Value const&)> const&) const (this=this@entry=0x7ffee4001ae0, type=..., addTarget=...) at ./lib/remote/filterutility.cpp:55
#21 0x0000555555882395 in icinga::FilterUtility::GetFilterTargets(icinga::QueryDescription const&, boost::intrusive_ptr<icinga::Dictionary> const&, boost::intrusive_ptr<icinga::ApiUser> const&, icinga::String const&) (qd=...,
    query=..., user=..., variableName=...) at ./lib/remote/filterutility.cpp:283
#22 0x0000555555885cc0 in icinga::ObjectQueryHandler::HandleRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::Dictionary> const&) (this=<optimized out>,
    user=..., request=..., response=..., params=...) at ./lib/remote/objectqueryhandler.cpp:165
#23 0x000055555587bda4 in icinga::HttpHandler::ProcessRequest(boost::intrusive_ptr<icinga::ApiUser> const&, icinga::HttpRequest&, icinga::HttpResponse&) (user=..., request=..., response=...) at ./lib/remote/httphandler.cpp:109
#24 0x000055555587c4c6 in icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser> const&) (this=0x7ffe7c0062f0, request=..., response=..., user=...)
    at ./lib/remote/httpserverconnection.cpp:338
#25 0x00005555558235fe in std::function<void ()>::operator()() const ()
    at /usr/include/c++/7/bits/std_function.h:706
#26 icinga::WorkQueue::RunTaskFunction(std::function<void ()> const&) (this=this@entry=0x7ffe7c0063b8, func=...)
    at ./lib/base/workqueue.cpp:253
#27 0x0000555555836b39 in icinga::WorkQueue::WorkerThreadProc() (this=<optimized out>)
    at ./lib/base/workqueue.cpp:296
#28 0x00007ffff75ccbcd in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1
#29 0x00007ffff6c156db in start_thread (arg=0x7fff2bbf2700) at pthread_create.c:463
#30 0x00007ffff7b0588f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Possible Solution

No solution yet.

Steps to Reproduce (for bugs)

  1. Install an icinga2 instance.
  2. Configure it with 'icinga2 node setup master --zone master'.
  3. Set up 9 API users.
  4. Query 2 endpoints (hosts and downtimes) with all 9 users every 1 or 2 seconds using HTTP authentication, for example with curl.
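
The polling pattern in the steps above could be approximated with a small script like the following. This is only a sketch: the report itself used curl and a PHP client, and the host name, user names, and passwords here are placeholders. Port 5665 is Icinga 2's default API port, and the two paths are the standard object query URLs for hosts and downtimes.

```python
import base64
import itertools
import ssl
import threading
import time
import urllib.request

# Placeholders, not taken from the report: adjust to the system under test.
BASE_URL = "https://icinga-master.example.com:5665"
USERS = [(f"api-user-{i}", "secret") for i in range(1, 10)]  # 9 API users
ENDPOINTS = ["/v1/objects/hosts", "/v1/objects/downtimes"]   # 2 endpoints


def build_requests(base_url, users, endpoints):
    """Return one prepared GET request per (user, endpoint) pair."""
    requests = []
    for (user, password), path in itertools.product(users, endpoints):
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        req = urllib.request.Request(base_url + path, method="GET")
        req.add_header("Authorization", f"Basic {token}")
        req.add_header("Accept", "application/json")
        requests.append(req)
    return requests


def poll(req, interval, stop, context):
    """Send the same request every `interval` seconds until `stop` is set."""
    while not stop.is_set():
        try:
            with urllib.request.urlopen(req, timeout=10, context=context) as resp:
                resp.read()
        except OSError as exc:  # URLError and HTTPError derive from OSError
            print(f"{req.full_url}: {exc}")
        stop.wait(interval)


def run(duration=60.0, interval=1.0):
    """Poll every (user, endpoint) pair in its own thread for `duration` seconds."""
    # Assumes a test system with a self-signed CA, so verification is disabled.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    stop = threading.Event()
    threads = [
        threading.Thread(target=poll, args=(req, interval, stop, context), daemon=True)
        for req in build_requests(BASE_URL, USERS, ENDPOINTS)
    ]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop.set()
    for thread in threads:
        thread.join()
```

Calling `run()` starts 18 polling threads (9 users times 2 endpoints). Each request opens a fresh connection, which roughly matches one curl invocation per request.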

Context

We are trying to set up an icinga2 cluster and have created our own dashboard to view open issues on some monitors in our office.
We have 10 API users configured to make this happen, all with separate filters.

Your Environment

  • Version used (icinga2 --version): 2.10.2-1
  • Operating System and version: Ubuntu 18.04.1 LTS
  • Enabled features (icinga2 feature list): api checker debuglog ido-mysql mainlog
  • Icinga Web 2 version and modules (System - About): N/A
  • Config validation (icinga2 daemon -C): succeeds without problems.

We run a master with 14 satellites and around 200 nodes.
We also run icingaweb on a separate server, and ido-mysql on a separate server as well.

@jottekop
Author

jottekop commented Jan 4, 2019

#6875 looks the same to me

@hansmi

hansmi commented Jan 4, 2019

Something is wrong indeed. We noticed that sending many requests to /v1/actions/schedule-downtime or /v1/actions/remove-downtime can trigger segmentation faults and/or assertion failures in glibc. One particular backtrace triggered in Icinga running with MALLOC_CHECK_=255 (unfortunately the timing and reproducibility are not what one would wish):

0x00007ff54dac6428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ff54dac6428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ff54dac802a in __GI_abort () at abort.c:89
#2  0x00007ff54dabebd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x7ff54ccd3015 "mutex->__data.__owner == 0", file=file@entry=0x7ff54ccd2ff8 "../nptl/pthread_mutex_lock.c", line=line@entry=117, 
    function=function@entry=0x7ff54ccd3180 <__PRETTY_FUNCTION__.8623> "__pthread_mutex_lock") at assert.c:92
#3  0x00007ff54dabec82 in __GI___assert_fail (assertion=assertion@entry=0x7ff54ccd3015 "mutex->__data.__owner == 0", file=file@entry=0x7ff54ccd2ff8 "../nptl/pthread_mutex_lock.c", line=line@entry=117, 
    function=function@entry=0x7ff54ccd3180 <__PRETTY_FUNCTION__.8623> "__pthread_mutex_lock") at assert.c:101
#4  0x00007ff54ccc9fb2 in __GI___pthread_mutex_lock (mutex=<optimized out>) at ../nptl/pthread_mutex_lock.c:117
#5  0x00000000005d6095 in lock (this=<optimized out>) at /usr/include/boost/thread/pthread/recursive_mutex.hpp:113
#6  icinga::ObjectLock::LockMutex(icinga::Object const*) (object=object@entry=0x7ff5245f3870) at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/objectlock.cpp:62
#7  0x00000000005ae611 in icinga::ObjectLock::Lock() () at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/objectlock.cpp:88
#8  icinga::ObjectLock::ObjectLock(icinga::Object const*) () at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/objectlock.cpp:44
#9  icinga::Dictionary::Set(icinga::String const&, icinga::Value, bool) (this=0x7ff5245f3870, key=..., value=..., overrideFrozen=overrideFrozen@entry=false)
    at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/dictionary.cpp:96
#10 0x000000000067781b in icinga::ApiListener::SyncRelayMessage(boost::intrusive_ptr<icinga::MessageOrigin> const&, boost::intrusive_ptr<icinga::ConfigObject> const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool) (
    this=0x7ff528005b50, origin=..., secobj=..., message=..., log=<optimized out>) at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/remote/apilistener.cpp:1016
#11 0x000000000060419e in std::function<void ()>::operator()() const () at /usr/include/c++/5/functional:2267
#12 icinga::WorkQueue::RunTaskFunction(std::function<void ()> const&) (this=this@entry=0x7ff528005f10, func=...) at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/workqueue.cpp:253
#13 0x0000000000620147 in icinga::WorkQueue::WorkerThreadProc() (this=0x7ff528005f10) at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/workqueue.cpp:296
#14 0x00007ff54d6785d5 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.58.0
#15 0x00007ff54ccc76ba in start_thread (arg=0x7ff533bda700) at pthread_create.c:333
#16 0x00007ff54db9841d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Another (callees trimmed):

[…]
#53 icinga::Dictionary::~Dictionary() [clone .lto_priv.4614] () at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/dictionary.hpp:40
#54 icinga::Dictionary::~Dictionary (this=0x6237060) at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/dictionary.hpp:40
#55 0x0000000000683382 in boost::intrusive_ptr<icinga::Endpoint>::~intrusive_ptr () at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:97
#56 std::_Head_base<2ul, boost::intrusive_ptr<icinga::Endpoint>, false>::~_Head_base () at /usr/include/c++/5/tuple:102
#57 std::_Tuple_impl<2ul, boost::intrusive_ptr<icinga::Endpoint>, bool>::~_Tuple_impl () at /usr/include/c++/5/tuple:180
#58 std::_Tuple_impl<1ul, boost::intrusive_ptr<icinga::JsonRpcConnection>, boost::intrusive_ptr<icinga::Endpoint>, bool>::~_Tuple_impl() [clone .lto_priv.6585] (this=0x7fb35e0) at /usr/include/c++/5/tuple:180
#59 std::_Tuple_impl::__base_dtor (this=0x7fb35e0) at /usr/include/c++/5/tuple:180
#60 std::_Tuple_impl<0ul, icinga::ApiListener*, boost::intrusive_ptr<icinga::MessageOrigin>, boost::intrusive_ptr<icinga::ConfigObject>, boost::intrusive_ptr<icinga::Dictionary>, bool>::~_Tuple_impl () at /usr/include/c++/5/tuple:180
#61 std::tuple<icinga::ApiListener*, boost::intrusive_ptr<icinga::MessageOrigin>, boost::intrusive_ptr<icinga::ConfigObject>, boost::intrusive_ptr<icinga::Dictionary>, bool>::~tuple () at /usr/include/c++/5/tuple:463
#62 std::_Bind<std::_Mem_fn<void (icinga::ApiListener::*)(boost::intrusive_ptr<icinga::MessageOrigin> const&, boost::intrusive_ptr<icinga::ConfigObject> const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool)> (icinga::ApiListener*, boost::intrusive_ptr<icinga::MessageOrigin>, boost::intrusive_ptr<icinga::ConfigObject>, boost::intrusive_ptr<icinga::Dictionary>, bool)>::~_Bind() () at /usr/include/c++/5/functional:1058
#63 std::_Function_base::_Base_manager::_ZNSt14_Function_base13_Base_managerISt5_BindIFSt7_Mem_fnIMN6icinga11ApiListenerEFvRKN5boost13intrusive_ptrINS3_13MessageOriginEEERKNS6_INS3_12ConfigObjectEEERKNS6_INS3_10DictionaryEEEbEEPS4_S8_SC_SG_bEEE10_M_destroyERSt9_Any_dataSt17integral_constantIbLb0EE.isra.534 (__victim=...) at /usr/include/c++/5/functional:1726
#64 std::_Function_base::_Base_manager<std::_Bind<std::_Mem_fn<void (icinga::ApiListener::*)(boost::intrusive_ptr<icinga::MessageOrigin> const&, boost::intrusive_ptr<icinga::ConfigObject> const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool)> (icinga::ApiListener*, boost::intrusive_ptr<icinga::MessageOrigin>, boost::intrusive_ptr<icinga::ConfigObject>, boost::intrusive_ptr<icinga::Dictionary>, bool)> >::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation) (__dest=..., __source=..., __op=<optimized out>) at /usr/include/c++/5/functional:1750
#65 0x000000000062022c in std::_Function_base::~_Function_base() () at /usr/include/c++/5/functional:1830
#66 std::function<void ()>::~function() () at /usr/include/c++/5/functional:1974
#67 std::function<void ()>::operator=(std::function<void ()>&&) (__x=<unknown type in /usr/lib/debug/.build-id/b8/5d4a2d0c467a27642a68954ea1a63024c27681.debug, CU 0x2643a7, DIE 0x293f21>, this=0x7f29cfc9da40) at /usr/include/c++/5/functional:2089
#68 icinga::Task::operator=(icinga::Task&&) () at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/workqueue.hpp:46
#69 icinga::WorkQueue::WorkerThreadProc() (this=0x61fc270) at /home/jenkins/workspace/icinga2-release/deb-ubuntu-xenial-1binary/arch/x86_64/icinga2/lib/base/workqueue.cpp:299
#70 0x00007f29d57065d5 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.58.0
#71 0x00007f29d4d556ba in start_thread (arg=0x7f29cfc9e700) at pthread_create.c:333
#72 0x00007f29d5c2641d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Crunsher added the area/api (REST API) and core/crash (Shouldn't happen, requires attention) labels Jan 7, 2019
@lippserd
Member

lippserd commented Jan 7, 2019

Which HTTP libs are you guys using for your scripts? Do you or the libs send the Connection: close header?

@hansmi

hansmi commented Jan 7, 2019

@lippserd My tests used cURL without any explicit keep-alive parameters, and there was one request per program invocation, so the connection would be closed after one request anyway. I also tried ab, which did not manage to crash Icinga; since that was on a local test system, it may have been due to the lighter workload compared to the staging machine where I gathered the stack traces.

The Icinga version in question is r2.10.2-1 from the Ubuntu Xenial package version 2.10.2-1.xenial.

@jottekop
Author

jottekop commented Jan 9, 2019

@lippserd
We are using the PHP Guzzle HTTP library with Symfony. This library automatically closes requests.

@Al2Klimov
Member

I'd suggest waiting for v2.10.3 and re-testing with that version.

@hansmi

hansmi commented Jan 10, 2019

@Al2Klimov What makes you believe that 2.10.3 would bring an improvement?

@Al2Klimov
Member

v2.10.3 will bring a bunch of improvements, maybe there's also something for you.

@lippserd
Member

Could you guys test our snapshot packages? This may be fixed already.

@jottekop
Author

@lippserd do you have some sort of guide for installing the snapshot version?

@hrak

hrak commented Jan 14, 2019

> @lippserd you have a sort of guide to install the snapshot version?

Go to https://icinga.com/download/ and select your distro, see the info under the 'Snapshot Builds' section.

@hansmi

hansmi commented Jan 14, 2019

Unfortunately, Icinga2 v2.10.2-160-g1c772aa installed on Ubuntu Xenial still crashes, this time in icinga::IdoMysqlConnection::FieldToEscapedString. As should be obvious, we aren't dealing with a predictable issue. How can we be of assistance in debugging this?

@jottekop
Author

Same goes for Bionic 2.10.2+160.g1c772aac5.2019.01.12+1.bionic-0

I still get the same issue as in the original post.

Are there any steps you want me to take now?

@jottekop
Author

jottekop commented Jan 15, 2019

@lippserd I have installed 2.9.2-1.bionic and this seems to work without problems.

Edit: our API calls do not work in 2.10.0, so I could not properly test that, but 2.10.1 also crashes.

@hrak

hrak commented Jan 15, 2019

My shot-in-the-dark guess is that it has something to do with changes introduced in #6596 and the fact that frame.Self does not get set within the else condition at https://github.com/Icinga/icinga2/pull/6596/files#diff-a500132058c49e3d780348743914943fR271

@hrak

hrak commented Jan 18, 2019

For clarity: I am a teammate of @jottekop so we are dealing with the same issue.

My shot in the dark was wrong. I have compiled a debug build of the master branch on a test VM and ran it under gdb to get backtraces with some more info.

I reconstructed part of the config in a test setup with just one API user, and I could hammer it in an endless bash while loop with curl calls, in the same sequence our dashboard uses when it crashes the Icinga API, without triggering the crash.

The only way I could trigger this segfault was to transfer my debug binary to the prod box and run it under gdb there, where we have a fair volume of API calls coming in from various users. The api-users config looks like this:

/**
 * The ApiUser objects are used for authentication against the API.
 */
object ApiUser "lswApiRoot" {
  password = "blah"

  // client_cn = ""

  permissions = [ "*" ]
}

/**
 * The ApiUser objects for the dashboard
 */
object ApiUser "svc-bc" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^BC", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}

object ApiUser "svc-sh" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^SH", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}

object ApiUser "svc-devp" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^DEVP", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}

object ApiUser "svc-ws" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^WS", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}

object ApiUser "svc-cp" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^CP", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}

object ApiUser "svc-cdn" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^CDN", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}

object ApiUser "svc-cloud" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^CLOUD", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}

object ApiUser "svc-back" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^BACK", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}

object ApiUser "svc-bm" {
  password = "blah"
  permissions = [
    {
      permission = "objects/query/*"
      filter = {{ regex("^BM", host.groups) }}
    },
    {
      permission = "status/query"
    }
  ]
}
It looks like two threads are deleting the same object at the same time. Threads 3123 and 3121 are both deleting an Icinga object at 0x7fff88021b80 with a refcount of 0.
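
To illustrate how two threads can end up releasing the same object, here is a small deterministic model. It rests on an assumption this thread does not confirm: that both requests assign to the same shared variable without a lock. The backtraces only show the crash inside intrusive_ptr's assignment, reached via Namespace::Set from FilterUtility::EvaluateFilter. All class and function names below are hypothetical. The assignment is split into three steps so the interleaving can be replayed by hand instead of relying on real thread timing.

```python
class RefCounted:
    """Toy stand-in for an intrusively reference-counted object."""

    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.destroyed = False
        self.use_after_free = 0  # releases that arrive after destruction


def add_ref(obj):
    obj.refs += 1


def release(obj):
    if obj.destroyed:
        # Real C++ code would read and write freed memory at this point.
        obj.use_after_free += 1
        return
    obj.refs -= 1
    if obj.refs == 0:
        obj.destroyed = True  # corresponds to "delete object"


class Slot:
    """A shared variable that owns one counted reference (hypothetical)."""

    def __init__(self, value):
        add_ref(value)
        self.value = value


def assign(slot, new):
    """Unlocked `slot = new`, yielding between steps so calls can interleave."""
    add_ref(new)
    old = slot.value  # step 1: read the pointer currently stored
    yield
    slot.value = new  # step 2: store the new pointer
    yield
    release(old)      # step 3: drop the reference the slot used to hold


def run_interleaved(*tasks):
    """Round-robin scheduler: advance each task by one step per pass."""
    pending = list(tasks)
    while pending:
        for task in list(pending):
            try:
                next(task)
            except StopIteration:
                pending.remove(task)


def demo(interleaved):
    a, b, c = RefCounted("A"), RefCounted("B"), RefCounted("C")
    slot = Slot(a)
    if interleaved:
        run_interleaved(assign(slot, b), assign(slot, c))
    else:
        run_interleaved(assign(slot, b))
        run_interleaved(assign(slot, c))
    return a


print(demo(interleaved=False).use_after_free)  # 0
print(demo(interleaved=True).use_after_free)   # 1
```

In the interleaved run both tasks read the old pointer before either stores its new one, so the old object is released twice. The second release is the kind of access that would hit already freed memory in the real process.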

My gdb session:

[New Thread 0x7ffff06e5700 (LWP 2144)]
[2019-01-18 15:30:34 +0000] information/ApiListener: New client connection from [xx.xx.xx.xx]:38688 (no client certificate)
[2019-01-18 15:30:34 +0000] information/HttpServerConnection: Request: GET /v1/objects/services (from [xx.xx.xx.xx]:38688), user: svc-bc)
[New Thread 0x7ffff055f700 (LWP 2145)]
[New Thread 0x7ffff05a0700 (LWP 2146)]
[2019-01-18 15:30:34 +0000] information/HttpServerConnection: HTTP client disconnected (from [xx.xx.xx.xx]:38688)
[Thread 0x7ffff055f700 (LWP 2145) exited]
[New Thread 0x7ffff055f700 (LWP 2147)]
[2019-01-18 15:30:34 +0000] information/ApiListener: New client connection from [xx.xx.xx.xx]:38690 (no client certificate)
[2019-01-18 15:30:34 +0000] information/HttpServerConnection: Request: GET /v1/objects/hosts (from [xx.xx.xx.xx]:38690), user: svc-sh)
[New Thread 0x7ffff0f87700 (LWP 2148)]
[2019-01-18 15:30:34 +0000] information/ApiListener: New client connection from [xx.xx.xx.xx]:38692 (no client certificate)
[2019-01-18 15:30:34 +0000] information/HttpServerConnection: Request: GET /v1/objects/hosts (from [xx.xx.xx.xx]:38692), user: svc-sh)
[New Thread 0x7ffff086b700 (LWP 2149)]
[2019-01-18 15:30:34 +0000] information/HttpServerConnection: HTTP client disconnected (from [xx.xx.xx.xx]:38692)
[Thread 0x7ffff086b700 (LWP 2149) exited]
[New Thread 0x7ffff086b700 (LWP 2150)]
[2019-01-18 15:30:34 +0000] information/ApiListener: New client connection from [xx.xx.xx.xx]:38694 (no client certificate)
[2019-01-18 15:30:34 +0000] information/HttpServerConnection: Request: GET /v1/objects/services (from [xx.xx.xx.xx]:38694), user: svc-sh)
[New Thread 0x7ffff07e9700 (LWP 2151)]
[2019-01-18 15:30:34 +0000] information/HttpServerConnection: HTTP client disconnected (from [xx.xx.xx.xx]:38690)
[Thread 0x7ffff0f87700 (LWP 2148) exited]
[New Thread 0x7ffff0f87700 (LWP 2152)]
[Thread 0x7ffff0e83700 (LWP 2142) exited]
[Thread 0x7ffff082a700 (LWP 1729) exited]
[Thread 0x7ffff06e5700 (LWP 2144) exited]
[Thread 0x7ffff05a0700 (LWP 2146) exited]
[2019-01-18 15:30:34 +0000] information/ApiListener: New client connection from [xx.xx.xx.xx]:38696 (no client certificate)
[2019-01-18 15:30:34 +0000] information/HttpServerConnection: Request: GET /v1/objects/services (from [xx.xx.xx.xx]:38696), user: svc-sh)
[New Thread 0x7ffff05a0700 (LWP 2153)]

Thread 3121 "icinga2" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff07e9700 (LWP 2151)]
0x000055555689837c in icinga::intrusive_ptr_release (object=0x7fff88021b80) at /root/icinga2/lib/base/object.cpp:284
284			delete object;
(gdb) print *object
$1 = {_vptr.Object = 0x0, static TypeInstance = {px = 0x555557a27c20}, m_References = 0, m_Mutex = 93825040458112, m_LockOwner = 0, m_LockCount = 0}
(gdb) print refs
$2 = 0
(gdb) bt
#0  0x000055555689837c in icinga::intrusive_ptr_release (object=0x7fff88021b80) at /root/icinga2/lib/base/object.cpp:284
#1  0x00005555567ba0cf in boost::intrusive_ptr<icinga::Object>::~intrusive_ptr (this=0x7ffff07e75b0, __in_chrg=<optimized out>)
    at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:98
#2  0x00005555568d06df in boost::intrusive_ptr<icinga::Object>::operator= (this=0x7fffe00844d0, rhs=...)
    at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:154
#3  0x000055555694ca79 in boost::detail::variant::assign_storage::internal_visit<boost::intrusive_ptr<icinga::Object> > (this=0x7ffff07e77b0,
    lhs_content=...) at /usr/include/boost/variant/variant.hpp:564
#4  0x000055555694b2b1 in boost::detail::variant::visitation_impl_invoke_impl<boost::detail::variant::assign_storage, void*, boost::intrusive_ptr<icinga::Object> > (visitor=..., storage=0x7fffe00844d0) at /usr/include/boost/variant/detail/visitation_impl.hpp:114
#5  0x0000555556949239 in boost::detail::variant::visitation_impl_invoke<boost::detail::variant::assign_storage, void*, boost::intrusive_ptr<icinga::Object>, boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::has_fallback_type_> (internal_which=4,
    visitor=..., storage=0x7fffe00844d0, t=0x0) at /usr/include/boost/variant/detail/visitation_impl.hpp:154
#6  0x0000555556947468 in boost::detail::variant::visitation_impl<mpl_::int_<0>, boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<5l>, boost::blank, boost::mpl::l_item<mpl_::long_<4l>, double, boost::mpl::l_item<mpl_::long_<3l>, bool, boost::mpl::l_item<mpl_::long_<2l>, icinga::String, boost::mpl::l_item<mpl_::long_<1l>, boost::intrusive_ptr<icinga::Object>, boost::mpl::l_end> > > > > >, boost::mpl::l_iter<boost::mpl::l_end> >, boost::detail::variant::assign_storage, void*, boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::has_fallback_type_> (internal_which=4, logical_which=4, visitor=..., storage=0x7fffe00844d0, no_backup_flag=...)
    at /usr/include/boost/variant/detail/visitation_impl.hpp:238
#7  0x0000555556945dc4 in boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::internal_apply_visitor_impl<boost::detail::variant::assign_storage, void*> (internal_which=4, logical_which=4, visitor=..., storage=0x7fffe00844d0)
    at /usr/include/boost/variant/variant.hpp:2392
#8  0x00005555569456a0 in boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::internal_apply_visitor<boost::detail::variant::assign_storage> (this=0x7fffe00844c8, visitor=...) at /usr/include/boost/variant/variant.hpp:2406
#9  0x0000555556944b68 in boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::variant_assign (
    this=0x7fffe00844c8, rhs=...) at /usr/include/boost/variant/variant.hpp:2115
#10 0x0000555556944c8b in boost::variant<boost::blank, double, bool, icinga::String, boost::intrusive_ptr<icinga::Object> >::operator= (
    this=0x7fffe00844c8, rhs=...) at /usr/include/boost/variant/variant.hpp:2218
#11 0x0000555556943f35 in icinga::Value::operator= (this=0x7fffe00844c8, other=...) at /root/icinga2/lib/base/value.cpp:107
#12 0x000055555688dd1a in icinga::EmbeddedNamespaceValue::Set (this=0x7fffe00844c0, value=...) at /root/icinga2/lib/base/namespace.cpp:151
#13 0x000055555688da2e in icinga::Namespace::SetFieldByName (this=0x5555579f5980, field=..., value=..., overrideFrozen=false, debugInfo=...)
    at /root/icinga2/lib/base/namespace.cpp:121
#14 0x000055555688d488 in icinga::Namespace::Set (this=0x5555579f5980, field=..., value=..., overrideFrozen=false)
    at /root/icinga2/lib/base/namespace.cpp:59
#15 0x0000555556a927f2 in icinga::FilterUtility::EvaluateFilter (frame=..., filter=0x7fffec034690, target=..., variableName=...)
    at /root/icinga2/lib/remote/filterutility.cpp:123
#16 0x0000555556a92a94 in FilteredAddTarget (permissionFrame=..., permissionFilter=0x7fffec034690, frame=..., ufilter=0x0,
    result=std::vector of length 353, capacity 512 = {...}, variableName=..., target=...) at /root/icinga2/lib/remote/filterutility.cpp:134
#17 0x0000555556a98b5d in std::__invoke_impl<void, void (*&)(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&), icinga::ScriptFrame&, icinga::Expression*&, icinga::ScriptFrame&, decltype(nullptr)&, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String&, icinga::Value const&>(std::__invoke_other, void (*&)(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&), icinga::ScriptFrame&, icinga::Expression*&, icinga::ScriptFrame&, decltype(nullptr)&, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String&, icinga::Value const&) (
    __f=@0x7fffec079df0: 0x555556a92a59 <FilteredAddTarget(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, icinga::Object::Ptr const&)>, __args#0=...,
    __args#1=@0x7fffec079e30: 0x7fffec034690, __args#2=..., __args#3=<error reading variable>, __args#4=std::vector of length 353, capacity 512 = {...},
    __args#5=..., __args#6=...) at /usr/include/c++/7/bits/invoke.h:60
#18 0x0000555556a985ff in std::__invoke<void (*&)(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&), icinga::ScriptFrame&, icinga::Expression*&, icinga::ScriptFrame&, decltype(nullptr)&, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String&, icinga::Value const&>(void (*&)(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&), icinga::ScriptFrame&, icinga::Expression*&, icinga::ScriptFrame&, decltype(nullptr)&, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String&, icinga::Value const&) (
    __fn=@0x7fffec079df0: 0x555556a92a59 <FilteredAddTarget(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, icinga::Object::Ptr const&)>, __args#0=...,
    __args#1=@0x7fffec079e30: 0x7fffec034690, __args#2=..., __args#3=<error reading variable>, __args#4=std::vector of length 353, capacity 512 = {...},
    __args#5=..., __args#6=...) at /usr/include/c++/7/bits/invoke.h:95
#19 0x0000555556a97fe0 in std::_Bind<void (*(std::reference_wrapper<icinga::ScriptFrame>, icinga::Expression*, std::reference_wrapper<icinga::ScriptFrame>, decltype(nullptr), std::reference_wrapper<std::vector<icinga::Value, std::allocator<icinga::Value> > >, icinga::String, std::_Placeholder<1>))(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&)>::__call<void, icinga::Value const&, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul>(std::tuple<icinga::Value const&>&&, std::_Index_tuple<0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul>) (this=0x7fffec079df0, __args=...) at /usr/include/c++/7/functional:467
#20 0x0000555556a978d2 in std::_Bind<void (*(std::reference_wrapper<icinga::ScriptFrame>, icinga::Expression*, std::reference_wrapper<icinga::ScriptFrame>, decltype(nullptr), std::reference_wrapper<std::vector<icinga::Value, std::allocator<icinga::Value> > >, icinga::String, std::_Placeholder<1>))(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&)>::operator()<icinga::Value const&, void>(icinga::Value const&) (this=0x7fffec079df0, __args#0=...)
    at /usr/include/c++/7/functional:551
#21 0x0000555556a96fe5 in std::_Function_handler<void (icinga::Value const&), std::_Bind<void (*(std::reference_wrapper<icinga::ScriptFrame>, icinga::Expression*, std::reference_wrapper<icinga::ScriptFrame>, decltype(nullptr), std::reference_wrapper<std::vector<icinga::Value, std::allocator<icinga::Value> > >, icinga::String, std::_Placeholder<1>))(icinga::ScriptFrame&, icinga::Expression*, icinga::ScriptFrame&, icinga::Expression*, std::vector<icinga::Value, std::allocator<icinga::Value> >&, icinga::String const&, boost::intrusive_ptr<icinga::Object> const&)> >::_M_invoke(std::_Any_data const&, icinga::Value const&) (__functor=..., __args#0=...) at /usr/include/c++/7/bits/std_function.h:316
#22 0x0000555556a95157 in std::function<void (icinga::Value const&)>::operator()(icinga::Value const&) const (this=0x7ffff07e7f70, __args#0=...)
    at /usr/include/c++/7/bits/std_function.h:706
#23 0x0000555556a92128 in icinga::ConfigObjectTargetProvider::FindTargets(icinga::String const&, std::function<void (icinga::Value const&)> const&) const
    (this=0x7fffec02b140, type=..., addTarget=...) at /root/icinga2/lib/remote/filterutility.cpp:55
#24 0x0000555556a94779 in icinga::FilterUtility::GetFilterTargets (qd=..., query=..., user=..., variableName=...)
    at /root/icinga2/lib/remote/filterutility.cpp:283
#25 0x0000555556abc183 in icinga::ObjectQueryHandler::HandleRequest (this=0x555557a5b230, user=..., request=..., response=..., params=...)
    at /root/icinga2/lib/remote/objectqueryhandler.cpp:165
#26 0x0000555556a9f017 in icinga::HttpHandler::ProcessRequest (user=..., request=..., response=...) at /root/icinga2/lib/remote/httphandler.cpp:109
#27 0x0000555556aa89e8 in icinga::HttpServerConnection::ProcessMessageAsync (this=0x7ffee40092c0, request=..., response=..., user=...)
    at /root/icinga2/lib/remote/httpserverconnection.cpp:338
#28 0x0000555556aac32a in std::__invoke_impl<void, void (icinga::HttpServerConnection::*&)(icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser> const&), boost::intrusive_ptr<icinga::HttpServerConnection>&, icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser>&> (__f=
    @0x7fffec084200: (void (icinga::HttpServerConnection::*)(icinga::HttpServerConnection * const, icinga::HttpRequest &, icinga::HttpResponse &, const boost::intrusive_ptr<icinga::ApiUser> &)) 0x555556aa8978 <icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser> const&)>, __t=..., __args#0=..., __args#1=..., __args#2=...) at /usr/include/c++/7/bits/invoke.h:73
#29 0x0000555556aabf8c in std::__invoke<void (icinga::HttpServerConnection::*&)(icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser> const&), boost::intrusive_ptr<icinga::HttpServerConnection>&, icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser>&> (__fn=
    @0x7fffec084200: (void (icinga::HttpServerConnection::*)(icinga::HttpServerConnection * const, icinga::HttpRequest &, icinga::HttpResponse &, const boost::intrusive_ptr<icinga::ApiUser> &)) 0x555556aa8978 <icinga::HttpServerConnection::ProcessMessageAsync(icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser> const&)>, __args#0=..., __args#1=..., __args#2=..., __args#3=...) at /usr/include/c++/7/bits/invoke.h:95
#30 0x0000555556aabb1d in std::_Bind<void (icinga::HttpServerConnection::*(boost::intrusive_ptr<icinga::HttpServerConnection>, icinga::HttpRequest, icinga::HttpResponse, boost::intrusive_ptr<icinga::ApiUser>))(icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser> const&)>::__call<void, , 0ul, 1ul, 2ul, 3ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul, 2ul, 3ul>) (this=0x7fffec084200, __args=...)
    at /usr/include/c++/7/functional:467
#31 0x0000555556aab09d in std::_Bind<void (icinga::HttpServerConnection::*(boost::intrusive_ptr<icinga::HttpServerConnection>, icinga::HttpRequest, icinga::HttpResponse, boost::intrusive_ptr<icinga::ApiUser>))(icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser> const&)>::operator()<, void>() (this=0x7fffec084200) at /usr/include/c++/7/functional:551
#32 0x0000555556aaa63f in std::_Function_handler<void (), std::_Bind<void (icinga::HttpServerConnection::*(boost::intrusive_ptr<icinga::HttpServerConnection>, icinga::HttpRequest, icinga::HttpResponse, boost::intrusive_ptr<icinga::ApiUser>))(icinga::HttpRequest&, icinga::HttpResponse&, boost::intrusive_ptr<icinga::ApiUser> const&)> >::_M_invoke(std::_Any_data const&) (__functor=...) at /usr/include/c++/7/bits/std_function.h:316
#33 0x000055555687564a in std::function<void ()>::operator()() const (this=0x7ffff07e8bb0) at /usr/include/c++/7/bits/std_function.h:706
#34 0x0000555556961a7a in icinga::WorkQueue::RunTaskFunction(std::function<void ()> const&) (this=0x7ffee4009398, func=...)
    at /root/icinga2/lib/base/workqueue.cpp:253
#35 0x0000555556961e51 in icinga::WorkQueue::WorkerThreadProc (this=0x7ffee4009398) at /root/icinga2/lib/base/workqueue.cpp:296
#36 0x000055555696772e in std::__invoke_impl<void, void (icinga::WorkQueue::*&)(), icinga::WorkQueue*&> (
    __f=@0x7ffee4009de8: (void (icinga::WorkQueue::*)(icinga::WorkQueue * const)) 0x555556961bbe <icinga::WorkQueue::WorkerThreadProc()>,
    __t=@0x7ffee4009df8: 0x7ffee4009398) at /usr/include/c++/7/bits/invoke.h:73
#37 0x000055555696765e in std::__invoke<void (icinga::WorkQueue::*&)(), icinga::WorkQueue*&> (
    __fn=@0x7ffee4009de8: (void (icinga::WorkQueue::*)(icinga::WorkQueue * const)) 0x555556961bbe <icinga::WorkQueue::WorkerThreadProc()>,
    __args#0=@0x7ffee4009df8: 0x7ffee4009398) at /usr/include/c++/7/bits/invoke.h:95
#38 0x0000555556967a63 in std::_Bind<void (icinga::WorkQueue::*(icinga::WorkQueue*))()>::__call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) (
    this=0x7ffee4009de8, __args=...) at /usr/include/c++/7/functional:467
#39 0x00005555569679d1 in std::_Bind<void (icinga::WorkQueue::*(icinga::WorkQueue*))()>::operator()<, void>() (this=0x7ffee4009de8)
    at /usr/include/c++/7/functional:551
#40 0x0000555556967848 in boost::detail::thread_data<std::_Bind<void (icinga::WorkQueue::*(icinga::WorkQueue*))()> >::run() (this=0x7ffee4009c30)
    at /usr/include/boost/thread/detail/thread.hpp:116
#41 0x00007ffff79bdbcd in ?? () from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.65.1
#42 0x00007ffff70066db in start_thread (arg=0x7ffff07e9700) at pthread_create.c:463
#43 0x00007ffff58cd88f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Attached is the full thread backtrace
icinga_bt_all_threads.txt

@dnsmichi
Contributor

Thanks, that helps. It should never happen that both intrusive_ptrs decide to release the object, though. Namespaces were added recently, so this is either a programming error somewhere or a compiler optimization.

34de810 plays a role here, fixing a filter regression. Next week, my schedule is full but I'll try to look into it afterwards.

@hansmi

hansmi commented Feb 4, 2019

@dnsmichi, do you have any update on this issue?

@dnsmichi
Contributor

dnsmichi commented Feb 4, 2019

No, unfortunately not. I am dealing with other customer issues this week.

@lippserd lippserd added the blocker Blocks a release or needs immediate attention label Feb 6, 2019
@lippserd
Member

lippserd commented Feb 6, 2019

We're a little short on resources at the moment, but we'll try to help you as soon as possible.

@dnsmichi
Contributor

This would be totally ok, but for some reason the object already is deleted.

[Switching to Thread 0x7ffff07e9700 (LWP 2151)]
0x000055555689837c in icinga::intrusive_ptr_release (object=0x7fff88021b80) at /root/icinga2/lib/base/object.cpp:284
284			delete object;
(gdb) print *object
$1 = {_vptr.Object = 0x0, static TypeInstance = {px = 0x555557a27c20}, m_References = 0, m_Mutex = 93825040458112, m_LockOwner = 0, m_LockCount = 0}
(gdb) print refs
$2 = 0

This originates from the namespace introduction in 2.10 in 34de810, where the frame was changed from a dictionary to a namespace. It may also be influenced by the changes from 7f7e81d.

I'm wondering whether frame.Self.IsEmpty() translates to the correct behaviour here. If you can reproduce the problem easily, step up in the stack trace and print the content of frame.Self if possible, and check which if branch it hit inside EvaluateFilter().

The missing frame.Self assignment in FilterUtility::GetFilterTargets in the else branch shouldn't matter, since EvaluateFilter() will create a new Namespace object on its own. But maybe the culprit lies exactly there.

I'm off for now and will get back to this in CW11 at the earliest.

Cheers,
Michael

@hrak

hrak commented Feb 28, 2019

It's fairly easy for me to reproduce, so whatever info you need, let me know.

(gdb) up
#15 0x0000555556a927f2 in icinga::FilterUtility::EvaluateFilter (frame=..., filter=0x7ffeb8007600, target=..., variableName=...)
    at /root/icinga2/lib/remote/filterutility.cpp:123
123				frameNS->Set(field.NavigationName, joinedObj);
(gdb) print frame.Self
$1 = {m_Value = {which_ = 4, storage_ = {<boost::detail::aligned_storage::aligned_storage_imp<32, 8>> = {data_ = {
          buf = "\200Y\237WUU\000\000\310\324\177\237\377\177\000\000\320\b\001\\\377\177\000\000Hm\000\270\376\177\000",
          align_ = {<No data fields>}}}, static size = <optimized out>, static alignment = <optimized out>}}}
(gdb) print frameNS
$2 = {px = 0x5555579f5980}
(gdb) print frame
$3 = (icinga::ScriptFrame &) @0x7fff9f7fcf90: {Locals = {px = 0x7ffeb8007210}, Self = {m_Value = {which_ = 4,
      storage_ = {<boost::detail::aligned_storage::aligned_storage_imp<32, 8>> = {data_ = {
            buf = "\200Y\237WUU\000\000\310\324\177\237\377\177\000\000\320\b\001\\\377\177\000\000Hm\000\270\376\177\000",
            align_ = {<No data fields>}}}, static size = <optimized out>, static alignment = <optimized out>}}}, Sandboxed = false, Depth = 0,
  static m_ScriptFrames = {cleanup = {px = 0x5555579f58f0, pn = {pi_ = 0x5555579f5910}}}}
(gdb) print &frame.Self
$4 = (icinga::Value *) 0x7fff9f7fcf98
(gdb) print &frameNS
$5 = (icinga::Namespace::Ptr *) 0x7fff9f7fc990
(gdb) info f
Stack level 15, frame at 0x7fff9f7fca70:
 rip = 0x555556a927f2 in icinga::FilterUtility::EvaluateFilter (/root/icinga2/lib/remote/filterutility.cpp:123); saved rip = 0x555556a92a94
 called by frame at 0x7fff9f7fcab0, caller of frame at 0x7fff9f7fc960
 source language c++.
 Arglist at 0x7fff9f7fca60, args: frame=..., filter=0x7ffeb8007600, target=..., variableName=...
 Locals at 0x7fff9f7fca60, Previous frame's sp is 0x7fff9f7fca70
 Saved registers:
  rbx at 0x7fff9f7fca50, rbp at 0x7fff9f7fca60, r12 at 0x7fff9f7fca58, rip at 0x7fff9f7fca68
(gdb) info locals
field = {ID = 2, TypeName = 0x555557145926 "String", Name = 0x5555571456f0 "check_command", NavigationName = 0x5555571456f0 "check_command",
  RefTypeName = 0x55555714592d "CheckCommand", Attributes = 770, ArrayRank = 0}
joinedObj = {px = 0x7fffd0010280}
fid = 21
type = {px = 0x555557a38590}
varName = {static NPos = 18446744073709551615, m_Data = "service"}
frameNS = {px = 0x5555579f5980}

@Elias481
Contributor

Elias481 commented Mar 3, 2019

I assume this is related to the same root causes as #6785.

  1. The global namespace is used here for both ScriptFrames (permissions and filters). Currently it is used for permissions on all requests, which causes the problems. For filters it is only used when there are no filters, which might be sufficient. (In any case, it does not make sense to allocate an empty, unused namespace for the filter when there is no filter; I'm not sure whether the compiler will optimize this away properly.)
    This leads to 2., or possibly to the errors reported in "API user permissions sometimes not applied" (#6785), as the checks store values with the same key in the same namespace (FilterUtility::EvaluateFilter).
    This is easy to fix in
    ScriptFrame permissionFrame(true);
    ... Just write something like ScriptFrame permissionFrame(true, new Namespace()); there.
  2. It looks like the Namespace objects are not completely thread-safe, which can cause double-free attempts that lead to the segfaults reported here and in "API user permissions sometimes not applied" (#6785). The use of __sync_sub_and_fetch for reference counting should normally avoid double-frees, but if a __sync_add_and_fetch from another thread is queued at the same time and runs in between these operations, the ref count can drop to zero multiple times, causing delete to be triggered multiple times. In general, all non-read-only access to shared variables should be protected somehow, which is the case only for the list of values, not for the Namespace values themselves. So it's just not thread-safe enough for threaded, non-read-only access by multiple writers that not only add new values but also change them (though write-only access could work without visible errors).
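
To make point 2 more concrete, here is a minimal sketch that replays one possible interleaving step by step in a single thread, so it is deterministic and never touches freed memory. The names Counted, AddRef, Release and both Replay functions are hypothetical and only model the idea; they are not Icinga's Object or intrusive_ptr code, and std::atomic stands in for the __sync builtins.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a reference-counted object (NOT icinga::Object).
struct Counted {
    std::atomic<int> refs{0};
    int deletes{0}; // counts how often "delete" would have been called
};

void AddRef(Counted& o) { o.refs.fetch_add(1); }

void Release(Counted& o) {
    // fetch_sub returns the previous value; 1 means the count just hit zero.
    if (o.refs.fetch_sub(1) == 1)
        ++o.deletes; // a real implementation would `delete` the object here
}

// A reader and a writer share an unguarded smart-pointer slot.
// Each atomic operation is fine on its own; the sequence is not.
int ReplayUnguardedInterleaving() {
    Counted obj;
    AddRef(obj);            // the shared slot owns one reference
    Counted* slot = &obj;

    Counted* seen = slot;   // reader: loads the raw pointer from the slot
    slot = nullptr;         // writer: overwrites the slot ...
    Release(obj);           // ... and drops the old reference: 1 -> 0, first delete
    AddRef(*seen);          // reader: 0 -> 1 on an object that is already gone
    Release(*seen);         // reader: 1 -> 0 again, second delete
    return obj.deletes;
}

// Same actors, but the reader's load and AddRef happen as one unit,
// as they would under a mutex that guards the slot.
int ReplayGuardedInterleaving() {
    Counted obj;
    AddRef(obj);
    Counted* slot = &obj;

    Counted* seen = slot;   // reader, under the lock: load ...
    AddRef(*seen);          // ... and take a reference: 1 -> 2
    slot = nullptr;         // writer, under the lock: overwrite ...
    Release(obj);           // ... and drop the old reference: 2 -> 1
    Release(*seen);         // reader is done: 1 -> 0, single delete
    return obj.deletes;
}

// Small self-check of the two replays.
void CheckReplays() {
    assert(ReplayUnguardedInterleaving() == 2);
    assert(ReplayGuardedInterleaving() == 1);
}
```

The unguarded replay ends with two deletes even though every counter update is atomic, which matches the m_References = 0 state seen in the gdb output. Whether this exact interleaving is what happens in the reported crash is an assumption.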

@Al2Klimov
Member

Hello @hrak,

we need all of it: which OS(es), what the zone tree looks like, all of your monitoring objects (per zone), all users (you seem to have already posted those) and the queries you're firing against the API.

Best,
AK

@Al2Klimov Al2Klimov self-assigned this Mar 11, 2019
@Al2Klimov Al2Klimov added the needs feedback We'll only proceed once we hear from you again label Mar 11, 2019
@Al2Klimov
Member

Al2Klimov commented Mar 11, 2019

If there's a lot to set up, I also accept a Dockerfile/docker-compose.yml (I scrolled over your public projects) which accepts self-built Icinga 2 packages.

@dnsmichi dnsmichi added the needs feedback We'll only proceed once we hear from you again label Apr 25, 2019
@dnsmichi
Contributor

ref/NC/602611

@dnsmichi
Contributor

ref/IP/13853

@marcofl

marcofl commented Apr 30, 2019

We're currently testing the snapshot 2.10.4+626 and are experiencing strange issues with API permission filters which basically block us from testing the snapshot version against our actual use-case (querying a lot of objects and creating downtimes via API):

object Host "simplehost_vshn" {
  import "generic-host"
  vars.categories = [ "openshift", ]
  display_name = "Servicehost: simplehost_vshn"
  check_command = "dummy"
}
object HostGroup "category_openshift"  {
  assign where host.vars.categories &&
host.vars.categories.contains("openshift")
}
object ApiUser "blah"  {
  password = "welcome"
  permissions = [
  {
    permission = "objects/query/*"
    filter = {{ host.groups && "category_openshift" in host.groups }}
  },
  {
    permission = "actions/schedule-downtime"
    filter = {{ host.groups && "category_openshift" in host.groups }}
  },
  {
    permission = "actions/remove-downtime"
    filter = {{ host.groups && "category_openshift" in host.groups }}
  }, ]
}

With 2.10.4 we get the host object(s) which match the filter, but with v2.10.4-626-g26df2cc4b we get:

# curl -k -H 'Accept: application/json' 'https://blah:welcome@localhost:5665/v1/objects/hosts?verbose=1&pretty=1'
{
    "diagnostic_information": "Error: Error while evaluating expression: Tried to access undefined script variable 'host'\nLocation: in /etc/icinga2/objects/apiusers.conf: 13:17-13:20\n/etc/icinga2/objects/apiusers.conf(11):   {\n/etc/icinga2/objects/apiusers.conf(12):     permission = \"objects/query/*\"\n/etc/icinga2/objects/apiusers.conf(13):     filter = {{ host.groups && \"category_openshift\" in host.groups }}\n                                                        ^^^^\n/etc/icinga2/objects/apiusers.conf(14):   },\n/etc/icinga2/objects/apiusers.conf(15):   {\n",
    "error": 404.0,
    "status": "No objects found."
}

It looks like the host variable does not exist in the scope of the filters.

@dnsmichi
Contributor

I'm already debugging it.

@dnsmichi dnsmichi self-assigned this Apr 30, 2019
@dnsmichi
Contributor

[2019-04-30 17:12:50 +0200] information/ApiListener: New client connection from [::1]:64857 (no client certificate)
[2019-04-30 17:12:50 +0200] information/HttpServerConnection: Request: GET /v1/objects/hosts (from [::1]:64857), user: blah, agent: curl/7.54.0).
[2019-04-30 17:12:50 +0200] warning/FilterUtility: Updating frame NS with 'obj': Object of type 'Host' and 'host': Object of type 'Host'
[2019-04-30 17:12:50 +0200] warning/FilterUtility: Frame NS: {"check_command":{"type":"CheckCommand","version":0.0},"check_period":null,"command_endpoint":null,"event_command":null,"host":{"acknowledgement":0.0,"acknowledgement_expiry":0.0,"check_attempt":1.0,"flapping":false,"flapping_buffer":0.0,"flapping_current":0.0,"flapping_index":5.0,"flapping_last_change":0.0,"force_next_check":false,"force_next_notification":false,"last_check_result":{"active":true,"check_source":"mbpmif.int.netways.de","command":null,"execution_end":1556637167.309014,"execution_start":1556637167.309014,"exit_status":0.0,"output":"Check was successful.","performance_data":[],"schedule_end":1556637167.309031,"schedule_start":1556637167.3071418,"state":0.0,"ttl":0.0,"type":"CheckResult","vars_after":{"attempt":1.0,"reachable":true,"state":0.0,"state_type":1.0},"vars_before":{"attempt":1.0,"reachable":true,"state":0.0,"state_type":1.0}},"last_hard_state_change":1556279557.633899,"last_hard_state_raw":0.0,"last_reachable":true,"last_state_change":1556279557.633899,"last_state_down":0.0,"last_state_raw":0.0,"last_state_type":1.0,"last_state_unreachable":0.0,"last_state_up":1556637167.309057,"next_check":1556637466.659129,"previous_state_change":1556279557.633899,"state_raw":0.0,"state_type":1.0,"type":"Host","version":0.0},"obj":{"acknowledgement":0.0,"acknowledgement_expiry":0.0,"check_attempt":1.0,"flapping":false,"flapping_buffer":0.0,"flapping_current":0.0,"flapping_index":5.0,"flapping_last_change":0.0,"force_next_check":false,"force_next_notification":false,"last_check_result":{"active":true,"check_source":"mbpmif.int.netways.de","command":null,"execution_end":1556637167.309014,"execution_start":1556637167.309014,"exit_status":0.0,"output":"Check was 
successful.","performance_data":[],"schedule_end":1556637167.309031,"schedule_start":1556637167.3071418,"state":0.0,"ttl":0.0,"type":"CheckResult","vars_after":{"attempt":1.0,"reachable":true,"state":0.0,"state_type":1.0},"vars_before":{"attempt":1.0,"reachable":true,"state":0.0,"state_type":1.0}},"last_hard_state_change":1556279557.633899,"last_hard_state_raw":0.0,"last_reachable":true,"last_state_change":1556279557.633899,"last_state_down":0.0,"last_state_raw":0.0,"last_state_type":1.0,"last_state_unreachable":0.0,"last_state_up":1556637167.309057,"next_check":1556637466.659129,"previous_state_change":1556279557.633899,"state_raw":0.0,"state_type":1.0,"type":"Host","version":0.0}}
[2019-04-30 17:12:50 +0200] information/HttpServerConnection: HTTP client disconnected (from [::1]:64857)

host is set but filter->Evaluate() is unable to detect this via frame.Self and VariableExpression::Evaluate().

@Elias481
Contributor

Elias481 commented May 2, 2019

Ah, I see, I should have tested the patch a bit more thoroughly.
The reason is that the permissions function runs in LocalScope, but locals are not used here.
Changing that to ThisScope makes it work again. (The function then gets the prepared Self namespace for execution instead of the empty Locals dictionary of the permissions frame.)
Before the patch "use dedicated namespace for permissions frame", the VariableExpression evaluated from within the function could find the host via the fallback to GlobalScope implemented in VariableExpression::DoEvaluate.
I'll attach another PR to fix this, but I'm not 100% certain about its possible implications as I'm still missing the big picture.
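
The scope behaviour described above can be sketched roughly as follows. This is a heavily simplified, hypothetical model: Vars, Frame, Scope, Find and Lookup are made-up names and are not Icinga's ScriptFrame, Namespace or VariableExpression code. It only assumes what the comment states: a local-scope lookup checks the locals and falls back to the globals, while a this-scope lookup checks the namespace bound to the frame.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model types; NOT Icinga's ScriptFrame or Namespace classes.
using Vars = std::map<std::string, std::string>;

struct Frame {
    Vars locals; // function-local variables (empty for a permission filter)
    Vars* self;  // namespace the caller prepared for the filter
};

enum class Scope { Local, This };

const std::string* Find(const Vars& vars, const std::string& name) {
    auto it = vars.find(name);
    return it == vars.end() ? nullptr : &it->second;
}

// Local scope: look in the locals, then fall back to the globals.
// This scope: look in the namespace bound to the frame.
const std::string* Lookup(const Frame& frame, const Vars& globals,
                          Scope scope, const std::string& name) {
    if (scope == Scope::This)
        return Find(*frame.self, name);
    if (const std::string* hit = Find(frame.locals, name))
        return hit;
    return Find(globals, name);
}

// Before the patch: the frame's namespace IS the global namespace, so the
// global fallback of a local-scope lookup happens to find "host".
bool LocalLookupWithSharedGlobals() {
    Vars globals;
    Frame frame{{}, &globals};
    (*frame.self)["host"] = "simplehost_vshn";
    return Lookup(frame, globals, Scope::Local, "host") != nullptr;
}

// After the patch: "host" lives in a dedicated namespace, which only a
// this-scope lookup consults.
bool LookupWithDedicatedNamespace(Scope scope) {
    Vars globals;
    Vars dedicated;
    Frame frame{{}, &dedicated};
    (*frame.self)["host"] = "simplehost_vshn";
    return Lookup(frame, globals, scope, "host") != nullptr;
}

// Small self-check of the three cases.
void CheckSketch() {
    assert(LocalLookupWithSharedGlobals());
    assert(!LookupWithDedicatedNamespace(Scope::Local));
    assert(LookupWithDedicatedNamespace(Scope::This));
}
```

In this model the failing middle case corresponds to the "Tried to access undefined script variable 'host'" error from the snapshot, and the last case corresponds to the switch to ThisScope.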

@dnsmichi
Contributor

dnsmichi commented May 3, 2019

Thanks again. I'm not yet convinced that only the this scope is required here, will update later when I've had a peek into the code again.

@dnsmichi
Contributor

dnsmichi commented May 3, 2019

Here's my analysis, the PR is sane and puts everything into the current scope wherever needed.

#7155 (comment)

Snapshot packages will be available during the night.

@dnsmichi dnsmichi pinned this issue May 6, 2019
@hrak

hrak commented May 8, 2019

I have just tested the snapshot packages 2.10.4+649.g736e0806d.2019.05.07+1.bionic-0_amd64, and they seem to fix the crashing issue and the issue where we would sometimes see services for a different team on our dashboards (as in: filters not being applied correctly).

I had to roll back to 2.9.2 again, however, because triggering a deployment in director would hang the master API. The master process seems to keep doing all its other tasks, but calls to the master API hang indefinitely (e.g. clicking on 'deployments' in icingaweb would result in a gateway timeout). I tested a curl to the master like curl -s -X GET -H 'Accept: application/json' -k -u 'ApiRoot':'password' 'https://master01.mon01.example.com:5665/v1/config/packages' and waited for minutes without a response.

@dnsmichi
Contributor

dnsmichi commented May 8, 2019

Thanks for noticing, I see that too after pulling the snapshot packages in Vagrant. Will investigate, I have an idea already.

@dnsmichi
Contributor

dnsmichi commented May 8, 2019

[root@icinga2 ~]# debuginfo-install boost169-chrono-1.69.0-1.el7.x86_64 boost169-context-1.69.0-1.el7.x86_64 boost169-coroutine-1.69.0-1.el7.x86_64 boost169-date-time-1.69.0-1.el7.x86_64 boost169-filesystem-1.69.0-1.el7.x86_64 boost169-program-options-1.69.0-1.el7.x86_64 boost169-regex-1.69.0-1.el7.x86_64 boost169-system-1.69.0-1.el7.x86_64 boost169-thread-1.69.0-1.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.172-2.el7.x86_64 elfutils-libs-0.172-2.el7.x86_64 glibc-2.17-260.el7_6.3.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-34.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-13.el7.x86_64 libedit-3.0-12.20121213cvs.el7.x86_64 libgcc-4.8.5-36.el7_6.1.x86_64 libgcrypt-1.5.3-14.el7.x86_64 libgpg-error-1.12-3.el7.x86_64 libicu-50.1.2-17.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libstdc++-4.8.5-36.el7_6.1.x86_64 lz4-1.7.5-2.el7.x86_64 ncurses-libs-5.9-14.20130511.el7_4.x86_64 openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64 systemd-libs-219-62.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64

[root@icinga2 ~]# pidof icinga2
22813 22792
[root@icinga2 ~]# gdb -p 22792

(gdb) info thr
  Id   Target Id         Frame
  19   Thread 0x7fb1a2f58700 (LWP 22809) "icinga2" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  18   Thread 0x7fb1a2f99700 (LWP 22810) "icinga2" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  17   Thread 0x7fb1a2fda700 (LWP 22811) "icinga2" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  16   Thread 0x7fb1a2f17700 (LWP 22812) "icinga2" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  15   Thread 0x7fb1a2ed6700 (LWP 22814) "icinga2" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  14   Thread 0x7fb1a2e95700 (LWP 22819) "icinga2" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  13   Thread 0x7fb1a2e54700 (LWP 22820) "icinga2" 0x00007fb19ffaf96a in timerfd_settime () at ../sysdeps/unix/syscall-template.S:81
  12   Thread 0x7fb19aadc700 (LWP 22821) "icinga2" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  11   Thread 0x7fb19aa9b700 (LWP 22822) "icinga2" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  10   Thread 0x7fb19aa5a700 (LWP 22823) "icinga2" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  9    Thread 0x7fb19aa19700 (LWP 22824) "icinga2" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
  8    Thread 0x7fb19a9d8700 (LWP 22825) "icinga2" __strcmp_sse42 () at ../sysdeps/x86_64/multiarch/strcmp-sse42.S:1721
  7    Thread 0x7fb19a997700 (LWP 22826) "icinga2" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  6    Thread 0x7fb19a956700 (LWP 22827) "icinga2" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  5    Thread 0x7fb19a915700 (LWP 22829) "icinga2" 0x00007fb19ffa420d in poll () at ../sysdeps/unix/syscall-template.S:81
  4    Thread 0x7fb19a8d4700 (LWP 22830) "icinga2" 0x00007fb19ffa420d in poll () at ../sysdeps/unix/syscall-template.S:81
  3    Thread 0x7fb19a893700 (LWP 22831) "icinga2" 0x00007fb19ffa420d in poll () at ../sysdeps/unix/syscall-template.S:81
  2    Thread 0x7fb19a852700 (LWP 22832) "icinga2" 0x00007fb19ffa420d in poll () at ../sysdeps/unix/syscall-template.S:81
* 1    Thread 0x7fb1a2fdc8c0 (LWP 22792) "icinga2" 0x00007fb19ff75e2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81


(gdb) thread apply all bt

Thread 19 (Thread 0x7fb1a2f58700 (LWP 22809)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fb1a0287dcb in _L_lock_883 () from /lib64/libpthread.so.0
#2  0x00007fb1a0287c98 in __GI___pthread_mutex_lock (mutex=0x1024320 <_ZZN6icinga20ConfigPackageUtility14GetStaticMutexEvE5mutex.372537.42964>) at ../nptl/pthread_mutex_lock.c:78
#3  0x00000000008a066f in pthread_mutex_lock (m=<optimized out>) at /usr/include/boost169/boost/thread/pthread/pthread_mutex_scoped_lock.hpp:57
#4  lock (this=<optimized out>) at /usr/include/boost169/boost/thread/pthread/mutex.hpp:67
#5  boost::unique_lock<boost::mutex>::lock() [clone .local.21613] (this=this@entry=0x7fb1a2f57110) at /usr/include/boost169/boost/thread/lock_types.hpp:346
#6  0x00000000009b139f in __base_ctor (this=0x7fb1a2f57110, m_=...) at /usr/include/boost169/boost/thread/lock_types.hpp:124
#7  icinga::ConfigPackageUtility::SetActiveStageToFile (packageName=..., stageName=...) at ../remote/configpackageutility.cpp:274
#8  0x000000000097121b in icinga::ConfigPackageUtility::SetActiveStage (packageName=..., stageName=...) at ../remote/configpackageutility.cpp:321
#9  0x0000000000971792 in icinga::ConfigPackageUtility::TryActivateStageCallback (pr=..., packageName=..., stageName=..., reload=<optimized out>) at ../remote/configpackageutility.cpp:169
#10 0x000000000090efd3 in bool icinga::ThreadPool::Post<std::function<void ()> >(std::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}::operator()() const [clone .local.19729] ()
    at ../base/threadpool.hpp:57
#11 0x0000000000c0478e in asio_handler_invoke (function=<optimized out>) at /usr/include/boost169/boost/asio/handler_invoke_hook.hpp:69
#12 invoke (function=<optimized out>, context=<optimized out>) at /usr/include/boost169/boost/asio/detail/handler_invoke_helpers.hpp:37
#13 _ZNK5boost4asio15system_executor8dispatchIZN6icinga10ThreadPool4PostISt8functionIFvvEEEEbT_NS3_15SchedulerPolicyEEUlvE_SaIvEEEvOS9_RKT0_.isra.3590 (
    f=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x182f4fa, DIE 0x1849ff4>) at /usr/include/boost169/boost/asio/impl/system_executor.hpp:39
#14 operator() (this=0x7fb1a2f57c10) at /usr/include/boost169/boost/asio/detail/work_dispatcher.hpp:58
#15 asio_handler_invoke (function=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x182f4fa, DIE 0x184a746>) at /usr/include/boost169/boost/asio/handler_invoke_hook.hpp:69
#16 invoke (function=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x182f4fa, DIE 0x184a746>,
    context=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x182f4fa, DIE 0x184a746>) at /usr/include/boost169/boost/asio/detail/handler_invoke_helpers.hpp:37
#17 boost::asio::detail::executor_op<boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::function<void ()> >(std::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>, std::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, std::allocator<void>*, boost::system::error_code const&, unsigned long) (owner=0x204edc0, base=<optimized out>)
    at /usr/include/boost169/boost/asio/detail/executor_op.hpp:70
#18 0x0000000000638631 in boost::asio::detail::scheduler::run(boost::system::error_code&) [clone .local.25264] (this=0x204edc0, ec=...)
    at /usr/include/boost169/boost/asio/detail/scheduler_operation.hpp:40
#19 0x0000000000638982 in boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run() [clone .local.25262] (this=0x204eee0)
    at /usr/include/boost169/boost/asio/impl/thread_pool.ipp:33
#20 0x00000000007fe0bf in boost::asio::detail::boost_asio_detail_posix_thread_function (arg=0x204eee0) at /usr/include/boost169/boost/asio/detail/impl/posix_thread.ipp:74
#21 0x00007fb1a0285dd5 in start_thread (arg=0x7fb1a2f58700) at pthread_create.c:307
#22 0x00007fb19ffaeead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 17 (Thread 0x7fb1a2fda700 (LWP 22811)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fb1a0287dcb in _L_lock_883 () from /lib64/libpthread.so.0
#2  0x00007fb1a0287c98 in __GI___pthread_mutex_lock (mutex=0x1024320 <_ZZN6icinga20ConfigPackageUtility14GetStaticMutexEvE5mutex.372537.42964>) at ../nptl/pthread_mutex_lock.c:78
#3  0x00000000008a066f in pthread_mutex_lock (m=<optimized out>) at /usr/include/boost169/boost/thread/pthread/pthread_mutex_scoped_lock.hpp:57
#4  lock (this=<optimized out>) at /usr/include/boost169/boost/thread/pthread/mutex.hpp:67
#5  boost::unique_lock<boost::mutex>::lock() [clone .local.21613] (this=0x7fb1a2fd9260) at /usr/include/boost169/boost/thread/lock_types.hpp:346
#6  0x0000000000961c2c in icinga::ConfigPackageUtility::GetActiveStageFromFile (packageName=...) at /usr/include/boost169/boost/thread/lock_types.hpp:124
#7  0x0000000000971dad in icinga::ApiListener::CheckApiPackageIntegrity (this=0x7fb1940041a0) at ../remote/apilistener.cpp:1578
#8  0x0000000000783b5c in boost::signals2::detail::signal_impl<void (icinga::Timer const* const&), boost::signals2::optional_last_value<void>, int, std::less<int>, boost::function<void (icinga::Timer const* const&)>, boost::function<void (boost::signals2::connection const&, icinga::Timer const* const&)>, boost::signals2::mutex>::operator()(icinga::Timer const* const&) [clone .local.19174] (
    this=0x20550c0, args#0=<optimized out>) at /usr/include/boost169/boost/function/function_template.hpp:764
#9  0x0000000000b71338 in icinga::Timer::Call (this=0x250ff00) at /usr/include/boost169/boost/signals2/detail/signal_template.hpp:722
#10 0x000000000090efd3 in bool icinga::ThreadPool::Post<std::function<void ()> >(std::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}::operator()() const [clone .local.19729] ()
    at ../base/threadpool.hpp:57
#11 0x0000000000c0478e in asio_handler_invoke (function=<optimized out>) at /usr/include/boost169/boost/asio/handler_invoke_hook.hpp:69
#12 invoke (function=<optimized out>, context=<optimized out>) at /usr/include/boost169/boost/asio/detail/handler_invoke_helpers.hpp:37
#13 _ZNK5boost4asio15system_executor8dispatchIZN6icinga10ThreadPool4PostISt8functionIFvvEEEEbT_NS3_15SchedulerPolicyEEUlvE_SaIvEEEvOS9_RKT0_.isra.3590 (
    f=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x182f4fa, DIE 0x1849ff4>) at /usr/include/boost169/boost/asio/impl/system_executor.hpp:39
#14 operator() (this=0x7fb1a2fd9c10) at /usr/include/boost169/boost/asio/detail/work_dispatcher.hpp:58
#15 asio_handler_invoke (function=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x182f4fa, DIE 0x184a746>) at /usr/include/boost169/boost/asio/handler_invoke_hook.hpp:69
#16 invoke (function=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x182f4fa, DIE 0x184a746>,
    context=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x182f4fa, DIE 0x184a746>) at /usr/include/boost169/boost/asio/detail/handler_invoke_helpers.hpp:37
#17 boost::asio::detail::executor_op<boost::asio::detail::work_dispatcher<bool icinga::ThreadPool::Post<std::function<void ()> >(std::function<void ()>, icinga::SchedulerPolicy)::{lambda()#1}>, std::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, std::allocator<void>*, boost::system::error_code const&, unsigned long) (owner=0x204edc0, base=<optimized out>)
    at /usr/include/boost169/boost/asio/detail/executor_op.hpp:70
#18 0x0000000000638631 in boost::asio::detail::scheduler::run(boost::system::error_code&) [clone .local.25264] (this=0x204edc0, ec=...)
    at /usr/include/boost169/boost/asio/detail/scheduler_operation.hpp:40
#19 0x0000000000638982 in boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run() [clone .local.25262] (this=0x204ef60)
    at /usr/include/boost169/boost/asio/impl/thread_pool.ipp:33
#20 0x00000000007fe0bf in boost::asio::detail::boost_asio_detail_posix_thread_function (arg=0x204ef60) at /usr/include/boost169/boost/asio/detail/impl/posix_thread.ipp:74
#21 0x00007fb1a0285dd5 in start_thread (arg=0x7fb1a2fda700) at pthread_create.c:307
#22 0x00007fb19ffaeead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111


Thread 12 (Thread 0x7fb19aadc700 (LWP 22821)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fb1a0287dcb in _L_lock_883 () from /lib64/libpthread.so.0
#2  0x00007fb1a0287c98 in __GI___pthread_mutex_lock (mutex=0x1024320 <_ZZN6icinga20ConfigPackageUtility14GetStaticMutexEvE5mutex.372537.42964>) at ../nptl/pthread_mutex_lock.c:78
#3  0x00000000008a066f in pthread_mutex_lock (m=<optimized out>) at /usr/include/boost169/boost/thread/pthread/pthread_mutex_scoped_lock.hpp:57
#4  lock (this=<optimized out>) at /usr/include/boost169/boost/thread/pthread/mutex.hpp:67
#5  boost::unique_lock<boost::mutex>::lock() [clone .local.21613] (this=0x7fb1780c10e0) at /usr/include/boost169/boost/thread/lock_types.hpp:346
#6  0x000000000096aee4 in icinga::ConfigPackagesHandler::HandleGet (this=<optimized out>, user=..., request=..., url=..., response=..., params=...)
    at /usr/include/boost169/boost/thread/lock_types.hpp:124
#7  0x00000000009a9012 in icinga::ConfigPackagesHandler::HandleRequest (this=<optimized out>, stream=..., user=..., request=..., url=..., response=..., params=..., yc=..., server=..., params=...,
    response=..., url=..., request=..., user=..., this=<optimized out>) at ../remote/configpackageshandler.cpp:30
#8  0x00000000009654c3 in icinga::HttpHandler::ProcessRequest (stream=..., user=..., request=..., response=..., yc=..., server=...) at ../remote/httphandler.cpp:102
#9  0x0000000000b27edd in ProcessRequest (yc=..., hasStartedStreaming=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x2087, DIE 0x20a1>, server=..., response=...,
    authenticatedUser=..., request=..., stream=...) at ../remote/httpserverconnection.cpp:416
#10 icinga::HttpServerConnection::ProcessMessages(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >) (this=0x7fb178056230, yc=...)
    at ../remote/httpserverconnection.cpp:514
#11 0x0000000000b2a263 in _ZZN6icinga20HttpServerConnection5StartEvENKUlN5boost4asio19basic_yield_contextINS2_15executor_binderIPFvvENS2_8executorEEEEEE_clES9_.isra.4791 (
    yc=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at ../remote/httpserverconnection.cpp:56
#12 operator() (ca=..., this=<optimized out>) at /usr/include/boost169/boost/asio/impl/spawn.hpp:382
#13 run (this=0x7fb1780c2140) at /usr/include/boost169/boost/coroutine/detail/push_coroutine_object.hpp:293
#14 boost::coroutines::detail::trampoline_push_void<boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<void>, void, boost::asio::detail::coro_entry_point<boost::asio::executor_binder<void (*)(), boost::asio::io_context::strand>, icinga::HttpServerConnection::Start()::{lambda(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >)#1}>&, boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits> > >(boost::context::detail::transfer_t) [clone .324392] (t=...)
    at /usr/include/boost169/boost/coroutine/detail/trampoline_push.hpp:70
#15 0x00007fb1a29d318f in make_fcontext () at make_x86_64_sysv_elf_gas.S:71
#16 0x0000000000c82b30 in vtable for boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<void>, void, boost::asio::detail::coro_entry_point<boost::asio::executor_binder<void (*)(), boost::asio::io_context::strand>, icinga::HttpServerConnection::Start()::{lambda(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >)#1}>&, boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits> > [clone .305817] ()
#17 0x00007fb100000026 in ?? ()
#18 0x0000000000000000 in ?? ()

Thread 11 (Thread 0x7fb19aa9b700 (LWP 22822)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fb1a0287dcb in _L_lock_883 () from /lib64/libpthread.so.0
#2  0x00007fb1a0287c98 in __GI___pthread_mutex_lock (mutex=0x1024320 <_ZZN6icinga20ConfigPackageUtility14GetStaticMutexEvE5mutex.372537.42964>) at ../nptl/pthread_mutex_lock.c:78
#3  0x00000000008a066f in pthread_mutex_lock (m=<optimized out>) at /usr/include/boost169/boost/thread/pthread/pthread_mutex_scoped_lock.hpp:57
#4  lock (this=<optimized out>) at /usr/include/boost169/boost/thread/pthread/mutex.hpp:67
#5  boost::unique_lock<boost::mutex>::lock() [clone .local.21613] (this=0x7fb15403cfa0) at /usr/include/boost169/boost/thread/lock_types.hpp:346
#6  0x000000000096aee4 in icinga::ConfigPackagesHandler::HandleGet (this=<optimized out>, user=..., request=..., url=..., response=..., params=...)
    at /usr/include/boost169/boost/thread/lock_types.hpp:124
#7  0x00000000009a9012 in icinga::ConfigPackagesHandler::HandleRequest (this=<optimized out>, stream=..., user=..., request=..., url=..., response=..., params=..., yc=..., server=..., params=...,
    response=..., url=..., request=..., user=..., this=<optimized out>) at ../remote/configpackageshandler.cpp:30
#8  0x00000000009654c3 in icinga::HttpHandler::ProcessRequest (stream=..., user=..., request=..., response=..., yc=..., server=...) at ../remote/httphandler.cpp:102
#9  0x0000000000b27edd in ProcessRequest (yc=..., hasStartedStreaming=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x2087, DIE 0x20a1>, server=..., response=...,
    authenticatedUser=..., request=..., stream=...) at ../remote/httpserverconnection.cpp:416
#10 icinga::HttpServerConnection::ProcessMessages(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >) (this=0x7fb1540067f0, yc=...)
    at ../remote/httpserverconnection.cpp:514
#11 0x0000000000b2a263 in _ZZN6icinga20HttpServerConnection5StartEvENKUlN5boost4asio19basic_yield_contextINS2_15executor_binderIPFvvENS2_8executorEEEEEE_clES9_.isra.4791 (
    yc=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at ../remote/httpserverconnection.cpp:56
#12 operator() (ca=..., this=<optimized out>) at /usr/include/boost169/boost/asio/impl/spawn.hpp:382
#13 run (this=0x7fb15403e000) at /usr/include/boost169/boost/coroutine/detail/push_coroutine_object.hpp:293
#14 boost::coroutines::detail::trampoline_push_void<boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<void>, void, boost::asio::detail::coro_entry_point<boost::asio::executor_binder<void (*)(), boost::asio::io_context::strand>, icinga::HttpServerConnection::Start()::{lambda(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >)#1}>&, boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits> > >(boost::context::detail::transfer_t) [clone .324392] (t=...)
    at /usr/include/boost169/boost/coroutine/detail/trampoline_push.hpp:70
#15 0x00007fb1a29d318f in make_fcontext () at make_x86_64_sysv_elf_gas.S:71
#16 0x0000000000c82b30 in vtable for boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<void>, void, boost::asio::detail::coro_entry_point<boost::asio::executor_binder<void (*)(), boost::asio::io_context::strand>, icinga::HttpServerConnection::Start()::{lambda(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >)#1}>&, boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits> > [clone .305817] ()
#17 0x0000000000000026 in ?? ()
#18 0x0000000000000000 in ?? ()

Thread 10 (Thread 0x7fb19aa5a700 (LWP 22823)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fb1a0287dcb in _L_lock_883 () from /lib64/libpthread.so.0
#2  0x00007fb1a0287c98 in __GI___pthread_mutex_lock (mutex=0x1024320 <_ZZN6icinga20ConfigPackageUtility14GetStaticMutexEvE5mutex.372537.42964>) at ../nptl/pthread_mutex_lock.c:78
#3  0x00000000008a066f in pthread_mutex_lock (m=<optimized out>) at /usr/include/boost169/boost/thread/pthread/pthread_mutex_scoped_lock.hpp:57
#4  lock (this=<optimized out>) at /usr/include/boost169/boost/thread/pthread/mutex.hpp:67
#5  boost::unique_lock<boost::mutex>::lock() [clone .local.21613] (this=0x7fb178020610) at /usr/include/boost169/boost/thread/lock_types.hpp:346
#6  0x000000000096aee4 in icinga::ConfigPackagesHandler::HandleGet (this=<optimized out>, user=..., request=..., url=..., response=..., params=...)
    at /usr/include/boost169/boost/thread/lock_types.hpp:124
#7  0x00000000009a9012 in icinga::ConfigPackagesHandler::HandleRequest (this=<optimized out>, stream=..., user=..., request=..., url=..., response=..., params=..., yc=..., server=..., params=...,
    response=..., url=..., request=..., user=..., this=<optimized out>) at ../remote/configpackageshandler.cpp:30
#8  0x00000000009654c3 in icinga::HttpHandler::ProcessRequest (stream=..., user=..., request=..., response=..., yc=..., server=...) at ../remote/httphandler.cpp:102
#9  0x0000000000b27edd in ProcessRequest (yc=..., hasStartedStreaming=<unknown type in /usr/lib/debug/usr/lib64/icinga2/sbin/icinga2.debug, CU 0x2087, DIE 0x20a1>, server=..., response=...,
    authenticatedUser=..., request=..., stream=...) at ../remote/httpserverconnection.cpp:416
#10 icinga::HttpServerConnection::ProcessMessages(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >) (this=0x7fb178080600, yc=...)
    at ../remote/httpserverconnection.cpp:514
#11 0x0000000000b2a263 in _ZZN6icinga20HttpServerConnection5StartEvENKUlN5boost4asio19basic_yield_contextINS2_15executor_binderIPFvvENS2_8executorEEEEEE_clES9_.isra.4791 (
    yc=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at ../remote/httpserverconnection.cpp:56
#12 operator() (ca=..., this=<optimized out>) at /usr/include/boost169/boost/asio/impl/spawn.hpp:382
#13 run (this=0x7fb178021670) at /usr/include/boost169/boost/coroutine/detail/push_coroutine_object.hpp:293
#14 boost::coroutines::detail::trampoline_push_void<boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<void>, void, boost::asio::detail::coro_entry_point<boost::asio::executor_binder<void (*)(), boost::asio::io_context::strand>, icinga::HttpServerConnection::Start()::{lambda(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >)#1}>&, boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits> > >(boost::context::detail::transfer_t) [clone .324392] (t=...)
    at /usr/include/boost169/boost/coroutine/detail/trampoline_push.hpp:70
#15 0x00007fb1a29d318f in make_fcontext () at make_x86_64_sysv_elf_gas.S:71
#16 0x0000000000c82b30 in vtable for boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<void>, void, boost::asio::detail::coro_entry_point<boost::asio::executor_binder<void (*)(), boost::asio::io_context::strand>, icinga::HttpServerConnection::Start()::{lambda(boost::asio::basic_yield_context<boost::asio::executor_binder<void (*)(), boost::asio::executor> >)#1}>&, boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits> > [clone .305817] ()
#17 0x00007fb100000026 in ?? ()
#18 0x0000000000000000 in ?? ()

-> Regression from #7150, I'll work on a PR fix.

@dnsmichi
Contributor

dnsmichi commented May 8, 2019

Fixed it, @hrak you're in the same timezone and likely home already, so snapshot packages will be built in roughly 6h.

@hrak

hrak commented May 9, 2019

Tested with v2.10.4-658-g81075088f, it's looking good now! Eagerly awaiting 2.11 :)

@dnsmichi
Contributor

I'm waiting for customer feedback. Based on my tests and yours, I consider this fixed.

@dnsmichi dnsmichi removed the needs feedback label May 16, 2019
@dnsmichi dnsmichi modified the milestones: 2.11.0, 2.10.5 May 16, 2019
@dnsmichi dnsmichi unpinned this issue May 16, 2019
@jottekop
Author

Thank you for all the effort. Looking forward to the release of 2.11.

@dnsmichi
Contributor

We decided to move this into 2.10.5 some minutes ago, unless @marcofl reports otherwise.

dnsmichi pushed a commit that referenced this issue May 16, 2019
…y to allow proper parallel execution

  * fixes issue #6785, where permission checks return wrong results because they run within a shared namespace without using unique keys
  * mitigates issue #6874, where segmentation faults occur because of concurrent access to non-thread-safe parts of the namespace (making namespaces thread-safe would be an alternative way to get rid of these segfaults, but that is out of scope here: #6785 needs to be fixed anyway, and this is the straightforward way to do it)
  * does the same for eventqueue (not certain whether events can be processed in parallel, but I expect that is the case)

(cherry picked from commit 1e7cd4a)
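
The idea in the commit message above, evaluating each permission filter against its own private scope instead of one shared namespace, can be sketched roughly as follows. This is only an illustration, not the actual Icinga 2 change: `Scope`, `Filter`, `IsPermitted` and `CountWrongResults` are hypothetical names, and a plain `std::map` stands in for the namespace.

```cpp
#include <functional>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins; none of these names are Icinga 2 APIs.
using Scope = std::map<std::string, std::string>;
using Filter = std::function<bool(const Scope&)>;

// Each call builds its own Scope on the stack, so concurrent requests
// never read or write a shared container and keys cannot collide.
bool IsPermitted(const Filter& filter, const std::string& hostName)
{
    Scope local;               // private to this call
    local["host"] = hostName;  // visible only to this filter evaluation
    return filter(local);
}

// Runs many permission checks in parallel and counts wrong answers.
int CountWrongResults(int threads, int iterations)
{
    std::vector<int> wrong(threads, 0);  // one slot per thread, no sharing
    std::vector<std::thread> pool;

    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([t, iterations, &wrong] {
            const std::string mine = "host-" + std::to_string(t);
            Filter onlyMine = [&mine](const Scope& s) {
                return s.at("host") == mine;
            };

            for (int i = 0; i < iterations; ++i) {
                if (!IsPermitted(onlyMine, mine))
                    ++wrong[t];  // a matching host must be permitted
                if (IsPermitted(onlyMine, "other-host"))
                    ++wrong[t];  // a non-matching host must be rejected
            }
        });
    }

    for (auto& th : pool)
        th.join();

    int total = 0;
    for (int w : wrong)
        total += w;
    return total;
}
```

Calling `CountWrongResults(8, 1000)` exercises eight threads at once. Because no state is shared between calls, no locking is needed, which is the trade-off the commit message describes: it avoids having to make the namespace itself thread-safe.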
@Crunsher
Contributor

Crunsher commented May 21, 2019

I was unable to reproduce a crash; my dev environment is probably not large enough. My test involved a few ApiUsers with host-group-based permissions and Hosts with >10 groups, some matching, some not, with the users simultaneously deleting, inserting and querying those Hosts via a few forked curls.

All tests were done with debug builds.
The ApiUsers had regex permission filters.

  1. Ran globals on a 2.10.4 Debug console -> A host was there
  2. Ran globals on a 2.10 support Debug console -> No host
  3. Connected with a remote console -> No issues either
  4. Deliberately requesting a Host with a matching filter and a non-matching one right after worked as expected (not).
