Caught Segmentation fault #1064

Closed
hzariv opened this issue Jun 8, 2017 · 11 comments

@hzariv

hzariv commented Jun 8, 2017

I am running the Docker image lyft/envoy:latest with a front-proxy config file and getting the following crash:
docker run -p 8080:8080 -p 8001:8001 envoy-test
[2017-06-08 16:25:44.200][1][warning][main] initializing epoch 0 (hot restart version=8.2490552)
[2017-06-08 16:25:44.204][1][warning][main] all clusters initialized. initializing init manager
[2017-06-08 16:25:44.204][1][warning][main] all dependencies initialized. starting workers
[2017-06-08 16:25:44.204][1][warning][main] starting main dispatch loop
[2017-06-08 16:27:02.711][6][critical][backtrace] Caught Segmentation fault, suspect faulting address 0x0
[2017-06-08 16:27:02.714][6][critical][backtrace] Backtrace obj</usr/local/bin/envoy> thr<6> (use tools/stack_decode.py):
[2017-06-08 16:27:02.714][6][critical][backtrace] thr<6> #0 0x4b47a5
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #1 0x5f4c99
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #2 0x5f476d
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #3 0x5f94c7
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #4 0x5f5ff0
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #5 0x5f61f3
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #6 0x4aea77
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #7 0x708278
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #8 0x7064c8
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #9 0x706632
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #10 0x49a577
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #11 0x73a4a1
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #12 0x73abfe
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #13 0x494ac0
[2017-06-08 16:27:02.715][6][critical][backtrace] thr<6> #14 0x74434d
[2017-06-08 16:27:02.716][6][critical][backtrace] thr<6> obj</lib/x86_64-linux-gnu/libpthread.so.0>
[2017-06-08 16:27:02.716][6][critical][backtrace] thr<6> #15 0x7f8865b0b183
[2017-06-08 16:27:02.716][6][critical][backtrace] thr<6> obj</lib/x86_64-linux-gnu/libc.so.6>
[2017-06-08 16:27:02.716][6][critical][backtrace] thr<6> #16 0x7f8865838bec
[2017-06-08 16:27:02.716][6][critical][backtrace] end backtrace thread 6
This is on macOS Sierra. Output of docker version:
Client:
Version: 17.03.1-ce
API version: 1.24 (downgraded from 1.27)
Go version: go1.7.5
Git commit: c6d412e
Built: Tue Mar 28 00:40:02 2017
OS/Arch: darwin/amd64
Server:
Version: 1.12.0
API version: 1.24 (minimum version )
Go version: go1.6.3
Git commit: 8eab29e
Built: Thu Jul 28 23:54:00 2016
OS/Arch: linux/amd64
Experimental: false

Dockerfile:

FROM lyft/envoy:latest
ADD ./service-envoy.json /etc/service-envoy.json
ENTRYPOINT ["/usr/local/bin/envoy","-c", "/etc/service-envoy.json"]


Config file:
{
  "listeners": [
    {
      "address": "tcp://0.0.0.0:8080",
      "filters": [
        {
          "type": "read",
          "name": "http_connection_manager",
          "config": {
            "tracing": {
              "operation_name": "ingress"
            },
            "codec_type": "auto",
            "stat_prefix": "ingress_http",
            "route_config": {
              "virtual_hosts": [
                {
                  "name": "service80",
                  "domains": [""],
                  "routes": [
                    {
                      "timeout_ms": 0,
                      "prefix": "/",
                      "cluster": "local_service80"
                    }
                  ]
                }
              ]
            },
            "filters": [
              {
                "type": "decoder",
                "name": "fault",
                "config": {
                  "abort": {
                    "abort_percent": 100,
                    "http_status": 403
                  },
                  "headers": [{"name": "x-ebay-pes"}]
                }
              },
              {
                "type": "decoder",
                "name": "router",
                "config": {}
              }
            ]
          }
        }
      ]
    },
    {
      "address": "tcp://0.0.0.0:8081",
      "filters": [
        {
          "type": "read",
          "name": "http_connection_manager",
          "config": {
            "tracing": {
              "operation_name": "ingress"
            },
            "codec_type": "auto",
            "stat_prefix": "ingress_http",
            "route_config": {
              "virtual_hosts": [
                {
                  "name": "service81",
                  "domains": [""],
                  "routes": [
                    {
                      "timeout_ms": 0,
                      "prefix": "/",
                      "cluster": "local_service81"
                    }
                  ]
                }
              ]
            },
            "filters": [
              {
                "type": "decoder",
                "name": "router",
                "config": {}
              }
            ]
          }
        }
      ]
    },
    {
      "address": "tcp://0.0.0.0:8082",
      "filters": [
        {
          "type": "read",
          "name": "http_connection_manager",
          "config": {
            "tracing": {
              "operation_name": "ingress"
            },
            "codec_type": "auto",
            "stat_prefix": "ingress_http",
            "route_config": {
              "virtual_hosts": [
                {
                  "name": "service82",
                  "domains": [""],
                  "routes": [
                    {
                      "timeout_ms": 0,
                      "prefix": "/",
                      "cluster": "local_service82"
                    }
                  ]
                }
              ]
            },
            "filters": [
              {
                "type": "decoder",
                "name": "router",
                "config": {}
              }
            ]
          }
        }
      ]
    },
    {
      "address": "tcp://0.0.0.0:8083",
      "filters": [
        {
          "type": "read",
          "name": "http_connection_manager",
          "config": {
            "tracing": {
              "operation_name": "ingress"
            },
            "codec_type": "auto",
            "stat_prefix": "ingress_http",
            "route_config": {
              "virtual_hosts": [
                {
                  "name": "service83",
                  "domains": [""],
                  "routes": [
                    {
                      "timeout_ms": 0,
                      "prefix": "/",
                      "cluster": "local_service83"
                    }
                  ]
                }
              ]
            },
            "filters": [
              {
                "type": "decoder",
                "name": "router",
                "config": {}
              }
            ]
          }
        }
      ]
    }
  ],
  "admin": {
    "access_log_path": "/dev/null",
    "address": "tcp://0.0.0.0:8001"
  },
  "cluster_manager": {
    "clusters": [
      {
        "name": "local_service80",
        "connect_timeout_ms": 250,
        "type": "strict_dns",
        "lb_type": "round_robin",
        "hosts": [
          {
            "url": "tcp://127.0.0.1:9080"
          }
        ]
      },
      {
        "name": "local_service81",
        "connect_timeout_ms": 250,
        "type": "strict_dns",
        "lb_type": "round_robin",
        "hosts": [
          {
            "url": "tcp://127.0.0.1:9081"
          }
        ]
      },
      {
        "name": "local_service82",
        "connect_timeout_ms": 250,
        "type": "strict_dns",
        "lb_type": "round_robin",
        "hosts": [
          {
            "url": "tcp://127.0.0.1:9082"
          }
        ]
      },
      {
        "name": "local_service83",
        "connect_timeout_ms": 250,
        "type": "strict_dns",
        "lb_type": "round_robin",
        "hosts": [
          {
            "url": "tcp://127.0.0.1:9083"
          }
        ]
      }
    ]
  }
}

@mattklein123
Member

@hzariv this config file loads OK for me. Please get a stack trace with symbols, using either GDB or the stack decode Python script referenced in the backtrace output (tools/stack_decode.py).
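
As a rough illustration of the symbolization step, here is a minimal Python sketch that pulls the raw frame addresses out of backtrace log lines like the ones in the report and hands them to binutils addr2line. It is not the tools/stack_decode.py script itself. The names parse_backtrace and symbolize are made up for this example, and it assumes addr2line is installed and that a local copy of the binary carries debug symbols.

```python
import re
import subprocess

# Matches the object marker, e.g. "obj</usr/local/bin/envoy>".
OBJ_RE = re.compile(r"obj<([^>]+)>")
# Matches a frame line, e.g. "thr<6> #0 0x4b47a5".
FRAME_RE = re.compile(r"thr<\d+> #(\d+) (0x[0-9a-fA-F]+)")


def parse_backtrace(lines):
    """Return (frame_number, object_path, address) tuples from log lines."""
    frames = []
    current_obj = None
    for line in lines:
        obj_match = OBJ_RE.search(line)
        if obj_match:
            # Frames that follow belong to this object until the next marker.
            current_obj = obj_match.group(1)
        frame_match = FRAME_RE.search(line)
        if frame_match:
            frames.append(
                (int(frame_match.group(1)), current_obj, frame_match.group(2))
            )
    return frames


def symbolize(frames, obj_path, local_binary):
    """Resolve frames logged under obj_path against local_binary (hypothetical helper)."""
    addrs = [addr for _, obj, addr in frames if obj == obj_path]
    if not addrs:
        return ""
    # -e selects the binary, -f prints function names, -C demangles C++ symbols.
    result = subprocess.run(
        ["addr2line", "-e", local_binary, "-f", "-C", *addrs],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Only frames logged under the main executable are passed along here. The libpthread and libc frames are runtime addresses inside shared objects and would need their load offsets to be resolved, which this sketch does not attempt.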

@rshriram
Member

rshriram commented Jun 8, 2017

Also, @hzariv, can you try a non-CE version of Docker? I have had segfaults on the CE version.

@mattklein123 mattklein123 self-assigned this Jun 8, 2017
@mattklein123 mattklein123 added this to the 1.4.0 milestone Jun 8, 2017
@mattklein123
Member

I can repro, looking.

@mattklein123
Member

Likely regression from #932

(gdb) bt
#0  0x00000000006188fc in Envoy::Http::ConnectionManagerImpl::ActiveStream::decodeHeaders(std::unique_ptr<Envoy::Http::HeaderMap, std::default_delete<Envoy::Http::HeaderMap> >&&, bool) (this=0x1707e00, 
    headers=<unknown type in /home/mklein/.cache/bazel/_bazel_mklein/e891e82b12efccc4683c22d6a02939c1/execroot/envoy-private/bazel-out/local-dbg/bin/envoy-lyft/source/lyft/envoy, CU 0xca369d, DIE 0xcd6cc7>, end_stream=true) at external/envoy/source/common/http/conn_manager_impl.cc:480
#1  0x00000000007f1964 in Envoy::Http::Http1::ServerConnectionImpl::onMessageComplete (this=0x18741c0) at external/envoy/source/common/http/http1/codec_impl.cc:452
#2  0x00000000007f2c31 in Envoy::Http::Http1::ConnectionImpl::$_7::operator() (this=0x4800, parser=0x18741d8) at external/envoy/source/common/http/http1/codec_impl.cc:246
#3  0x00000000007f2c08 in Envoy::Http::Http1::ConnectionImpl::$_7::__invoke (parser=0x18741d8) at external/envoy/source/common/http/http1/codec_impl.cc:245
#4  0x00000000007fb68d in http_parser_execute (parser=0x18741d8, settings=0xe9f770 <Envoy::Http::Http1::ConnectionImpl::settings_>, data=<optimized out>, len=<optimized out>) at http_parser.c:2097
#5  0x00000000007f0732 in Envoy::Http::Http1::ConnectionImpl::dispatchSlice (this=0x18741c8, slice=0x18fa830 "GET / HTTP/1.1\r\nHost: 0:8080\r\nUser-Agent: curl/7.47.0\r\nAccept: */*\r\n\r\n", len=70) at external/envoy/source/common/http/http1/codec_impl.cc:301
#6  0x00000000007f0606 in Envoy::Http::Http1::ConnectionImpl::dispatch (this=0x18741c8, data=...) at external/envoy/source/common/http/http1/codec_impl.cc:290
#7  0x0000000000615cb4 in Envoy::Http::ConnectionManagerImpl::onData (this=0x18ec1a0, data=...) at external/envoy/source/common/http/conn_manager_impl.cc:191
#8  0x00000000009544e9 in Envoy::Network::FilterManagerImpl::onContinueReading (this=0x1850648, filter=0x0) at external/envoy/source/common/network/filter_manager_impl.cc:61
#9  0x000000000095459d in Envoy::Network::FilterManagerImpl::onRead (this=0x1850648) at external/envoy/source/common/network/filter_manager_impl.cc:71
#10 0x000000000094d8ef in Envoy::Network::ConnectionImpl::onRead (this=0x1850640, read_buffer_size=70) at external/envoy/source/common/network/connection_impl.cc:191
#11 0x000000000094df8f in Envoy::Network::ConnectionImpl::onReadReady (this=0x1850640) at external/envoy/source/common/network/connection_impl.cc:337
#12 0x000000000094dea4 in Envoy::Network::ConnectionImpl::onFileEvent (this=0x1850640, events=3) at external/envoy/source/common/network/connection_impl.cc:285
#13 0x000000000094f27e in Envoy::Network::ConnectionImpl::ConnectionImpl(Envoy::Event::DispatcherImpl&, int, std::shared_ptr<Envoy::Network::Address::Instance const>, std::shared_ptr<Envoy::Network::Address::Instance const>)::$_0::operator()(unsigned int) const (this=0x187b988, events=3)
    at external/envoy/source/common/network/connection_impl.cc:70
#14 0x000000000094f131 in std::_Function_handler<void (unsigned int), Envoy::Network::ConnectionImpl::ConnectionImpl(Envoy::Event::DispatcherImpl&, int, std::shared_ptr<Envoy::Network::Address::Instance const>, std::shared_ptr<Envoy::Network::Address::Instance const>)::$_0>::_M_invoke(std::_Any_data const&, unsigned int&&) (__functor=..., __args=<unknown type in /home/mklein/.cache/bazel/_bazel_mklein/e891e82b12efccc4683c22d6a02939c1/execroot/envoy-private/bazel-out/local-dbg/bin/envoy-lyft/source/lyft/envoy, CU 0x1d80386, DIE 0x1db34e4>)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.1/../../../../include/c++/5.4.1/functional:1871
#15 0x00000000005dc67a in std::function<void (unsigned int)>::operator()(unsigned int) const (this=0x187b988, __args=3) at /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.1/../../../../include/c++/5.4.1/functional:2267
#16 0x00000000005dc015 in Envoy::Event::FileEventImpl::assignEvents(unsigned int)::$_0::operator()(int, short, void*) const (this=0x16ee4b0, what=38, arg=0x187b900) at external/envoy/source/common/event/file_event_impl.cc:60
#17 0x00000000005dbf38 in Envoy::Event::FileEventImpl::assignEvents(unsigned int)::$_0::__invoke(int, short, void*) (what=38, arg=0x187b900) at external/envoy/source/common/event/file_event_impl.cc:44
#18 0x00000000009a1d40 in event_persist_closure (base=<optimized out>, ev=<optimized out>) at event.c:1580
#19 event_process_active_single_queue (base=0x18b22c0, activeq=0x16ee4b0, max_to_process=2147483647, endtime=0x0) at event.c:1639
#20 0x000000000099ebbe in event_process_active (base=<optimized out>) at event.c:1741
#21 event_base_loop (base=0x18b22c0, flags=<optimized out>) at event.c:1961
#22 0x00000000005d8211 in Envoy::Event::DispatcherImpl::run (this=0x16e2160, type=Envoy::Event::Dispatcher::RunType::Block) at external/envoy/source/common/event/dispatcher_impl.cc:145
#23 0x00000000005c54c1 in Envoy::Worker::threadRoutine (this=0x17173c0, guard_dog=...) at external/envoy/source/server/worker.cc:65
#24 0x00000000005c5faf in Envoy::Worker::initializeConfiguration(Envoy::Server::Configuration::Main&, std::map<Envoy::Server::Configuration::Listener*, std::unique_ptr<Envoy::Network::TcpListenSocket, std::default_delete<Envoy::Network::TcpListenSocket> >, std::less<Envoy::Server::Configuration::Listener*>, std::allocator<std::pair<Envoy::Server::Configuration::Listener* const, std::unique_ptr<Envoy::Network::TcpListenSocket, std::default_delete<Envoy::Network::TcpListenSocket> > > > > const&, Envoy::Server::GuardDog&)::$_1::operator()() const (this=0x1937830)
    at external/envoy/source/server/worker.cc:44
#25 0x00000000005c5e5d in std::_Function_handler<void (), Envoy::Worker::initializeConfiguration(Envoy::Server::Configuration::Main&, std::map<Envoy::Server::Configuration::Listener*, std::unique_ptr<Envoy::Network::TcpListenSocket, std::default_delete<Envoy::Network::TcpListenSocket> >, std::less<Envoy::Server::Configuration::Listener*>, std::allocator<std::pair<Envoy::Server::Configuration::Listener* const, std::unique_ptr<Envoy::Network::TcpListenSocket, std::default_delete<Envoy::Network::TcpListenSocket> > > > > const&, Envoy::Server::GuardDog&)::$_1>::_M_invoke(std::_Any_data const&) (
    __functor=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.1/../../../../include/c++/5.4.1/functional:1871
#26 0x00000000005bc55e in std::function<void ()>::operator()() const (this=0x1937830) at /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.1/../../../../include/c++/5.4.1/functional:2267
#27 0x00000000009a99ec in Envoy::Thread::Thread::Thread(std::function<void ()>)::$_0::operator()(void*) const (this=0x7ffff687d700, arg=0x1937830) at external/envoy/source/common/common/thread.cc:15
#28 0x00000000009a99c8 in Envoy::Thread::Thread::Thread(std::function<void ()>)::$_0::__invoke(void*) (arg=0x1937830) at external/envoy/source/common/common/thread.cc:14
#29 0x00007ffff76b06ba in start_thread (arg=0x7ffff687d700) at pthread_create.c:333
#30 0x00007ffff73e682d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

@mattklein123
Member

@hzariv the main issue right now is that you have tracing configured on the HTTP listeners, but no tracing driver specified. I'm not sure if the crash is a regression or not. https://github.com/lyft/envoy/pull/1029/files will actually fix this crash. @goaway is on vacation right now and will finish when he gets back. @RomanDzhabarov can you potentially take a look at this just to make sure there is no larger issue here? I think we can wait for #1029 for the fix.

@RomanDzhabarov
Member

Yup, the issue is that startSpan currently returns nullptr when the global (server) tracer is not configured.

Mike's PR was pretty close to done; I'll take a look, fix it up, and merge it.

The problem is well scoped and does not affect anything if there is a proper configuration.
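
To make the failure mode concrete, here is a small Python sketch of the pattern described above. It is not Envoy's C++ code, and every name in it (Span, NullSpan, Tracer, start_span, handle_request) is invented for illustration. The tracer hands back nothing when no driver is configured, and one common guard is to return a do-nothing span object instead, so callers never dereference a null. Whether #1029 takes this exact approach is not stated in this thread.

```python
class Span:
    """A recording span (stand-in for a driver-backed span)."""

    def __init__(self, operation_name):
        self.operation_name = operation_name
        self.tags = {}

    def set_tag(self, key, value):
        self.tags[key] = value


class NullSpan:
    """Do-nothing span returned when no tracing driver is configured."""

    def set_tag(self, key, value):
        pass


class Tracer:
    def __init__(self, driver_configured, safe=True):
        self.driver_configured = driver_configured
        self.safe = safe

    def start_span(self, operation_name):
        if self.driver_configured:
            return Span(operation_name)
        # The unsafe variant mirrors the reported bug: the caller gets None.
        return NullSpan() if self.safe else None


def handle_request(tracer):
    span = tracer.start_span("ingress")
    # With the unsafe tracer and no driver, this line raises AttributeError,
    # the Python analogue of dereferencing a null pointer.
    span.set_tag("path", "/")
    return span
```

With safe=False and no driver, handle_request fails on the first use of the span, which corresponds to the crash in decodeHeaders; with the null-object variant the same call is harmless.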

@hzariv
Author

hzariv commented Jun 8, 2017

@mattklein123 is there a workaround to unblock me? If not, what is the ETA for merging the PR with the fix?

@mattklein123
Member

Delete every tracing block from the config, like this one:

"tracing": {
  "operation_name": "ingress"
}

It's a no-op when no tracing driver is configured.
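
For a config with several listeners, the workaround can be scripted. The following is a minimal Python sketch, assuming the JSON layout shown in the report (listeners, then filters, then an http_connection_manager entry whose config holds the tracing block). The function names strip_tracing and strip_tracing_file are hypothetical, not part of Envoy.

```python
import json


def strip_tracing(config):
    """Remove "tracing" from every http_connection_manager filter config.

    Mutates config in place and returns how many blocks were removed.
    """
    removed = 0
    for listener in config.get("listeners", []):
        for network_filter in listener.get("filters", []):
            if network_filter.get("name") != "http_connection_manager":
                continue
            filter_config = network_filter.get("config", {})
            if "tracing" in filter_config:
                del filter_config["tracing"]
                removed += 1
    return removed


def strip_tracing_file(src_path, dst_path):
    """Read a JSON config, strip tracing blocks, and write the result."""
    with open(src_path) as src:
        config = json.load(src)
    removed = strip_tracing(config)
    with open(dst_path, "w") as dst:
        json.dump(config, dst, indent=2)
    return removed
```

For example, strip_tracing_file("service-envoy.json", "service-envoy.notracing.json") writes a copy without tracing blocks, provided the input parses as valid JSON; the output file name is arbitrary.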

@hzariv
Author

hzariv commented Jun 8, 2017

Thanks, that fixed the crash, but now I am getting:
upstream connect error or disconnect/reset before headers

Any hint how to debug this?

@hzariv
Author

hzariv commented Jun 8, 2017

Never mind the upstream connect error above. It is a Docker networking issue :-(

@mattklein123
Member

Should be fixed by #1029 with original config.

jpsim pushed a commit that referenced this issue Nov 28, 2022
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit that referenced this issue Nov 29, 2022
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: JP Simard <jp@jpsim.com>