GRPC plugin crash is not terminating the process #3562

Closed
johanneswuerbach opened this issue Mar 4, 2022 · 0 comments · Fixed by #3604

Describe the bug

When a gRPC plugin process crashes, the overall process is not terminated; instead, every subsequent call to the service fails.

To Reproduce
Steps to reproduce the behavior:

  1. Manually terminate a gRPC plugin process
  2. See every subsequent interaction fail

Expected behavior

Either the plugin should be restarted, or the process should be terminated so that it can be restarted by another orchestration layer (k8s in our case).

Version (please complete the following information):

  • OS: Linux
  • Jaeger version: v1.30.0
  • Deployment: Kubernetes (using the jaeger-operator)

Logs:

{"level":"info","ts":1646302471.3807757,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to READY","system":"grpc","grpc_log":true}
2022-03-04T10:45:18.550Z [DEBUG] plugin process exited: path=/plugin/jaeger-s3 pid=12 error="signal: killed"
{"level":"info","ts":1646390718.550264,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to IDLE","system":"grpc","grpc_log":true}
2022-03-04T10:45:18.586Z [DEBUG] stdio: received EOF, stopping recv loop: err="rpc error: code = Canceled desc = context canceled"
{"level":"info","ts":1646390718.5868268,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0004d63b0, {IDLE <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390718.5868611,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to IDLE","system":"grpc","grpc_log":true}
{"level":"error","ts":1646390718.5869036,"caller":"app/http_handler.go:487","msg":"HTTP handler, Internal Server Error","error":"stream error: rpc error: code = Unavailable desc = error reading from server: EOF","stacktrace":"github.com/jaegertracing/jaeger/cmd/query/app.(*APIHandler).handleError\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/http_handler.go:487\ngithub.com/jaegertracing/jaeger/cmd/query/app.(*APIHandler).search\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/http_handler.go:236\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5\n\tgithub.com/opentracing-contrib/go-stdlib@v1.0.0/nethttp/server.go:154\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\tgithub.com/gorilla/mux@v1.8.0/mux.go:210\ngithub.com/jaegertracing/jaeger/cmd/query/app.additionalHeadersHandler.func1\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/additional_headers_handler.go:28\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/handlers.CompressHandlerLevel.func1\n\tgithub.com/gorilla/handlers@v1.5.1/compress.go:141\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/handlers.recoveryHandler.ServeHTTP\n\tgithub.com/gorilla/handlers@v1.5.1/recovery.go:78\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2879\nnet/http.(*conn).serve\n\tnet/http/server.go:1930"}
{"level":"info","ts":1646390718.587493,"caller":"grpclog/component.go:71","msg":"[transport]transport: loopyWriter.run returning. connection error: desc = \"transport is closing\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390726.987736,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390726.987773,"caller":"channelz/logging.go:50","msg":"[core]Subchannel picks a new address \"unused\" to connect","system":"grpc","grpc_log":true}
{"level":"warn","ts":1646390726.9878578,"caller":"channelz/logging.go:75","msg":"[core]grpc: addrConn.createTransport failed to connect to {unused unused <nil> <nil> 0 <nil>}. Err: connection error: desc = \"transport: error while dialing: dial unix /tmp/plugin3677101483: connect: connection refused\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390726.9878747,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390726.9878972,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0004d63b0, {CONNECTING <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390726.9879105,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390726.9879956,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0004d63b0, {TRANSIENT_FAILURE connection error: desc = \"transport: error while dialing: dial unix /tmp/plugin3677101483: connect: connection refused\"}","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390726.9880102,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
{"level":"error","ts":1646390726.9880514,"caller":"app/http_handler.go:487","msg":"HTTP handler, Internal Server Error","error":"plugin error: rpc error: code = Unavailable desc = connection error: desc = \"transport: error while dialing: dial unix /tmp/plugin3677101483: connect: connection refused\"","stacktrace":"github.com/jaegertracing/jaeger/cmd/query/app.(*APIHandler).handleError\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/http_handler.go:487\ngithub.com/jaegertracing/jaeger/cmd/query/app.(*APIHandler).getServices\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/http_handler.go:158\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5\n\tgithub.com/opentracing-contrib/go-stdlib@v1.0.0/nethttp/server.go:154\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\tgithub.com/gorilla/mux@v1.8.0/mux.go:210\ngithub.com/jaegertracing/jaeger/cmd/query/app.additionalHeadersHandler.func1\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/additional_headers_handler.go:28\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/handlers.CompressHandlerLevel.func1\n\tgithub.com/gorilla/handlers@v1.5.1/compress.go:141\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/handlers.recoveryHandler.ServeHTTP\n\tgithub.com/gorilla/handlers@v1.5.1/recovery.go:78\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2879\nnet/http.(*conn).serve\n\tnet/http/server.go:1930"}
{"level":"error","ts":1646390726.9885933,"caller":"app/http_handler.go:487","msg":"HTTP handler, Internal Server Error","error":"plugin error: rpc error: code = Unavailable desc = connection error: desc = \"transport: error while dialing: dial unix /tmp/plugin3677101483: connect: connection refused\"","stacktrace":"github.com/jaegertracing/jaeger/cmd/query/app.(*APIHandler).handleError\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/http_handler.go:487\ngithub.com/jaegertracing/jaeger/cmd/query/app.(*APIHandler).getOperationsLegacy\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/http_handler.go:180\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5\n\tgithub.com/opentracing-contrib/go-stdlib@v1.0.0/nethttp/server.go:154\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/mux.(*Router).ServeHTTP\n\tgithub.com/gorilla/mux@v1.8.0/mux.go:210\ngithub.com/jaegertracing/jaeger/cmd/query/app.additionalHeadersHandler.func1\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/additional_headers_handler.go:28\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/handlers.CompressHandlerLevel.func1\n\tgithub.com/gorilla/handlers@v1.5.1/compress.go:141\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2047\ngithub.com/gorilla/handlers.recoveryHandler.ServeHTTP\n\tgithub.com/gorilla/handlers@v1.5.1/recovery.go:78\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2879\nnet/http.(*conn).serve\n\tnet/http/server.go:1930"}
{"level":"info","ts":1646390727.988836,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to IDLE","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390727.9890022,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0004d63b0, {IDLE connection error: desc = \"transport: error while dialing: dial unix /tmp/plugin3677101483: connect: connection refused\"}","system":"grpc","grpc_log":true}
{"level":"info","ts":1646390727.9890687,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to IDLE","system":"grpc","grpc_log":true}