Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[remote-storage] Add healthcheck to grpc server #5461

Merged
merged 4 commits into from
May 19, 2024

Conversation

james-ryans
Copy link
Contributor

@james-ryans james-ryans commented May 19, 2024

Which problem is this PR solving?

  • Remote storage server is missing a gRPC health check, the current use case is for testing environment to check if the gRPC server is ready before run the tests.

Description of the changes

  • Add "google.golang.org/grpc/health/grpc_health_v1" to remote-storage gRPC server.
  • Ping health check for remote storage server before proceeding with tests
  • Add deduping of spans by ID when retrieved from storage (when storage is not idempotent)
  • Make testzap loggers include source file

How was this change tested?

Checklist

Sorry, something went wrong.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Signed-off-by: James Ryans <james.ryans2012@gmail.com>
@james-ryans james-ryans requested a review from a team as a code owner May 19, 2024 14:04
@james-ryans james-ryans requested a review from pavolloffay May 19, 2024 14:04
Copy link

codecov bot commented May 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.48%. Comparing base (d4134d5) to head (4adedea).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5461      +/-   ##
==========================================
+ Coverage   95.46%   95.48%   +0.01%     
==========================================
  Files         331      331              
  Lines       16112    16121       +9     
==========================================
+ Hits        15382    15393      +11     
+ Misses        556      555       -1     
+ Partials      174      173       -1     
Flag Coverage Δ
badger_v1 8.07% <0.00%> (-0.01%) ⬇️
badger_v2 1.94% <0.00%> (-0.01%) ⬇️
cassandra-3.x-v1 16.47% <0.00%> (-0.02%) ⬇️
cassandra-3.x-v2 1.86% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v1 16.47% <0.00%> (-0.02%) ⬇️
cassandra-4.x-v2 1.86% <0.00%> (-0.01%) ⬇️
elasticsearch-7.x 1.78% <0.00%> (+0.01%) ⬆️
elasticsearch-8.x 1.78% <0.00%> (-0.01%) ⬇️
grpc_v1 9.24% <100.00%> (+0.13%) ⬆️
grpc_v2 7.60% <100.00%> (?)
kafka 9.80% <0.00%> (-0.02%) ⬇️
opensearch-1.x 1.77% <0.00%> (-0.02%) ⬇️
opensearch-2.x 1.78% <0.00%> (+0.01%) ⬆️
unittests 93.95% <18.18%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does grpc automatically use this service? I was thinking that in the test we would need to loop-query it until we get a green response.

Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <github@ysh.us>
@@ -82,7 +82,7 @@ func (r *spanReader) GetTrace(ctx context.Context, traceID model.TraceID) (*mode
for i := range received.Spans {
spans = append(spans, &received.Spans[i])
}
r.logger.Info(fmt.Sprintf("GetTrace received %d spans (total %d)", len(received.Spans), len(spans)))
// r.logger.Info(fmt.Sprintf("GetTrace received %d spans (total %d)", len(received.Spans), len(spans)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@james-ryans are you sure we need this log? It's very noisy, I don't think it adds much value other than a form of progress reporting (which I would do with much lighter approach, maybe logging every 1000 spans)

Signed-off-by: Yuri Shkuro <github@ysh.us>
@yurishkuro yurishkuro merged commit 3dbd02b into jaegertracing:main May 19, 2024
38 checks passed
@james-ryans
Copy link
Contributor Author

Does grpc automatically use this service? I was thinking that in the test we would need to loop-query it until we get a green response.

No, It doesn't. You're right, the loop-query will do this. I've made changes to my local but you already helped with this, thank you!

@james-ryans are you sure we need this log? It's very noisy, I don't think it adds much value other than a form of progress reporting (which I would do with much lighter approach, maybe logging every 1000 spans)

Log the total retrieved spans would already be helpful, I never thought we would get 10 spans for each retrieval before. Thank you for the changes.

@yurishkuro
Copy link
Member

yurishkuro commented May 20, 2024

Might have introduced this race https://github.com/jaegertracing/jaeger/actions/runs/9154252555/job/25164498035?pr=5465#step:8:16

Usually it's a sign of opportunistic shutdown where Close returns before bg routines exit and they continue calling the logger after the test was ended.

==================
WARNING: DATA RACE
Read at 0x00c0003ee043 by goroutine 92:
  testing.(*common).logDepth()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1017 +0xcd
  testing.(*common).log()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1010 +0xa4
  testing.(*common).Logf()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1061 +0x6a
  testing.(*T).Logf()
      <autogenerated>:1 +0x69
  go.uber.org/zap/zaptest.TestingWriter.Write()
      /home/runner/go/pkg/mod/go.uber.org/zap@v1.27.0/zaptest/logger.go:[14](https://github.com/jaegertracing/jaeger/actions/runs/9154252555/job/25164498035?pr=5465#step:8:15)6 +0x11d
  go.uber.org/zap/zaptest.(*TestingWriter).Write()
      <autogenerated>:1 +0x74
  go.uber.org/zap/zapcore.(*ioCore).Write()
      /home/runner/go/pkg/mod/go.uber.org/zap@v1.27.0/zapcore/core.go:99 +0x192
  go.uber.org/zap/zapcore.(*CheckedEntry).Write()
      /home/runner/go/pkg/mod/go.uber.org/zap@v1.27.0/zapcore/entry.go:253 +0x1ef
  go.uber.org/zap.(*Logger).Info()
      /home/runner/go/pkg/mod/go.uber.org/zap@v1.27.0/logger.go:247 +0x64
  github.com/jaegertracing/jaeger/cmd/agent/app.(*Agent).Run.func1()
      /home/runner/work/jaeger/jaeger/cmd/agent/app/agent.go:73 +0x3fc

Previous write at 0x00c0003ee043 by goroutine 81:
  testing.tRunner.func1()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1676 +0x8fa
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.22.3/x64/src/runtime/panic.go:602 +0x5d
  testing.(*T).Run.gowrap1()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1742 +0x44

Goroutine 92 (running) created at:
  github.com/jaegertracing/jaeger/cmd/agent/app.(*Agent).Run()
      /home/runner/work/jaeger/jaeger/cmd/agent/app/agent.go:68 +0x1e8
  github.com/jaegertracing/jaeger/cmd/agent/app.TestStartStopRace()
      /home/runner/work/jaeger/jaeger/cmd/agent/app/agent_test.go:172 +0x611
  testing.tRunner()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1689 +0x21e
  testing.(*T).Run.gowrap1()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1742 +0x44

Goroutine 81 (running) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1742 +0x825
  testing.runTests.func1()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:2161 +0x85
  testing.tRunner()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:1689 +0x21e
  testing.runTests()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:2[15](https://github.com/jaegertracing/jaeger/actions/runs/9154252555/job/25164498035?pr=5465#step:8:16)9 +0x8be
  testing.(*M).Run()
      /opt/hostedtoolcache/go/1.22.3/x64/src/testing/testing.go:2027 +0xf[17](https://github.com/jaegertracing/jaeger/actions/runs/9154252555/job/25164498035?pr=5465#step:8:18)
  go.uber.org/goleak.VerifyTestMain()
      /home/runner/go/pkg/mod/go.uber.org/goleak@v1.3.0/testmain.go:53 +0x64
  github.com/jaegertracing/jaeger/pkg/testutils.VerifyGoLeaks()
      /home/runner/work/jaeger/jaeger/pkg/testutils/leakcheck.go:40 +0x2[18](https://github.com/jaegertracing/jaeger/actions/runs/9154252555/job/25164498035?pr=5465#step:8:19)
  github.com/jaegertracing/jaeger/cmd/agent/app.TestMain()
      /home/runner/work/jaeger/jaeger/cmd/agent/app/package_test.go:13 +0x2f5
  main.main()
      _testmain.go:107 +0x2f0
==================
FAIL
coverage: 1.6% of statements
FAIL	github.com/jaegertracing/jaeger/cmd/agent/app	[20](https://github.com/jaegertracing/jaeger/actions/runs/9154252555/job/25164498035?pr=5465#step:8:21).488s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants