This repository has been archived by the owner on May 23, 2024. It is now read-only.

Report data loss stats to Jaeger backend #482

Merged
yurishkuro merged 14 commits into jaegertracing:master from the client-telemetry branch
Jan 15, 2020

Conversation

yurishkuro
Member

Which problem is this PR solving?

Short description of the changes

  • introduce stats reporting in the Batch

@yurishkuro yurishkuro requested a review from vprithvi January 6, 2020 23:58
@yurishkuro yurishkuro changed the title Client telemetry Report data loss stats to Jaeger backend Jan 7, 2020
@codecov

codecov bot commented Jan 14, 2020

Codecov Report

Merging #482 into master will increase coverage by 0.14%.
The diff coverage is 97.14%.

@@            Coverage Diff             @@
##           master     #482      +/-   ##
==========================================
+ Coverage   88.29%   88.43%   +0.14%     
==========================================
  Files          59       59              
  Lines        3553     3581      +28     
==========================================
+ Hits         3137     3167      +30     
+ Misses        304      303       -1     
+ Partials      112      111       -1
Impacted Files          Coverage Δ
jaeger_thrift_span.go   100% <ø> (ø) ⬆️
zipkin_thrift_span.go   77.48% <ø> (ø) ⬆️
utils/udp_client.go     0% <0%> (ø) ⬆️
reporter.go             100% <100%> (ø) ⬆️
transport_udp.go        97.29% <100%> (+4.7%) ⬆️

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8aaaade...50fd8a5.

Yuri Shkuro added 6 commits January 13, 2020 22:05
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Yuri Shkuro added 2 commits January 13, 2020 22:54
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Yuri Shkuro added 5 commits January 14, 2020 13:01
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
Signed-off-by: Yuri Shkuro <ys@uber.com>
reporter.go Outdated
@@ -214,10 +216,18 @@ func NewRemoteReporter(sender Transport, opts ...ReporterOption) Reporter {
		sender: sender,
		queue:  make(chan reporterQueueItem, options.queueSize),
	}
	if receiver, ok := sender.(reporterstats.Receiver); ok {
		receiver.SetReporterStats(reporter)
Contributor

It's a little confusing to have remoteReporter implement the ReporterStats interface and then pass the reporter itself as the ReporterStats in the setter.

I think it might be better to have a dedicated ReporterStats implementation and use it in the reporter, so that other reporters can reuse the stats too. Both queueLength and droppedCount could move behind the Stats interface.

Member Author

I am not sure what you're suggesting. Reporter is a provider of ReporterStats, which is why it passes itself to the receiver. queueLength is not a data loss metric and does not need to be made available to the transports.

Contributor

I think it's clearer if we have a dedicated provider of ReporterStats and pass it to the reporter. As written, the reporter does too much.

queueLength is not a data loss metric, but it seems like a reasonable reporter-related metric, and the name ReporterStats does not by itself imply data loss stats.

Member Author

OK, I pulled reporterStats into a composable helper struct. But I don't want to change the queueLength handling: it's out of scope for this change, and grouping it with ReporterStats provides no benefit.

Signed-off-by: Yuri Shkuro <ys@uber.com>
Contributor

@albertteoh albertteoh left a comment

LGTM

Comment on lines +51 to +52
// The following counters are always non-negative, but we need to send them in signed i64 Thrift fields,
// so we keep them as signed. At 10k QPS, overflow happens in about 300 million years.
Contributor

nice, let's hope Jaeger will still be around after 300 million years! :)

Member Author

and running without a restart
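
As a back-of-the-envelope check of the overflow estimate in the quoted comment, the sketch below divides the largest int64 value by an increment rate and converts seconds to years. The helper name `yearsToOverflow` and the 365.25-day year are assumptions made for this example. Under them, 10k increments per second comes out nearer 29 million years, and a figure close to 300 million corresponds to about 1k per second. The comment's conclusion that overflow is not a practical concern holds either way.

```go
package main

import (
	"fmt"
	"math"
)

// yearsToOverflow estimates how many years it takes a counter that starts at
// zero and grows by ratePerSecond increments each second to reach math.MaxInt64.
func yearsToOverflow(ratePerSecond float64) float64 {
	const secondsPerYear = 365.25 * 24 * 60 * 60 // assumption: Julian year
	return float64(math.MaxInt64) / ratePerSecond / secondsPerYear
}

func main() {
	fmt.Printf("at 10k/s: %.1f million years\n", yearsToOverflow(10_000)/1e6)
	fmt.Printf("at  1k/s: %.1f million years\n", yearsToOverflow(1_000)/1e6)
}
```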

@yurishkuro yurishkuro merged commit e75ea75 into jaegertracing:master Jan 15, 2020
@yurishkuro yurishkuro deleted the client-telemetry branch January 15, 2020 16:06
3 participants