-
Notifications
You must be signed in to change notification settings - Fork 31
cmd/trace-agent: improve Agent.Process readability and add Span.SetMetric #490
Conversation
0b7375a
to
3a9c947
Compare
ff47d86
to
14de406
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Benjamin. This is super helpful to me because I plan to refactor this function soon, to separate concerns and have it more testable.
We have a code freeze on Friday for 6.6.0, so let's keep this here until next week, then we can merge it. Just so we get proper time to make sure everything's ok. I hope that works for you.
cmd/trace-agent/agent.go
Outdated
sampled = s.Add(pt) || sampled | ||
} | ||
// Don't go through sampling for < 0 priority traces | ||
if !hasPriority || priority < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused here. If the trace doesn't have a user set sampling priority, wouldn't that mean it goes through the normal process of sampling? Shouldn't this simply be:
if priority < 0 {
// negative priority means reject
return
}
?
cmd/trace-agent/agent.go
Outdated
// We get the address of the struct holding the stats associated to no tags | ||
// TODO: get the real tagStats related to this trace payload. | ||
// Extract priority early, as later goroutines might manipulate the Metrics map in parallel which isn't safe. | ||
priority, hasPriority := root.Metrics[sampler.SamplingPriorityKey] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be nicer to move this above line 198 (if hasPriority {
). It's not used until then. Should be fine I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, will adjust!
@@ -274,37 +270,49 @@ func (a *Agent) Process(t model.Trace) { | |||
defer watchdog.LogOnPanic() | |||
// Everything is sent to concentrator for stats, regardless of sampling. | |||
a.Concentrator.Add(pt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this goroutine could be merged with the other one (service extractor). I have the feeling (but not the benchmarks to prove it) that it will be a fairly good perf. increase. If you're also unsure, just leave it, I'll write some benchmarks and try it out later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually had the same thoughts working on that function. The service extractor is barely doing anything, so I'd guess that a goroutine is bit overdoing it. Although, I don't have any benchmark to power such argument.
Let's maybe investigate and do a PR independently?
2def2c3
to
d4a2cc7
Compare
Rebased for you and re-worded your commits a bit. |
d4a2cc7
to
fc0f945
Compare
fc0f945
to
4b8821b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Looks great!
// Extract the client sampling rate. | ||
clientSampleRate := sampler.GetTraceAppliedSampleRate(root) | ||
// Combine it with the pre-sampling rate. | ||
preSamplerRate := a.Receiver.PreSampler.Rate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this rate ever going to be different than 1? We’ve stopped making it configurable. I’d like to put it into a constant in a subsequent PR and simply use that. Would that be ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we stopped making its initial value being configurable. But this value gets below 1 whenever the Agent uses too many resources -> it will drop a proportion of incoming payloads.
Note that I did split it into two variables (while we still only use its product, previously rate
) to make it more readable / because in a next PR I'll do something out of each individual value.
Agent.Process
by more meaningful units and add extra comments.Span.SetMetric
function and use it in relevant places.This prepare future changes involving that function and the usage of many more metrics.
It should have been two PRs (they are 2 commits) but as they are built on top of each other, I went for the easy way.