-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for extended clientside aggregation #1048
Conversation
7db3f89
to
c7c0fb6
Compare
} | ||
nameChunk := pipeSplitter.Chunk()[:startingColon] | ||
valueChunk := pipeSplitter.Chunk()[startingColon+1:] | ||
nameChunk := packet[:valueStart] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can probably use bytes.Cut
here to simplify the logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this is a good suggestion. I tried implementing this locally. It does make the code a fair bit easier to read, but it benchmarks worse than the baseline implementation on master. I'm leaning towards sticking with the bytes.IndexByte
implementation.
} | ||
typeChunk := pipeSplitter.Chunk() | ||
typeChunk := packet[typeStart+1 : tagsStart] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again bytes.Cut
might be cleaner than bytes.IndexByte
for this logic (but maybe not! if you don't think so please ignore)
case 'd', 'h': // consider DogStatsD's "distribution" to be a histogram | ||
ret.Type = "histogram" | ||
metric.Type = "histogram" | ||
case 'm': // We can ignore the s in "ms" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol
* Report allocs * Add support for multiple values in metric packet * Fix error message * Don't use SplitBytes * CHANGELOG.md entry * gofmt * Rerun tests * Rerun tests
Summary
This adds Veneur serverside support for statsd clients using extended clientside aggregation. This allows clients to pack multiple values into the same metrics packet - see the example from the docs:
Without extended clientside aggregation:
With extended clientside aggregation:
NB: clients still need to enable this manually; the Veneur server does not eg: negotiate a protocol version its clients to enable this, or anything fancy like that. This PR just adds support for parsing the packets produced by clients using option.
There are three main changes made by this PR:
ParseMetric
to take afunc(*UDPMetric)
callback called multiple times, instead of returning a single*UDPMetric
value. Why do it this way? The other alternatives I considered were to either return a slice, or to publish the results back to a channel. Returning a slice requires allocating a slice - I'd rather avoid the extra GC churn if we can, since the caller would just immediately discard the slice after iterating over it. Publishing to a channel is tempting, but also has some problems. The caller is just publishing to a channel anyway, so it seems like this should work - but we determine which channel to publish to based on theDigest
of the metric. Internalizing that routing logic into the parsing code seems like a layering violation. We could publish to an intermediate channel, but this requires an extra goroutine to shunt the metrics around. Overall, a callback is definitely a bit less ergonomic than the current pattern of directly returning a*UDPMetric
, but it seems like the best option available (and we can add some test helpers to improve the ergonomics in our tests).ParseMetric
code needs to be re-ordered a bit - instead of parsing thevalue
in essentially the order it which it arrives in the packet, now that there are multiple values, it's more convenient to parse them last, immediately before the metric is published via the callback function. This has changed the exact error message produced by some of the tests checking for invalid packets. This doesn't really matter, though, for two reasons: a) the error is just logged internally - there's no API contract around which error is returned and b) since this is a case where there's more than one thing wrong with the same packet, and it's just a question of which error we hit first, both errors are equally correct.ParseMetric
code to callbytes.IndexByte
directly, rather than using theSplitBytes
helper. This isn't strictly required for the rest of this change, but it's faster. Without removingSplitBytes
, there's a slight performance regression in the parsing code from adding the new multi-value logic. Net of the theSplitBytes
change, this PR improves the benchmark performance of that code. I'm not that attached to this part of the PR, and can unwind it if others are opposed to this part, but I wanted to "make room" from a performance standpoint for the rest of these changes. I did do some profiling ofSplitBytes
itself to see if there's anything obvious we can do to improve the performance of the helper, but didn't have much luck here.Motivation
Extended clientside aggregation avoids retransmitting the name & tag of the same metric reported multiple times; this can significantly increase the serialization performance on both the client and server, and reduces network traffic. I'd like to experiment with rolling out extended clientside aggregation to some services within Stripe, and this is the first step in doing so.
Test plan
I've added automated tests covering the new multi-value code. I've also stood up a local copy of the Veneur server & client, and verified that metrics published using extended clientside aggregation are correctly ingested by the server.