-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: migrate decoders to protobuf #63
feat: migrate decoders to protobuf #63
Conversation
66ede1c
to
ef83129
Compare
@axw before adding the finishing touch and marking this as ready for review, I wanted to hear your feedback on the new approach I tried out on decoding. Instead of decoding into a struct and then mapping, we're decoding to a map[string]any directly and then mapping. In practice this means going from:
to
Nested values can be retrieved without increasing LOC or verbosity:
We would lose the go struct representation but I think the switch would be worth it. WDYT ? |
@kruskall we used to do this years ago. It had poor performance due to all the allocations it incurs. Moreover, the "input model" Go structs enforce schema validation. I don't want to go back to this. |
ef83129
to
4a4acc4
Compare
Ah I see, sorry about that! I pushed some changes. Should be good for review now! |
@kruskall thanks. That's not to say that what we have is ideal, either. Perhaps after the protobuf work is done we can revisit this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just a handful of issues that I noticed.
I'm quite concerned about how easy it will be to introduce panics in the translation code. I think we should add some fuzz testing (#26) ASAP.
MetricType modelpb.MetricType | ||
CompressionStrategy modelpb.CompressionStrategy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MetricType modelpb.MetricType | |
CompressionStrategy modelpb.CompressionStrategy |
The reflection code is a good backstop in case we miss some tests, but I think we should prefer to have specific tests, particularly for non-trivial mappings like CompressionStrategy where we need to do a text->enum. I suggest removing these fields (MetricType doesn't seem to be hit anyway?) and exclude the fields from reflection: 32d2cdb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You removed the fields from the reflection code, but still need to add an explicit test for the compression strategy translation. See 32d2cdb.
this was conflicting with fuzz testing
Thanks for the feedback! 🙇 I ran fuzz testing on the decoders and was able to catch a fair amount of issues. The work on fuzz testing depends on google/gofuzz#68 so we can't add it easily. If you want to test it, I had to clone the PR and add a replace directive but it's available here: https://github.com/kruskall/apm-data/tree/test/fuzz-testing |
I was thinking we would use package testing from the standard library. We could either pass the corpus through the JSON decoders, or we could interpret the corpus for setting the modeldecoder struct fields directly. The latter approach should be most efficient, since we really just care about the translation part. I haven't used it before, but https://github.com/AdaLogics/go-fuzz-headers may be helpful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but please add a test for compression strategy
Thanks! I opened a followup issue to discuss fuzz testing. |
Trying out a new approach for faster decoding.
Related to #52
Blocked by #62