Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: migrate decoders to protobuf #63

Merged
merged 8 commits into from
Jun 18, 2023

Conversation

kruskall
Copy link
Member

@kruskall kruskall commented Jun 14, 2023

Trying out a new approach for faster decoding.

Related to #52
Blocked by #62

@kruskall kruskall force-pushed the feat/migrate-decoder-error-pb branch 3 times, most recently from 66ede1c to ef83129 Compare June 15, 2023 12:00
@kruskall
Copy link
Member Author

@axw before adding the finishing touch and marking this as ready for review, I wanted to hear your feedback on the new approach I tried out on decoding.

Instead of decoding into a struct and then mapping, we're decoding to a map[string]any directly and then mapping.

In practice this means going from:

	if from.Environment.IsSet() {
		out.Environment = from.Environment.Val
	}

to

	if environment, ok := getString(m, "environment"); ok {
		out.Environment = environment
	}

Nested values can be retrieved without increasing LOC or verbosity:

	if name, ok := getStringPaths(m, "node.configured_name"); ok {
		out.Node.Name = name
	}

We would lose the go struct representation but I think the switch would be worth it. WDYT ?

@axw
Copy link
Member

axw commented Jun 15, 2023

@kruskall we used to do this years ago. It had poor performance due to all the allocations it incurs. Moreover, the "input model" Go structs enforce schema validation. I don't want to go back to this.

@kruskall kruskall force-pushed the feat/migrate-decoder-error-pb branch from ef83129 to 4a4acc4 Compare June 16, 2023 17:44
@kruskall kruskall changed the title feat: migrate error decoder to protobuf feat: migrate decoders to protobuf Jun 16, 2023
@kruskall kruskall marked this pull request as ready for review June 16, 2023 17:55
@kruskall kruskall requested a review from a team June 16, 2023 17:55
@kruskall
Copy link
Member Author

Ah I see, sorry about that!

I pushed some changes. Should be good for review now!

@axw
Copy link
Member

axw commented Jun 17, 2023

@kruskall thanks. That's not to say that what we have is ideal, either. Perhaps after the protobuf work is done we can revisit this.

Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just a handful of issues that I noticed.

I'm quite concerned about how easy it will be to introduce panics in the translation code. I think we should add some fuzz testing (#26) ASAP.

Comment on lines 49 to 50
MetricType modelpb.MetricType
CompressionStrategy modelpb.CompressionStrategy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MetricType modelpb.MetricType
CompressionStrategy modelpb.CompressionStrategy

The reflection code is a good backstop in case we miss some tests, but I think we should prefer to have specific tests, particularly for non-trivial mappings like CompressionStrategy where we need to do a text->enum. I suggest removing these fields (MetricType doesn't seem to be hit anyway?) and exclude the fields from reflection: 32d2cdb

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the fields from the reflection code, but still need to add an explicit test for the compression strategy translation. See 32d2cdb.

@kruskall
Copy link
Member Author

Thanks for the feedback! 🙇

I ran fuzz testing on the decoders and was able to catch a fair amount of issues. The work on fuzz testing depends on google/gofuzz#68 so we can't add it easily. If you want to test it, I had to clone the PR and add a replace directive but it's available here: https://github.com/kruskall/apm-data/tree/test/fuzz-testing

@kruskall kruskall requested a review from axw June 17, 2023 23:49
@axw
Copy link
Member

axw commented Jun 18, 2023

I ran fuzz testing on the decoders and was able to catch a fair amount of issues. The work on fuzz testing depends on google/gofuzz#68 so we can't add it easily. If you want to test it, I had to clone the PR and add a replace directive but it's available here: https://github.com/kruskall/apm-data/tree/test/fuzz-testing

I was thinking we would use package testing from the standard library. We could either pass the corpus through the JSON decoders, or we could interpret the corpus for setting the modeldecoder struct fields directly.

The latter approach should be most efficient, since we really just care about the translation part. I haven't used it before, but https://github.com/AdaLogics/go-fuzz-headers may be helpful.

Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please add a test for compression strategy

@kruskall
Copy link
Member Author

Thanks! I opened a followup issue to discuss fuzz testing.

@kruskall kruskall merged commit 0acd047 into elastic:main Jun 18, 2023
@kruskall kruskall deleted the feat/migrate-decoder-error-pb branch June 18, 2023 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants