-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
protobuf: upgrade #8075
protobuf: upgrade #8075
Conversation
bf97600
to
b890a56
Compare
OK, this PR is now green and ready for review. Two notes on things in the original description:
I was wrong about this: when calling
I've been looking a lot at what it would take to remove all the "golden tests" in our CI suite that are comparing protobuf serialization output byte-per-byte. Spoiler: it's a lot. After speaking with @deepthi, here's how I think we should proceed on this:
This is all for now. Please start reviewing this PR at your leisure. :) |
I'm very excited for this. We've been blocked internally on grpc upgrades for a year because of this transitive etcd dependency. |
1be4aa3
to
f09ab33
Compare
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Description
Hiiii! This has been my work for this past week. We're trying to upgrade ProtoBuf and GRPC on Vitess.
Why?
Two good reasons:
protobuf
(the one atgithub.com/golang/protobuf
). The new APIv2 for ProtoBuf was released one year ago and although it has, huh, shortcomings, it's the law of the land and we need to eventually bite the bullet and upgrade eventually.What's breaking?
Good question! Lots of things. Let me make a list.
Easy breakages
protobuf
package has a new name altogether. Again this one is just a search and replace.protoc-gen-go
generator no longer has agrpc
plug-in. We're supposed to install a separateprotoc-gen-go-grpc
and run it explicitly to generate our GRPC files. I've fixed this in the Makefile.etcd
client is not backwards compatible with the new version of GRPC. Fortunately there's already a new release ofetcd
that supports the new GRPC. I've upgraded to the new release and fixed all the APIs (most notably, the SSL configuration API was removed, I've adjusted the code accordingly).UnimplementedService
struct which provides default methods for all the APIs (this provides forward compatibility when upgrading the.proto
s because the struct is autogenerated by the compiler). I've embedded it accordingly.Really bad shit
Protobuf messages cannot be passed by value: this is huge and the reason for about 60% of all the changes in this PR. The Protobuf messages that the new compiler generates have embedded metadata fields that cannot be copied. Why? Honestly I'm kind of unsure of the rationale, but our amici at Google have embedded a
[0]sync.Mutex
field on each generated message, which, whilst occupying no actual space on the generated code, will ensure that any linters orgo vet
screams loudly about copying a Mutex. So no more copying of Protobuf data structures.proto.Clone
must be used explicitly now -- I've adjusted the whole codebase accordingly.Textual serialization is no longer stable: this is the source of the other 40% of all changes in this PR. It is actually two different issues:
Serializing ProtoText and ProtoJSON randomly introduces whitespace in the output. I'm not even joking. It just changes indentation/spacing randomly so you cannot do textual comparison of the output. Why? Because fuck me, that's why. I assume this a forward looking contract so they can optimize serialization down the road by e.g. changing the ordering of the serialized fields, which would also be valid. But really it's mostly about fucking with me.
If you've ever looked at our test suite, Vitess developers are actually very fond of textual comparisons, so the random whitespace in the output really breaks things. For this draft version of the PR I've replaced the ProtoBuf package with a fork which is identical except it disables the randomness. Disabling the randomness just requires a call to a
Disable
method, but such method is in aninternal
package so it cannot be called from Vitess. Again, this is all a very elaborate prank directly targeting me, so it makes sense from that point of view. Once the build is fully green, I'll evaluate what to do with the randomness issue; maybe I'll send a PR upstream or maybe I'll send a letter, Kaczynski style. Let's find out.The textual debug output has also changed. This is what you'd see when you call
fmt.Sprintf("%v", protobufMessage)
, it's a debug string that cannot be parsed back, but it's used extensively throughout our test suite for comparing the results. The whitespace of these strings has changed, and also notably, the<
separator that was used for dictionaries has been changed to{
(e.g. before:some_field:<a:1 b:"foo">
, after:some_field:{a:1 b:"foo"}
). In theory this should be something that is fully scoped to the test suite, because Vitess itself doesn't depend on the debug-formatting output of Protobuf messages (and it really shouldn't, because again, it's not parseable), but I've noticed that thewrangler
package is actually inserting this kind of output into a column in MySQL. I need to research this further, my understanding is that this column is for debug/logging purposes only, but I need to make sure, otherwise this would be a backwards incompatible change.Protobuf messages cannot be compared deeply: this is actually not a breaking change. Protobuf messages have never been safe to compare directly, because they contained
XXX_
fields that can change and break the comparison. Despite this, our test suite was usingassert.Equals
andrequire.Equals
on Protobuf messages extensively, because these fields weren't populated most of the time, so the comparisons were OK. The new version of the Protobuf compiler has replaced theXXX_
fields with a new implementation that actually changes all the time, so comparing objects directly breaks our unit tests all over the place. I've painstakingly fixed hundreds of these calls by hand, and I've probably missed some. This is the part of the upgrade I'm still actively working on. 😪Conclusions
This PR is massive and it's going to conflict with pretty much every other PR that is merged on Vitess. I'm going to keep rebasing during the next week with the hope that I can get this fully green and merged before next Friday.
Wish me luck.
Related Issue(s)
Checklist
Deployment Notes