-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPS: upgrade grpc to v1.68.0 #136278
base: master
Are you sure you want to change the base?
DEPS: upgrade grpc to v1.68.0 #136278
Conversation
142b2a1
to
27e5b9d
Compare
re: the The most self-contained way to see this is
These checks were introduced in grpc/grpc-go#7184. If we start n3 with
Now, due to mixed version constraints, we definitely have to bypass this check. We can't reliably inject an env var in our customer environments, and we can't mutate What I don't understand why the old release isn't getting this right. After all, we're using gRPC, and gRPC fills this in by default - both now and back in v1.56.3. The way this should work is as follows. Assume we're the "master" binary, i.e. cockroach with gRPC 1.56.3. Our gRPC server has its credentials defined here: if !rpcCtx.ContextOptions.Insecure {
tlsConfig, err := rpcCtx.GetServerTLSConfig()
if err != nil {
return nil, sii, err
}
grpcOpts = append(grpcOpts, grpc.Creds(credentials.NewTLS(tlsConfig)))
} which calls into this // NewTLS uses c to construct a TransportCredentials based on TLS.
func NewTLS(c *tls.Config) TransportCredentials {
tc := &tlsCreds{credinternal.CloneTLSConfig(c)}
tc.config.NextProtos = credinternal.AppendH2ToNextProtos(tc.config.NextProtos)
return tc
} note the I'll need to poke at this some more. |
CRDB at v1.68.0 fails to communicate with CRDB at v1.56.3 due to this check. This is strange, since CRDB uses gRPC throughout, and in both v1.56.3 and v1.68.0 uses `tls.NewConfig` which ensures that `NextProtos` always contains `h2`. gRPC introduced this check in grpc#7535. See: cockroachdb/cockroach#136278 (comment)
This ALPN thing is sorted out, see #136367. Seems benign - we just need to make sure we keep that check disabled for long enough to not have to interop with versions of CRDB that precede this PR. |
re: the memory blow-up seen here, I think I have a handle on this as well. var out mem.BufferSlice
_, err = io.Copy(mem.NewWriter(&out, pool), io.LimitReader(dcReader, int64(maxReceiveMessageSize)+1))
if err != nil { and if buf == nil {
size := 32 * 1024
if l, ok := src.(*LimitedReader); ok && int64(size) > l.N {
if l.N < 1 {
size = 1
} else {
size = int(l.N)
}
}
buf = make([]byte, size)
} There should be ways to avoid that, now that we're switching to a fork of gRPC already anyway, but I assume there'll be interest to fix this upstream, too. |
I added a patch (
|
This benchmarks simple unary requests across a gRPC server with CockroachDB-specific settings (snappy compression, etc).
Use an existing test service that also does streaming-streaming, which allows the benchmark to highlight the overhead of unary RPCs as implemented by gRPC.
These will fire after the gRPC bump in the subsequent commits.
from v1.56.3 to v1.68.0. TODO: changelog
Similar but different error: ``` link: package conflict error: google.golang.org/genproto/googleapis/cloud/location: package imports google.golang.org/genproto/googleapis/api/annotations was compiled with: @org_golang_google_genproto//googleapis/api/annotations:annotations but was linked with: @org_golang_google_genproto_googleapis_api//annotations:annotations ``` Compare with the original: ``` link: package conflict error: cloud.google.com/go/pubsub/apiv1/pubsubpb: package imports google.golang.org/genproto/googleapis/api/annotations was compiled with: @org_golang_google_genproto_googleapis_api//annotations:annotations but was linked with: @org_golang_google_genproto//googleapis/api/annotations:annotations ```
Error unchanged from last commit. I did remember to `./dev gen bazel`. ``` link: package conflict error: google.golang.org/genproto/googleapis/cloud/location: package imports google.golang.org/genproto/googleapis/api/annotations was compiled with: @org_golang_google_genproto//googleapis/api/annotations:annotations but was linked with: @org_golang_google_genproto_googleapis_api//annotations:annotations ```
…f68ea54 See googleapis/go-genproto#1015. Sadly grpc-gateway is incompatible with this version of `genproto`: ``` ERROR: no such package '@@org_golang_google_genproto//googleapis/api/httpbody': BUILD file not found in directory 'googleapis/api/httpbody' of external repository @@org_golang_google_genproto. Add a BUILD file to a directory to mark it as a package. ERROR: /private/var/tmp/_bazel_tbg/b1346cddcc70d57afdaa90f7f09f9b2c/external/com_github_grpc_ecosystem_grpc_gateway/runtime/BUILD.bazel:5:11: no such package '@@org_golang_google_genproto//googleapis/api/httpbody': BUILD file not found in directory 'googleapis/api/httpbody' of external repository @@org_golang_google_genproto. Add a BUILD file to a directory to mark it as a package. and referenced by '@@com_github_grpc_ecosystem_grpc_gateway//runtime:go_default_library' ``` Since we are on the final `grpc-gateway/v1` version already[^1], we'll have to make the leap to v2 to fix this. [^1]: https://github.com/grpc-ecosystem/grpc-gateway/releases/tag/v1.16.0
This also adds the bytestream and rpc packages from genproto which are required. Epic: none Release note: None
This works around the issue discussed in cockroachdb#136367. Apparently, we have some misconfiguration or bug in gRPC v1.56.3 that makes the gRPC server seem unable to properly support HTTP2. This effectively breaks communication between CRDB nodes at these two different gRPC versions. Switch to a fork that disables the check (there is no other way to disable it other than changing code).
Replacements unconditionally replace the dependency, so it's nice to point out that whatever the version in go.mod claims is not actually what is being used.
This is less efficient since it needs to scan the entire reader but our hand is forced by gRPC changing away from a byte slice.
See https://github.com/cockroachdb/grpc-go/releases/tag/v1.68.0-noalpncheck%2Bdecompsize: Re-instate (Decompressor).DecompressedSize optimization This is a) for parity with how gRPC v1.56.3 worked. But also it showed up as a giant regression in allocations, as we were now going through `io.Copy` which allocates a temporary 32KiB buffer. Our payloads are often much smaller.
cbe8f5b
to
1301656
Compare
TODO:
DecompressedSize
no longer being used (ref)From v1.56.3 to v1.68.0. Full commit list below1 created by script2.
The main benefit of the upgrade is that we can benefit from some great work they did to reuse memory (i.e. reduce memory pressure). Part of this is internal, but one significant new piece is exposed through the
CodecV2
interface, we currently implement currently aCodecV0
. They also moved to protov2 internally, so their defaultCodecV2
would be slow for us (due to the need to "dress up" the protov1 messages as v2, if it would work at all).Details
Here's where their default new codec transforms protov1 messages:
/encoding/proto/proto.go#L69-L70
gRPC internally operates on protoV2 messages. We would hand it v1 messages, meaning
messageV2Of
will call this code which wraps the message in an interface:I think there's reflection involved. This is probably not efficient. This conversion is new as of grpc/grpc-go#6965, shortly after gRPC moved to protoV2 internally.
Here's what CodecV2 looks like.
/encoding/encoding_v2.go#L31-L39
Note the
mem.BufferSlice
. So basically, whatever the implementation (that's our code) needs to allocate, it can put it into amem.BufferSlice
, which one can get like this:/encoding/proto/proto.go#L57-L58
then when gRPC is done with the buffer, it will release it back into the pool. This seems pretty nice, but it means that our actual proto unmarshaling code (rather, gogoprotos) would need to plumb down this buffer pool. In their default v2 codec, they run into this same issue with google-protobuf:
/encoding/proto/proto.go#L75-L80
Our codec is here (a legacy V1 codec):
/pkg/rpc/codec.go#L23-L42
Perf-related commits3 below.
mem
package to facilitate memory reuse grpc/grpc-go#7432RecvBufferPool
deactivation issues grpc/grpc-go#6766mem.BufferSlice
instead of[]byte
grpc/grpc-go#7356pretty.ToJSON
and move code outside of lock grpc/grpc-go#7132Closes #134971.
Epic: CRDB-43584
Footnotes
https://github.com/cockroachdb/cockroach/pull/136278#issuecomment-2503756022 ↩
https://gist.github.com/tbg/c518ba3844f94abf4fff826f13be5300 ↩
git log ^v1.56.3 v1.68.0 --grep pool --grep reuse --grep memory --grep perf --grep alloc --grep bench --oneline | sed -e 's/#/grpc\/grpc-go#/g' | sed -E -e 's~^([0-9a-f]+)~\[\1\](https://github.com/grpc/grpc-go/commit/\1)~' | sed -e 's/^/- /'
and also read the release notes for all intermediate releases. ↩