Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vendor: update gRPC, protobuf #2452

Merged
merged 4 commits into from
May 15, 2018
Merged

Conversation

thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Nov 17, 2017

This updates the vendored packages for gRPC, protobuf, and re-generates (make generates)

Full diffs of the updated dependencies:

NOTE: this PR is built on top of #2451, so needs to be rebased if that one is merged; I thought keeping them separate was cleaner, but we can close the other PR if combining them is okay 😄

relates to #2448, #2444

ping @stevvooe @aaronlehmann @dperny @anshulpundir PTAL

@thaJeztah
Copy link
Member Author

Looks like I have some work to do;

# github.com/docker/swarmkit/api
api/ca.pb.go:971: undefined: metadata.FromContext
api/ca.pb.go:979: undefined: metadata.NewContext
api/ca.pb.go:1113: undefined: metadata.FromContext
api/ca.pb.go:1121: undefined: metadata.NewContext
api/control.pb.go:5846: undefined: metadata.FromContext
api/control.pb.go:5854: undefined: metadata.NewContext
api/dispatcher.pb.go:1589: undefined: metadata.FromContext
api/dispatcher.pb.go:1597: undefined: metadata.NewContext
api/health.pb.go:273: undefined: metadata.FromContext
api/health.pb.go:281: undefined: metadata.NewContext
api/health.pb.go:281: too many errors

@thaJeztah
Copy link
Member Author

Due to this change; grpc/grpc-go@596a6ac#diff-b3a9f1c1b5d5600f4924efe942b13fb0 updating now

@thaJeztah thaJeztah force-pushed the update-protobuf-grpc branch 2 times, most recently from c130a68 to 23df0f5 Compare November 17, 2017 15:03
@stevvooe
Copy link
Contributor

LGTM when the tests pass.

Let me know if you need help hunting down the correct context functions.

@thaJeztah
Copy link
Member Author

I replaced for metadata.FromContext(ctx) with metadata.FromIncomingContext(ctx), and metadata.NewContext() with metadata.NewOutgoingContext() (based on grpc/grpc-go@596a6ac#diff-b3a9f1c1b5d5600f4924efe942b13fb0),
but reading the changelog, grpc/grpc-go#1157 added separation of "incoming" and "outgoing" contexts (previously they were merged), so likely have to change some yes, as CI is failing:

time="2017-11-17T15:15:20Z" level=info msg="dialing to target with scheme: \"\"" module=grpc 
time="2

I'll do some digging to see where the context is changed, so the other one should be used 👍

@codecov
Copy link

codecov bot commented Nov 21, 2017

Codecov Report

Merging #2452 into master will decrease coverage by 0.02%.
The diff coverage is 46.15%.

@@            Coverage Diff             @@
##           master    #2452      +/-   ##
==========================================
- Coverage   61.78%   61.76%   -0.03%     
==========================================
  Files         134      134              
  Lines       21829    21821       -8     
==========================================
- Hits        13488    13478      -10     
- Misses       6887     6899      +12     
+ Partials     1454     1444      -10

@nishanttotla
Copy link
Contributor

@thaJeztah are you still working on this?

@thaJeztah
Copy link
Member Author

I was looking this afternoon to see if I could find where changes are needed, but didn't spot it directly 😓

Current failure is:

ok  	github.com/docker/swarmkit/manager/state/raft	75.762s	coverage: 70.8% of statements
time="2017-11-21T14:08:30Z" level=info msg="dialing to target with scheme: \"\"" module=grpc 
time="2017-11-21T14:08:30Z" level=info msg="could not get resolver for scheme: \"\"" module=grpc 
time="2017-11-21T14:08:30Z" level=info msg="dialing to target with scheme: \"\"" module=grpc 
time="2017-11-21T14:08:30Z" level=info msg="could not get resolver for scheme: \"\"" module=grpc 
time="2017-11-21T14:08:30Z" level=info msg="dialing to target with scheme: \"\"" module=grpc 
time="2017-11-21T14:08:30Z" level=info msg="could not get resolver for scheme: \"\"" module=grpc 
time="2017-11-21T14:08:30Z" level=info msg="dialing to target with scheme: \"\"" module=grpc 
time="2017-11-21T14:08:30Z" level=info msg="could not get resolver for scheme: \"\"" module=grpc 
time="2017-11-21T14:08:30Z" level=warning msg="no certificate expiration specified, using default" cluster.id=ris4q0hasft4l7oz06y5zym7t method="(*Server).UpdateRootCA" module=ca testHasExternalCA=false 
--- FAIL: TestCanRemoveMember (23.14s)
	Error Trace:	cluster_test.go:353
	Error:		Received unexpected error context deadline exceeded
		

	Error Trace:	cluster_test.go:355
	Error:		Expected nil, but got: &api.RaftMember{RaftID:0xd0b32a9451660d6, NodeID:"s95109a4smlcinl8vkjjyolv4", Addr:"127.0.0.1:37953", Status:api.RaftMemberStatus{Leader:false, Reachability:1, Message:""}}
		

	Error Trace:	cluster_test.go:356
	Error:		Not equal: 3 (expected)
			        != 2 (actual)
		

FAIL
coverage: 76.0% of statements
FAIL	github.com/docker/swarmkit/manager/state/raft/membership	23.241s
make: *** [coverage] Error 1

if rb == nil {
// TODO(bar) return error when DNS becomes the default (implemented and
// registered by DNS package).
grpclog.Infof("could not get resolver for scheme: %q", target.Scheme)
Copy link
Member Author

@thaJeztah thaJeztah Nov 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Info messages come from here; they're just "info", so may not be related to CI failing (although missing DNS could probably cause the timeout)?

Locally, tests pass (but same warnings are printed);

screen shot 2017-11-21 at 20 33 38

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we're having these issues lately. I've restarted the CI.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL, now it's even worse; definitely possible it's related to this PR, but don't have a clue yet where the problem would be

@crosbymichael
Copy link
Contributor

Can we get some maintainers looking at this problem? We are trying to fix this but there are some larger issues that swarm it panicing over version bumps.

@dmcgowan
Copy link
Member

We need to get moby/moby#36895 into the next release, what can we do to help get this unblocked?

@nishanttotla
Copy link
Contributor

We'll do this on priority.

Restarting CI one more time to judge if the failure is due to flakiness, but will look into the diff to try and figure out.

@dperny
Copy link
Collaborator

dperny commented May 11, 2018

@nishanttotla it needs to be rebased, just in case.

@thaJeztah
Copy link
Member Author

rebased

@nishanttotla
Copy link
Contributor

@thaJeztah it seems like you'll have to remove vendor/ folder and re-run vndr once to fix the CI for the current error. I think this is a CI issue related to the latest version of vndr.

@thaJeztah
Copy link
Member Author

Hm, let me try

@thaJeztah thaJeztah force-pushed the update-protobuf-grpc branch 2 times, most recently from e3b3cb8 to 2a14d4a Compare May 11, 2018 21:36
s, _ := status.FromError(err)
assert.Equal(t, codes.Internal, s.Code())
assert.Equal(t, codes.ResourceExhausted, s.Code())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change needed along with the other changes ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so (it was a while back that I opened this PR);I think they changed these errors

@dperny
Copy link
Collaborator

dperny commented May 14, 2018

i'm pouring through the logs on these test failures and i'm pretty confused. it seems to me that something may have changed in the internals of grpc, but i can't figure out what it would have been.

@thaJeztah
Copy link
Member Author

I did notice there are more recent versions since I created this PR; would it be worth updating to a more recent version?

@dperny
Copy link
Collaborator

dperny commented May 14, 2018

I'm unsure. It's certainly worth trying, to see if it will quietly start passing again. I'm digging into the root cause right now.

@anshulpundir
Copy link
Contributor

anshulpundir commented May 14, 2018

agreed. it makes sense to try upgrading to the latest version @thaJeztah

@dperny
Copy link
Collaborator

dperny commented May 14, 2018

I figured it out, 1 sec, verifying the fix.

@dperny
Copy link
Collaborator

dperny commented May 14, 2018

@thaJeztah I pushed to your branch a fix, I hope you don't mind. The problem was that we were string matching an error message, but the message changed.

}

// additionally, the error we're looking for is an internal error
if grpcStatus.Code() != codes.Internal {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering; does it return a specific .Code() that could be useful for detection (in addition to string-matching)?

// MaxMsgSize returns a ServerOption to set the max message size in bytes for inbound mesages.
// If this is not set, gRPC uses the default 4MB.
// MaxMsgSize returns a ServerOption to set the max message size in bytes the server can receive.
// If this is not set, gRPC uses the default limit. Deprecated: use MaxRecvMsgSize instead.
func MaxMsgSize(m int) ServerOption {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dperny Just saw this was marked deprecated; perhaps you can update https://github.com/docker/swarmkit/blob/8aa9c33bcdff9ea38fc79e0b1d054199917513f3/manager/manager.go#L272, which is using the deprecated function

t.Close()
return nil, connectionErrorf(true, err, "transport: %v", err)
return nil, connectionErrorf(true, err, "transport: failed to write window update: %v", err)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks indeed that ConnectionError doesn't have any additional information 😞

Copy link
Member Author

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Contributor

@anshulpundir anshulpundir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but a squash is needed.

@thaJeztah
Copy link
Member Author

but a squash is needed

commits looked sensibly grouped, but not a maintainer

@anshulpundir
Copy link
Contributor

anshulpundir commented May 15, 2018

  1. Update grpc, protobuf
  2. Regenerate protobuf
  3. Update tests for changes in gRPC
  4. Fix breakage resulting from error string matching
  5. Update deprecated grpc function

I think 4, 5 should be squashed into 1. I don't think these commits mean much individually.

thaJeztah and others added 4 commits May 15, 2018 17:27
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Fixes an issue resulting from string matching the error result of grpc
methods. Adds comments documenting the fix, as well as explaining how to
avoid this issue in the future.

Also updates manager/manager.go to use the function MaxRecvMsgSize, instead
of the deprecated equivalent MaxMsgSize.

Signed-off-by: Drew Erny <drew.erny@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah
Copy link
Member Author

rebased and squashed

@thaJeztah
Copy link
Member Author

Oh, I may have mis-interpreted; you want the changes into the same commit as the revendor? (in moby/moby we usually keep local changes separate from the vendor commit)

@anshulpundir
Copy link
Contributor

(in moby/moby we usually keep local changes separate from the vendor commit)

I guess that makes sense. Thanks for squashing the last two!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants