
Allow multiple handlers to support network plugins in swarm-mode #27287

Merged: 3 commits, Oct 20, 2016

Conversation

@mavenugo (Contributor) commented Oct 11, 2016

Currently the plugins pkg allows a single handler. This assumption breaks down if there are multiple listeners to a plugin of a certain Manifest, such as NetworkDriver or IpamDriver, when swarm-mode is enabled.

Now that we have libnetwork remote APIs supporting the AllocateNetwork and FreeNetwork calls required for swarm-mode support, and moby/swarmkit#1607 is merged, this PR is a revisit of #24172.

Also vendoring swarmkit to bring in moby/swarmkit#1607, to support global network plugins in swarm-mode.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
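
In outline, the change turns the single handler per manifest interface into a list, so that with swarm-mode enabled both the local driver registry and the cluster allocator can listen for the same NetworkDriver/IpamDriver plugins. A minimal self-contained sketch of the resulting pkg/plugins shape (Client here is a stand-in; only extpointHandlers, Handle, and the dispatch loop are taken from the diffs in this PR):

package plugins

// Client stands in for the real plugins.Client (the HTTP client that
// talks to the plugin's socket); details are omitted in this sketch.
type Client struct{}

// extpointHandlers maps a manifest interface (e.g. "NetworkDriver",
// "IpamDriver") to every registered callback instead of a single one.
var extpointHandlers = make(map[string][]func(string, *Client))

// Handle registers fn as one more handler for iface; earlier handlers
// are preserved rather than overwritten.
func Handle(iface string, fn func(string, *Client)) {
    extpointHandlers[iface] = append(extpointHandlers[iface], fn)
}

// dispatch runs on plugin activation: every registered handler fires.
func dispatch(iface, name string, client *Client) {
    for _, handler := range extpointHandlers[iface] {
        handler(name, client)
    }
}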

"strings"

"github.com/Sirupsen/logrus"
"github.com/docker/docker/api/errors"
"github.com/docker/docker/api/types"
"github.com/docker/docker/api/types/network"
clustertypes "github.com/docker/docker/daemon/cluster/provider"
"github.com/docker/docker/pkg/plugins"
Contributor:

We should stop importing pkg/plugins, since that's pluginv1 only. Using pkg/plugingetter with an object that implements the PluginGetter interface, such as daemon.pluginStore, is the right way to get plugins v2 and fall back to v1.
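
For reference, a sketch of the interface being recommended. The method set follows docker's plugin getter package of that era (Get, GetAllByCap, and Handle all appear elsewhere in this thread), but treat the exact signatures as assumptions:

package getter

import "github.com/docker/docker/pkg/plugins"

// CompatPlugin abstracts over v1 (pkg/plugins) and v2 (plugin store) plugins.
type CompatPlugin interface {
    Client() *plugins.Client
    Name() string
}

// PluginGetter is what daemon.pluginStore implements; resolving through
// it finds plugins v2 first and falls back to v1.
type PluginGetter interface {
    Get(name, capability string, mode int) (CompatPlugin, error)
    GetAllByCap(capability string) ([]CompatPlugin, error)
    Handle(capability string, callback func(string, *plugins.Client))
}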

Contributor:

I'll look into how this import rule can be enforced at build time.

Contributor Author:

Got it, thanks @anusha-ragunathan. That also answers @mrjana's question about the expectations.

        pluginList = append(pluginList, network.Type())
        pluginMap[network.Type()] = true
    }
    remotes, _ := plugins.GetAll("NetworkDriver")
Contributor:

daemon.pluginStore implements plugingetter. Use GetAllByCap()
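
A minimal usage sketch of that suggestion, building on the PluginGetter shape sketched earlier in the thread. networkDriverNames is a hypothetical helper; only GetAllByCap and the pluginList/pluginMap bookkeeping come from the code under review:

// networkDriverNames enumerates network drivers through the plugin store
// instead of pkg/plugins.GetAll, de-duplicating against drivers already
// collected from the controller.
func networkDriverNames(pg PluginGetter, pluginList []string, pluginMap map[string]bool) []string {
    remotes, err := pg.GetAllByCap("NetworkDriver")
    if err != nil {
        return pluginList
    }
    for _, p := range remotes {
        if !pluginMap[p.Name()] {
            pluginMap[p.Name()] = true
            pluginList = append(pluginList, p.Name())
        }
    }
    return pluginList
}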

Contributor Author:

👍

 	if !handled {
 		continue
 	}
-	handler(p.name, p.client)
+	for _, handler := range handlers {
+		handler(p.name, p.client)
+	}
Contributor:

The pluginv2 equivalent is in the plugin enable() -> CallHandler() path: https://github.com/docker/docker/blob/master/plugin/store/store_experimental.go. This, along with the handler definition at https://github.com/docker/docker/blob/master/plugin/store/defs.go#L18, needs to be updated.
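
The handler-definition update there is presumably analogous to the pkg/plugins change in this PR; a sketch in the same diff style (the exact defs.go contents are an assumption, not quoted from the file):

-	handlers map[string]func(string, *plugins.Client)
+	handlers map[string][]func(string, *plugins.Client)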

Contributor Author:

Done. PTAL.

@anusha-ragunathan (Contributor):

FAIL: docker_cli_external_volume_driver_unix_test.go:372: DockerExternalVolumeSuite.TestExternalVolumeDriverRetryNotImmediatelyExists

[d55622b6e8b5f] waiting for daemon to start
[d55622b6e8b5f] daemon started
docker_cli_external_volume_driver_unix_test.go:404:
    c.Assert(s.ec.activations, checker.Equals, 1)
... obtained int = 2
... expected int = 1

There's an additional activation call to the plugin, due to the added GetAll() in Handle and GetAllByCap() in GetNetworkDriverList.

@anusha-ragunathan (Contributor):

TestDockerNetworkConnectDisconnectToStoppedContainer panic:

goroutine 29283 [select, 5 minutes]:
net/http.(*persistConn).roundTrip(0xc4212b6200, 0xc4205d8be0, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/http/transport.go:1840 +0x93b
net/http.(*Transport).RoundTrip(0xc4206acff0, 0xc4206ad2c0, 0xc4206acff0, 0x0, 0x0)
    /usr/local/go/src/net/http/transport.go:380 +0x4ee
net/http.send(0xc4206ad2c0, 0x197a920, 0xc4206acff0, 0x0, 0x0, 0x0, 0x8, 0xc420026d40, 0x410318)
    /usr/local/go/src/net/http/client.go:256 +0x15f
net/http.(*Client).send(0xc42070f580, 0xc4206ad2c0, 0x0, 0x0, 0x0, 0xc420026d40, 0x0, 0x1)
    /usr/local/go/src/net/http/client.go:146 +0x102
net/http.(*Client).doFollowingRedirects(0xc42070f580, 0xc4206ad2c0, 0x12badf8, 0x3, 0x197e501, 0xc420026040)
    /usr/local/go/src/net/http/client.go:528 +0x5e5
net/http.(*Client).Do(0xc42070f580, 0xc4206ad2c0, 0x0, 0x197e5a0, 0xc420026040)
    /usr/local/go/src/net/http/client.go:184 +0x1ea
github.com/docker/docker/integration-cli.(*Daemon).StartWithLogFile(0xc4200795c0, 0xc420026d00, 0x0, 0x0, 0x0, 0xc42070f6d8, 0x1)
    /go/src/github.com/docker/docker/integration-cli/daemon.go:229 +0xfb4
github.com/docker/docker/integration-cli.(*Daemon).Start(0xc4200795c0, 0x0, 0x0, 0x0, 0x2, 0xc42070f801)
    /go/src/github.com/docker/docker/integration-cli/daemon.go:141 +0x2ab
github.com/docker/docker/integration-cli.(*Daemon).Restart(0xc4200795c0, 0x0, 0x0, 0x0, 0xc420015640, 0xc42070f8f8)
    /go/src/github.com/docker/docker/integration-cli/daemon.go:363 +0xbd
github.com/docker/docker/integration-cli.(*DockerNetworkSuite).TestDockerNetworkConnectDisconnectToStoppedContainer(0xc42041c8a0, 0xc4206aca50)
    /go/src/github.com/docker/docker/integration-cli/docker_cli_network_unix_test.go:1252 +0x334
reflect.Value.call(0x11c24c0, 0xc42041c8a0, 0x1e13, 0x11d9701, 0x4, 0xc42070ff30, 0x1, 0x1, 0x19c9720, 0x11c0280, ...)
    /usr/local/go/src/reflect/value.go:434 +0x5c8
reflect.Value.Call(0x11c24c0, 0xc42041c8a0, 0x1e13, 0xc42070ff30, 0x1, 0x1, 0xc4206acb40, 0xc4206206cf, 0x4)
    /usr/local/go/src/reflect/value.go:302 +0xa4
github.com/go-check/check.(*suiteRunner).forkTest.func1(0xc4206aca50)
    /go/src/github.com/docker/docker/vendor/src/github.com/go-check/check/check.go:816 +0x6b8
github.com/go-check/check.(*suiteRunner).forkCall.func1(0xc424188bd0, 0xc4206aca50, 0xc423d83740)
    /go/src/github.com/docker/docker/vendor/src/github.com/go-check/check/check.go:672 +0x7c
created by github.com/go-check/check.(*suiteRunner).forkCall
    /go/src/github.com/docker/docker/vendor/src/github.com/go-check/check/check.go:673 +0x251

@anusha-ragunathan (Contributor):

The goroutine hangs when the newly added GetAll() in Handle waits on activation. I was able to repro the TestDockerNetworkConnectDisconnectToStoppedContainer panic. Relevant backtrace (obtained via kill -USR1, not from the Jenkins log):

goroutine 1 [semacquire, 16 minutes]:
sync.runtime_notifyListWait(0xc420156510, 0x0)
    /usr/local/go/src/runtime/sema.go:267 +0x122
sync.(*Cond).Wait(0xc420156500)
    /usr/local/go/src/sync/cond.go:57 +0x80
github.com/docker/docker/pkg/plugins.(*Plugin).waitActive(0xc420158060, 0xc4205806c8, 0x410865)
    /go/src/github.com/docker/docker/pkg/plugins/plugins.go:151 +0x55
github.com/docker/docker/pkg/plugins.(*Plugin).implements(0xc420158060, 0x18245ae, 0xa, 0x1)
    /go/src/github.com/docker/docker/pkg/plugins/plugins.go:158 +0x2f
github.com/docker/docker/pkg/plugins.GetAll(0x18245ae, 0xa, 0xc420580820, 0xc420580830, 0x1, 0xc42015e000, 0x0)
    /go/src/github.com/docker/docker/pkg/plugins/plugins.go:292 +0x392
github.com/docker/docker/pkg/plugins.Handle(0x18245ae, 0xa, 0xc42040c060)
    /go/src/github.com/docker/docker/pkg/plugins/plugins.go:249 +0x139
github.com/docker/docker/plugin/store.(*Store).Handle(0xc420441800, 0x18245ae, 0xa, 0xc42040c060)
    /go/src/github.com/docker/docker/plugin/store/store.go:32 +0x3f
github.com/docker/docker/plugin/getter.(PluginGetter).Handle-fm(0x18245ae, 0xa, 0xc42040c060)
    /go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/ipams/remote/remote.go:37 +0x4d
github.com/docker/libnetwork/ipams/remote.Init(0x2272c20, 0xc42022fa80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/ipams/remote/remote.go:52 +0xec
github.com/docker/libnetwork.initIPAMDrivers(0xc42022fa80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/drivers_ipam.go:17 +0xcd
github.com/docker/libnetwork.New(0xc42022f900, 0x7, 0x8, 0xc420441800, 0xc4204bd9b0, 0xc42022f900, 0x7)
    /go/src/github.com/docker/docker/vendor/src/github.com/docker/libnetwork/controller.go:204 +0x6cc
github.com/docker/docker/daemon.(*Daemon).initNetworkController(0xc42023a8c0, 0xc420436380, 0xc4204bd9b0, 0x0, 0xc4204bd9b0, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/daemon_unix.go:621 +0xa7
github.com/docker/docker/daemon.(*Daemon).restore(0xc42023a8c0, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/daemon.go:252 +0xc97
github.com/docker/docker/daemon.NewDaemon(0xc420436380, 0x227f8a0, 0xc42015e7a0, 0x22727e0, 0xc42042eea0, 0x0, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/daemon.go:685 +0x2202
main.(*DaemonCli).start(0xc420411b60, 0x0, 0x183c33b, 0x17, 0xc420436380, 0xc420369f90, 0xc420241200, 0x0, 0x0)
    /go/src/github.com/docker/docker/cmd/dockerd/daemon.go:246 +0xffd
main.runDaemon(0x0, 0x183c33b, 0x17, 0xc420436380, 0xc420369f90, 0xc420241200, 0xe, 0x0)
    /go/src/github.com/docker/docker/cmd/dockerd/docker.go:75 +0xb2
main.newDaemonCommand.func1(0xc420406fc0, 0xc42018ad20, 0x0, 0xe, 0x0, 0x0)
    /go/src/github.com/docker/docker/cmd/dockerd/docker.go:41 +0x71
github.com/spf13/cobra.(*Command).execute(0xc420406fc0, 0xc42000c100, 0xe, 0xe, 0xc420406fc0, 0xc42000c100)
    /go/src/github.com/docker/docker/vendor/src/github.com/spf13/cobra/command.go:643 +0x26d
github.com/spf13/cobra.(*Command).ExecuteC(0xc420406fc0, 0x1648900, 0xc420369f01, 0xc420421880)
    /go/src/github.com/docker/docker/vendor/src/github.com/spf13/cobra/command.go:739 +0x377
github.com/spf13/cobra.(*Command).Execute(0xc420406fc0, 0xc420421880, 0xc42001a0b8)
    /go/src/github.com/docker/docker/vendor/src/github.com/spf13/cobra/command.go:692 +0x2b
main.main()
    /go/src/github.com/docker/docker/cmd/dockerd/docker.go:99 +0xe2

@anusha-ragunathan (Contributor):

@mavenugo: the TestDockerNetworkConnectDisconnectToStoppedContainer hang happens because GetAll() activates the plugin, but there's no Broadcast() to wake up the goroutine. Handle can simply access the already-populated storage.plugins array instead.

@@ -235,8 +249,8 @@ func Handle(iface string, fn func(string, *Client)) {

        handlers = append(handlers, fn)
        extpointHandlers[iface] = handlers
-       pa, _ := GetAll(iface)
-       for _, p := range pa {
+
+       for _, p := range storage.plugins {
                p.activated = false
        }
 }
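
For context on the hang itself, here is a minimal self-contained repro of the pattern in the backtrace (this is not the actual pkg/plugins code; all names are stand-ins): waitActive blocks on a sync.Cond, and nothing on the GetAll-from-Handle path ever calls Broadcast.

package main

import "sync"

type plugin struct {
    mu        sync.Mutex
    cond      *sync.Cond
    activated bool
}

func newPlugin() *plugin {
    p := &plugin{}
    p.cond = sync.NewCond(&p.mu)
    return p
}

// waitActive mirrors the blocked frame in the backtrace: it sleeps on
// the condition variable until the activated flag flips.
func (p *plugin) waitActive() {
    p.mu.Lock()
    defer p.mu.Unlock()
    for !p.activated {
        p.cond.Wait() // hangs forever if activate() never broadcasts
    }
}

// activate flips the flag; dropping the Broadcast below reproduces the
// deadlock described in this thread.
func (p *plugin) activate() {
    p.mu.Lock()
    p.activated = true
    p.mu.Unlock()
    p.cond.Broadcast() // the wake-up that was missing on the GetAll path
}

func main() {
    p := newPlugin()
    go p.activate()
    p.waitActive()
}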

@anusha-ragunathan (Contributor):

GetNetworkDriverList still suffers from this problem, due to its call to GetAllByCap(). We could potentially have a method that simply accesses storage.plugins.
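
A sketch of such a method (the types are simplified stand-ins for pkg/plugins internals, and loadedByCap is a hypothetical name): it walks the already-populated registry and never blocks on activation.

package plugins

import "sync"

// Plugin and storage are simplified stand-ins for the pkg/plugins
// definitions; implements is filled in once a plugin has been activated.
type Plugin struct {
    name       string
    implements []string
}

var storage = struct {
    sync.Mutex
    plugins map[string]*Plugin
}{plugins: make(map[string]*Plugin)}

// loadedByCap returns plugins already known to implement capability,
// skipping (rather than waiting on) plugins not yet activated.
func loadedByCap(capability string) []*Plugin {
    storage.Lock()
    defer storage.Unlock()
    var out []*Plugin
    for _, p := range storage.plugins {
        for _, iface := range p.implements {
            if iface == capability {
                out = append(out, p)
                break
            }
        }
    }
    return out
}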

Signed-off-by: Madhu Venugopal <madhu@docker.com>
@mavenugo (Contributor Author) commented Oct 15, 2016

@anusha-ragunathan thanks for the pointers. I managed to resolve the issue without having to use the GetAll API, but I couldn't narrow down exactly what is going on with GetAll and activate. Either we need a different API as you suggested (#27287 (comment)), or we need to resolve the funkiness around lazy loading. I think we should handle that independently.

Now that moby/swarmkit#1607 is merged, I realized it would be good to bring in swarmkit along with this PR and add an integration test (IT) to validate the behavior introduced by this PR.

@mavenugo mavenugo changed the title Allow plugins to have multiple handlers Allow plugins to have multiple handlers to support network plugins in swarm-mode Oct 15, 2016
@mavenugo mavenugo changed the title Allow plugins to have multiple handlers to support network plugins in swarm-mode Allow multiple plugin handlers to support network plugins in swarm-mode Oct 15, 2016
@mavenugo mavenugo changed the title Allow multiple plugin handlers to support network plugins in swarm-mode Allow multiple handlers to support network plugins in swarm-mode Oct 15, 2016
@mavenugo (Contributor Author):

@mrjana @aaronlehmann can you PTAL at the swarmkit vendoring and the IT for e2e global network-plugin support in swarm-mode?

@anusha-ragunathan as discussed, the other failures are resolved now. I opened #27411 to add plugin-v2-specific ITs (which are unrelated to this PR).

@anusha-ragunathan (Contributor):

@mavenugo : What about resetting activation as discussed before? https://github.com/docker/docker/pull/24172/files#r69201630

Currently the plugins pkg allows a single handler. This assumption breaks down if there are multiple listeners to a plugin of a certain Manifest, such as NetworkDriver or IpamDriver, when swarm-mode is enabled.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
As of moby/swarmkit#1607, swarmkit honors global network plugins while allocating network resources.

This IT covers the e2e integration between libnetwork, swarmkit, and the docker engine to support global network plugins in swarm-mode.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
@mavenugo (Contributor Author):

@anusha-ragunathan yes, that is still applicable. I missed that case while updating the patch. It is fixed now. PTAL.

BTW, I did observe some funkiness with the GetAllByCap() and GetAll() APIs. I will work with you offline to understand the intended behavior.

@anusha-ragunathan (Contributor):

LGTM

@thaJeztah (Member):

ping @vieux PTAL

@mavenugo (Contributor Author):

Also @dongluochen FYI: this completes the global-scoped network plugin (v1) support for swarm-mode.
Plugin-v2 support for swarm-mode requires some work in swarmkit, which I believe @anusha-ragunathan is looking at.

@vieux (Contributor) commented Oct 19, 2016

LGTM. Ping @aaronlehmann @aluzzardi: any known issues with this swarmkit vendoring?

    handlers = append(handlers, fn)
    extpointHandlers[iface] = handlers
    for _, p := range storage.plugins {
        p.activated = false
    }
Contributor:

Question: why should all the plugins be deactivated here?

Contributor Author:

Contributor:

That makes sense. Thanks.
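
The rationale, inferred from the Handle() diff earlier in the thread and the linked #24172 discussion rather than quoted from the reply above, annotated on the lines in question:

// A plugin activated before this Handle() call has already dispatched
// to the old handler set and would never see fn. Clearing activated
// forces the next lookup to re-run activation, which re-dispatches to
// every registered handler, including the newly added one.
for _, p := range storage.plugins {
    p.activated = false
}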

@dongluochen (Contributor):

sgtm (not a maintainer)

@aaronlehmann (Contributor):

LGTM

Windows failure looks unrelated.

Merging.
