This repository has been archived by the owner on Feb 1, 2021. It is now read-only.
I'm using Swarm with three Swarm agents, one separate master, and etcd for cluster discovery.
I frequently create new overlay networks, run containers in them, and remove each network after its containers have been destroyed. Sometimes I observe stray network interfaces that should have been removed, such as this one:
10: veth2401f79: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
link/ether 46:ce:b7:56:55:f6 brd ff:ff:ff:ff:ff:ff
How to reproduce
I can reproduce the issue with the following script swarm-nettest.sh:
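The script itself did not survive here. The following is a hypothetical sketch of what it likely does, based on the description in this issue: the `create_network` and `run_unit` function names are mentioned below, but the network/container names, the `busybox` image, and the Swarm endpoint are my assumptions, not taken from the original script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of swarm-nettest.sh (the original attachment is not
# reproduced here). create_network and run_unit match the function names
# mentioned in the issue; everything else is assumed for illustration.
set -eu

export DOCKER_HOST="${DOCKER_HOST:-tcp://ci:2375}"  # Swarm master (assumed)

create_network() {
  # Create and immediately remove an overlay network; no containers attached.
  docker network create -d overlay "nettest-$1" >/dev/null
  docker network rm "nettest-$1" >/dev/null
}

run_unit() {
  # Create an overlay network, attach a container, then tear both down.
  docker network create -d overlay "nettest-$1" >/dev/null
  docker run -d --name "nettest-unit-$1" --net "nettest-$1" \
    busybox sleep 300 >/dev/null
  docker rm -f "nettest-unit-$1" >/dev/null
  docker network rm "nettest-$1" >/dev/null
}

# Run the given number of iterations, e.g. ./swarm-nettest.sh 10
if [[ -n "${1:-}" ]]; then
  for i in $(seq 1 "$1"); do
    run_unit "$i"
  done
fi
```

Per the observations further down, looping over `run_unit` leaves stray interfaces behind, while looping over `create_network` alone does not.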
Before running the script, I had the following network interfaces on the Swarm agents:
$ ansible -i cluster.list ad-sim -a 'ip link'
ad-sim01 | SUCCESS | rc=0 >>
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:50:56:a0:4c:30 brd ff:ff:ff:ff:ff:ff
3: docker_gwbridge: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:e7:22:c0:5a brd ff:ff:ff:ff:ff:ff
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:ca:f5:b0:8f brd ff:ff:ff:ff:ff:ff
6: veth84940a6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether 0a:96:51:a4:06:ea brd ff:ff:ff:ff:ff:ff
ad-sim02 | SUCCESS | rc=0 >>
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:50:56:a0:3f:cc brd ff:ff:ff:ff:ff:ff
3: docker_gwbridge: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:20:80:c3:0b brd ff:ff:ff:ff:ff:ff
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:ee:9f:1f:a4 brd ff:ff:ff:ff:ff:ff
6: veth2296a98: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether 5e:97:19:2a:64:8b brd ff:ff:ff:ff:ff:ff
ad-sim03 | SUCCESS | rc=0 >>
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:50:56:a0:4c:91 brd ff:ff:ff:ff:ff:ff
3: docker_gwbridge: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:f9:36:95:83 brd ff:ff:ff:ff:ff:ff
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:0c:2c:bb:15 brd ff:ff:ff:ff:ff:ff
6: veth9fc6eaf: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether 4a:0a:86:1c:06:de brd ff:ff:ff:ff:ff:ff
After running ./swarm-nettest.sh 10, I have the following network interfaces on the Swarm agents:
$ ansible -i cluster.list ad-sim -a 'ip link'
ad-sim01 | SUCCESS | rc=0 >>
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:50:56:a0:4c:30 brd ff:ff:ff:ff:ff:ff
3: docker_gwbridge: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:e7:22:c0:5a brd ff:ff:ff:ff:ff:ff
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:ca:f5:b0:8f brd ff:ff:ff:ff:ff:ff
6: veth84940a6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether 0a:96:51:a4:06:ea brd ff:ff:ff:ff:ff:ff
ad-sim02 | SUCCESS | rc=0 >>
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:50:56:a0:3f:cc brd ff:ff:ff:ff:ff:ff
3: docker_gwbridge: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:20:80:c3:0b brd ff:ff:ff:ff:ff:ff
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:ee:9f:1f:a4 brd ff:ff:ff:ff:ff:ff
6: veth2296a98: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether 5e:97:19:2a:64:8b brd ff:ff:ff:ff:ff:ff
9: vethef60919: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
link/ether 02:42:0a:00:02:02 brd ff:ff:ff:ff:ff:ff
10: veth2401f79: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
link/ether 46:ce:b7:56:55:f6 brd ff:ff:ff:ff:ff:ff
ad-sim03 | SUCCESS | rc=0 >>
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:50:56:a0:4c:91 brd ff:ff:ff:ff:ff:ff
3: docker_gwbridge: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:f9:36:95:83 brd ff:ff:ff:ff:ff:ff
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
link/ether 02:42:0c:2c:bb:15 brd ff:ff:ff:ff:ff:ff
6: veth9fc6eaf: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
link/ether 4a:0a:86:1c:06:de brd ff:ff:ff:ff:ff:ff
9: veth71b9c51: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
link/ether 02:42:0a:00:03:02 brd ff:ff:ff:ff:ff:ff
10: vethaf86582: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
link/ether 9e:97:6b:cb:71:f8 brd ff:ff:ff:ff:ff:ff
As you can see, ad-sim02 and ad-sim03 now carry additional veth* interfaces that are DOWN.
All containers (apart from the Swarm agent containers themselves) and all overlay networks have been removed:
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cacea96e9632 swarm "/swarm join --addr=5" About an hour ago Up About an hour 2375/tcp ad-sim03/swarm
41e0cb36a871 swarm "/swarm join --addr=5" About an hour ago Up About an hour 2375/tcp ad-sim02/swarm
3e43affbeda1 swarm "/swarm join --addr=5" About an hour ago Up About an hour 2375/tcp ad-sim01/swarm
$ docker network ls
NETWORK ID NAME DRIVER
7a6cc1bfb574 ad-sim02/none null
cc4063de5fe6 ad-sim02/host host
a0d4c009773d ad-sim03/bridge bridge
4e13f2513e0c ad-sim03/none null
bd7469ecff2d ad-sim03/host host
76bdb767f678 ad-sim03/docker_gwbridge bridge
608e07808654 ad-sim02/bridge bridge
1540b7162de8 ad-sim02/docker_gwbridge bridge
d2e7ea54885a ad-sim01/docker_gwbridge bridge
8920890377ec ad-sim01/none null
a17c8b04a5a3 ad-sim01/host host
1ac2ce472b9a ad-sim01/bridge bridge
Interestingly, if I call create_network instead of run_unit in the script, the additional interfaces are all removed properly. This suggests the leak is somehow related to attaching containers to, and detaching them from, the network.
If I run the script repeatedly, I end up with hundreds of stray virtual network interfaces. At that point, creating a new overlay network takes 10-20 seconds.
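As a stop-gap, the leftovers can be spotted by filtering `ip link` output for veth devices stuck in DOWN state with the `noop` qdisc, which is the signature of the stray interfaces shown above. The `find_stray_veths` helper name is mine, not from the script:

```shell
# Filter `ip link` output (read from stdin) down to the names of veth
# interfaces in DOWN state with the noop qdisc -- the pattern of the
# leftovers described in this issue.
find_stray_veths() {
  awk -F': ' '$2 ~ /^veth/ && /qdisc noop/ && /state DOWN/ {
    sub(/@.*/, "", $2)   # strip any @ifN peer suffix
    print $2
  }'
}
```

Running `ip link | find_stray_veths` on an agent prints the candidate names, which can then be removed by hand with `ip link delete <name>` until the underlying leak is fixed.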
Additional information
$ ansible -i cluster.list all -a 'uname -a'
ad-sim01 | SUCCESS | rc=0 >>
Linux ad-sim01 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
ci | SUCCESS | rc=0 >>
Linux ci 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
ad-sim02 | SUCCESS | rc=0 >>
Linux ad-sim02 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
ad-sim03 | SUCCESS | rc=0 >>
Linux ad-sim03 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
ci is the Swarm master.
$ ansible -i cluster.list all -a 'docker -v'
ad-sim01 | SUCCESS | rc=0 >>
Docker version 1.10.3, build 20f81dd
ci | SUCCESS | rc=0 >>
Docker version 1.10.3, build 20f81dd
ad-sim02 | SUCCESS | rc=0 >>
Docker version 1.10.3, build 20f81dd
ad-sim03 | SUCCESS | rc=0 >>
Docker version 1.10.3, build 20f81dd
All nodes run swarm:latest (291cbe419fe6).
All nodes run with the following /etc/default/docker:
DOCKER_OPTS="
-H tcp://0.0.0.0:2375
-H unix:///var/run/docker.sock
--insecure-registry=ci:5000
--cluster-store=etcd://<ci_ip>:2379/
--cluster-advertise=eth0:2375
--dns <dns_ip>"
# If you need Docker to use an HTTP proxy, it can also be specified here.
export http_proxy="http://127.0.0.1:3128/"
export https_proxy="http://127.0.0.1:3128/"
export HTTP_PROXY="http://127.0.0.1:3128/"
export HTTPS_PROXY="http://127.0.0.1:3128/"
etcd also runs on ci with version 2.2.5 and the following config: