PortChannel is not fully cleared when teamd is stopped #6199

Closed · bingwang-ms opened this issue Dec 14, 2020 · 2 comments · Fixed by #6537
Labels: Issue for 202012, Master Branch Quality, P1 (Priority of the issue, lower than P0)

bingwang-ms (Contributor) commented:

Description
The issue is detected by test_po_cleanup. The test case fails consistently because PortChannel0023 still exists in the system after teamd is stopped.

root@str2-dx010-acs-6:~# docker ps
CONTAINER ID        IMAGE                                COMMAND                  CREATED             STATUS              PORTS               NAMES
0ea4c76db094        docker-sonic-telemetry:latest        "/usr/local/bin/supe…"   21 hours ago        Up 11 hours                             telemetry
3b0af78d0836        docker-sonic-mgmt-framework:latest   "/usr/local/bin/supe…"   21 hours ago        Up 13 hours                             mgmt-framework
5e9fcf98bbf7        docker-lldp:latest                   "/usr/bin/docker-lld…"   21 hours ago        Up 12 hours                             lldp
89a3e6ac7c79        docker-fpm-frr:latest                "/usr/bin/docker_ini…"   21 hours ago        Up 12 hours                             bgp
533534724f32        docker-platform-monitor:latest       "/usr/bin/docker_ini…"   21 hours ago        Up 12 hours                             pmon
ed7461159a59        docker-database:latest               "/usr/local/bin/dock…"   21 hours ago        Up 13 hours                             database

root@str2-dx010-acs-6:~# ip link show | grep Por
313: PortChannel0023: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9100 qdisc noqueue state UP mode DEFAULT group default qlen 1000
338: Ethernet56: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9100 qdisc pfifo_fast master PortChannel0023 state UP mode DEFAULT group default qlen 1000
339: Ethernet60: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9100 qdisc pfifo_fast master PortChannel0023 state UP mode DEFAULT group default qlen 1000
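As a side note, a leftover team device like this can be deleted by hand with `ip link delete` (a manual cleanup shown for illustration only; it is not part of the eventual fix):

```
# Assumed manual cleanup, for illustration only: remove the stale
# PortChannel0023 netdev that teamd left behind.
sudo ip link delete PortChannel0023
```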

Steps to reproduce the issue (a consolidated script version follows the list):

  1. Run `systemctl stop teamd` to stop the teamd service on the DUT.
  2. Check whether all port channels are cleared:
     ip link show | grep PortChannel
  3. Observe that PortChannel0023 still exists.
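A minimal repro sketch combining the steps above (a hypothetical script, not taken from the test suite; assumes a SONiC DUT with teamd running):

```
#!/bin/bash
# Stop the teamd service, then verify that no PortChannel netdevs survive.
sudo systemctl stop teamd

if ip link show | grep -q PortChannel; then
    echo "FAIL: PortChannel interfaces still present after stopping teamd"
    ip link show | grep PortChannel
else
    echo "PASS: all PortChannel interfaces cleared"
fi
```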

Describe the results you received:
The last port channel, PortChannel0023, still exists after teamd is stopped.

Describe the results you expected:
All port channels should be cleared from the system.


**Output of `show version`:**
SONiC Software Version: SONiC.HEAD.117-31ce20ac
Distribution: Debian 10.7
Kernel: 4.19.0-9-2-amd64
Build commit: 31ce20ac
Build date: Thu Dec 10 14:08:25 UTC 2020
Built by: johnar@jenkins-worker-22

Platform: x86_64-cel_seastone-r0
HwSKU: Celestica-DX010-C32
ASIC: broadcom
ASIC Count: 1
Serial Number: DX010F2B118711MS100005
Uptime: 02:51:29 up 12:51,  1 user,  load average: 3.87, 4.21, 4.24

Docker images:
REPOSITORY                    TAG                 IMAGE ID            SIZE
docker-teamd                  HEAD.117-31ce20ac   2972c56e4f1d        493MB
docker-teamd                  latest              2972c56e4f1d        493MB
docker-nat                    HEAD.117-31ce20ac   ababcb6276c8        496MB
docker-nat                    latest              ababcb6276c8        496MB
docker-orchagent              HEAD.117-31ce20ac   b34e14ffeee9        507MB
docker-orchagent              latest              b34e14ffeee9        507MB
docker-fpm-frr                HEAD.117-31ce20ac   201df8c78679        509MB
docker-fpm-frr                latest              201df8c78679        509MB
docker-sflow                  HEAD.117-31ce20ac   d93a01c63b33        494MB
docker-sflow                  latest              d93a01c63b33        494MB
docker-snmp                   HEAD.117-31ce20ac   78de9463d358        486MB
docker-snmp                   latest              78de9463d358        486MB
docker-dhcp-relay             HEAD.117-31ce20ac   2ed7f4da821a        456MB
docker-dhcp-relay             latest              2ed7f4da821a        456MB
docker-sonic-mgmt-framework   HEAD.117-31ce20ac   2a62a96c5201        610MB
docker-sonic-mgmt-framework   latest              2a62a96c5201        610MB
docker-router-advertiser      HEAD.117-31ce20ac   06941a16f7b2        450MB
docker-router-advertiser      latest              06941a16f7b2        450MB
docker-platform-monitor       HEAD.117-31ce20ac   6d3429098bf1        574MB
docker-platform-monitor       latest              6d3429098bf1        574MB
docker-lldp                   HEAD.117-31ce20ac   b225be0aff21        490MB
docker-lldp                   latest              b225be0aff21        490MB
docker-database               HEAD.117-31ce20ac   93455de91801        449MB
docker-database               latest              93455de91801        449MB
docker-sonic-telemetry        HEAD.117-31ce20ac   b9d5a761e9ce        524MB
docker-sonic-telemetry        latest              b9d5a761e9ce        524MB
docker-syncd-brcm             HEAD.117-31ce20ac   ab416184372d        542MB
docker-syncd-brcm             latest              ab416184372d        542MB
bingwang-ms changed the title from "PortChannel is not fully cleared from system" to "PortChannel is not fully cleared when teamd is stopped" on Dec 14, 2020
daall added the "P1: Priority of the issue, lower than P0" label on Dec 23, 2020
yxieca (Contributor) commented Jan 6, 2021

@judyjoseph please take a look.

judyjoseph (Contributor) commented:

@bingwang-ms, I see this issue in the master image; investigating further.

Meanwhile, I tried ~20 times with the 201911 build and could not reproduce the issue there. Do you have a similar finding?

lguohan pushed a commit that referenced this issue Jan 24, 2021
…annels. (#6537)

The port channels were not getting cleaned up because the cleanup activity was taking more than 10 seconds, which is the default Docker stop timeout, after which a SIGKILL is sent.
Fixes #6199
To check whether it also works for this issue in 201911: #6503

This issue is seen significantly more on the master branch than on 201911 because the port channel cleanup takes more time in master. Tested on a DUT with 8 port channels:

master

    admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
    real    0m15.599s
    user    0m0.061s
    sys     0m0.038s
SONiC 201911.v58

    admin@str-s6000-acs-8:~$ time sudo systemctl stop teamd
    real    0m5.541s
    user    0m0.020s
    sys     0m0.028s
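For context: when the teamd container is stopped, Docker sends SIGTERM and, once the stop timeout (10 seconds by default) elapses, SIGKILL. As an illustration only (this is not the actual change made in #6537), the timeout can be raised per invocation:

```
# Illustration only, not the change made in #6537.
# docker stop sends SIGTERM first, then SIGKILL after the stop timeout
# (10 s by default); a longer timeout gives teamd time to delete its
# port channels gracefully before being killed.
docker stop --time 60 teamd
```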
daall pushed a commit that referenced this issue Feb 6, 2021
…annels. (#6537)