[teamd][warmreboot] teamd fails to restore PortChannel state sporadically after warm reboot #3649

Closed
stepanblyschak opened this issue Oct 22, 2019 · 3 comments · Fixed by #4109

Comments

@stepanblyschak
Collaborator

stepanblyschak commented Oct 22, 2019

teamd.txt

Description

Steps to reproduce the issue:

  1. sudo warm-reboot -v
  2. Repeat until you see the LAG flapping issue
  3. zgrep ERR teamd.log

Part of teamd log

Oct 22 14:04:03.827944 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: Ethernet116: Failed to find "enabled" option.
Oct 22 14:04:03.827984 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: get_ifinfo_list: check_call_change_handers failed
Oct 22 14:04:03.828018 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: Failed to get interface information list.
Oct 22 14:04:03.828056 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: Failed to refresh interface information list.
Oct 22 14:04:03.828096 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: Ethernet116: Team refresh failed.
Oct 22 14:04:03.829959 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: Failed to init port priv.
Oct 22 14:04:03.833389 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: ioctl SIOCDELMULTI failed.
Oct 22 14:04:03.839081 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: Port with interface index "76" is not part of this device.
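
For anyone triaging a larger log than the excerpt above, here is a minimal Python sketch of step 3 (`zgrep ERR teamd.log`) that also groups the errors by PortChannel. The regular expression is inferred only from the line layout shown in this issue, and the helper names (`open_log`, `parse_teamd_errors`, `errors_per_lag`) are hypothetical, not part of SONiC or libteam.

```python
import gzip
import re
from collections import Counter

# Assumed layout, taken from the excerpt in this issue:
# "<Mon> <day> <time> <host> <LEVEL> teamd#teamd_<PortChannel>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:.]+)\s+(?P<host>\S+)\s+(?P<level>[A-Z]+)\s+"
    r"teamd#teamd_(?P<lag>[^\[\s]+)\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)


def open_log(path):
    """Open a plain or gzip-compressed log file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")


def parse_teamd_errors(lines):
    """Return one dict per ERR line that matches the assumed layout."""
    errors = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERR":
            errors.append(match.groupdict())
    return errors


def errors_per_lag(lines):
    """Count ERR lines per PortChannel name."""
    return Counter(entry["lag"] for entry in parse_teamd_errors(lines))


# Two lines copied from the log excerpt in this issue.
SAMPLE = [
    'Oct 22 14:04:03.827944 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: '
    'Ethernet116: Failed to find "enabled" option.',
    'Oct 22 14:04:03.839081 arc-switch1004 ERR teamd#teamd_PortChannel0002[36]: '
    'Port with interface index "76" is not part of this device.',
]

if __name__ == "__main__":
    for name, count in errors_per_lag(SAMPLE).items():
        print(name, count)
```

To run it against a real file, pass the open file object instead of `SAMPLE`, for example `errors_per_lag(open_log(path))`, where `path` is wherever teamd.log lives on your system.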

Describe the results you received:

LAG flaps sometimes after warm reboot

Describe the results you expected:

teamd restores LAG

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**

```

SONiC Software Version: SONiC.HEAD.106-6cb445cb
Distribution: Debian 9.11
Kernel: 4.9.0-9-2-amd64
Build commit: 6cb445c
Build date: Mon Oct 21 08:11:44 UTC 2019
Built by: johnar@jenkins-worker-4

Platform: x86_64-mlnx_msn2700-r0
HwSKU: ACS-MSN2700
ASIC: mellanox
Serial Number: MT1822K07823
Uptime: 15:32:25 up 11 min, 1 user, load average: 3.37, 3.45, 2.38

```

**Attach debug file `sudo generate_dump`:**

```
(paste your output here)
```
@pavel-shirshov
Contributor

Please check #3725

@pavel-shirshov
Contributor

Please check #3753
I still can't reproduce the issue, with or without the latest change.

@stephenxs
Collaborator

Hi @pavel-shirshov,
We found that this issue can also occur after a cold reboot. In that case the error messages are the same.

lguohan pushed a commit that referenced this issue Feb 5, 2020
- What I did
Ported a fix from libteam master to our master.
Fixes #4070
Fixes #3649

- How I did it
Applied patch jpirko/libteam@c723737 from upstream.

- How to verify it
Build an image for your DUT and warm-reboot the DUT 10 times. Check that all PortChannels are up and that there are no error messages in teamd.log.
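
The verification procedure above could be automated roughly as follows. This is only a sketch: `verify_warm_reboots` and its three callables are hypothetical placeholders. How you trigger the warm reboot, collect ERR lines from teamd.log, and read PortChannel state depends on your test setup and is not specified in this thread.

```python
def verify_warm_reboots(reboot, collect_errors, lag_states, iterations=10):
    """Repeat the reboot/check cycle and report every iteration that failed.

    All three callables are assumptions supplied by the caller:
      reboot()         -- triggers a warm reboot and returns once the DUT is back
      collect_errors() -- returns the ERR lines found in teamd.log since the reboot
      lag_states()     -- returns {portchannel_name: True if up, False if down}
    """
    failures = []
    for iteration in range(1, iterations + 1):
        reboot()
        problems = []

        errors = collect_errors()
        if errors:
            problems.append(f"{len(errors)} ERR line(s) in teamd.log")

        down = sorted(name for name, up in lag_states().items() if not up)
        if down:
            problems.append("PortChannels down: " + ", ".join(down))

        if problems:
            failures.append((iteration, problems))
    return failures
```

An empty return value means every iteration passed both checks. Because the bug in this issue is sporadic, a clean run of 10 iterations lowers the likelihood that it is still present but does not prove it is gone.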
prsunny pushed a commit that referenced this issue Feb 11, 2020
abdosi pushed a commit that referenced this issue Feb 14, 2020
pphuchar pushed a commit to SONIC-DEV/sonic-buildimage that referenced this issue Mar 9, 2020
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this issue Apr 24, 2020
yxieca pushed a commit that referenced this issue Oct 12, 2020