[Functional] [LAG] | When first interface is deleted from the portchannel with at least two ports, there is a traffic disruption for around 60 seconds #14381

Open
liorghub opened this issue Mar 22, 2023 · 1 comment
Labels: NVIDIA, Triaged (this issue has been triaged)

Comments

liorghub commented Mar 22, 2023

Description

The issue happens only once per interface. After interface1 is added back to the LAG, the same interface1 can be deleted and added again without any issues. However, the issue is then seen when interface2 is removed, and once interface2 has been added back, the issue occurs again on interface1.

It looks like the problem is related to the state in which the "master" LAG interface is deleted, after which traffic is lost for a while.

Setup description:

SW1 and SW2 are connected with two links.

Steps to reproduce the issue:

1. On SW1, configure the following:
    config int speed Ethernet32 10000
    config int speed Ethernet40 10000
    config portchannel add PortChannel0001
    config portchannel member add PortChannel0001 Ethernet32
    config portchannel member add PortChannel0001 Ethernet40
    config interface ip add PortChannel0001 1.0.0.1/24
    config interface ip add PortChannel0001 2001::1/64
2. On SW2, configure the following:
    config int speed Ethernet72 10000
    config int speed Ethernet80 10000
    config portchannel add PortChannel0001
    config portchannel member add PortChannel0001 Ethernet72
    config portchannel member add PortChannel0001 Ethernet80
    config interface ip add PortChannel0001 1.0.0.2/24
    config interface ip add PortChannel0001 2001::2/64
3. Send a continuous ping from SW2 to SW1, to the IP 1.0.0.1 or 2001::1.
4. While the ping is running, on SW1 delete the interface Ethernet32 (substitute the correct name for your setup) from the portchannel:
    config portchannel member del PortChannel0001 Ethernet32
5. Check that on SW2 the ping stops for around 60 seconds.
6. On SW1, add the interface Ethernet32 (substitute the correct name for your setup) back to the portchannel:
    config portchannel member add PortChannel0001 Ethernet32
7. Check that the issue is not reproduced and the traffic is not dropped.
8. Repeat steps 4-7 with interface2.
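To put a number on the stall seen in the ping step, the saved ping output can be scanned for the longest run of missing replies. The following is a minimal sketch, assuming Linux-style `ping` output with `icmp_seq=` fields and the default one-second interval; the function name `longest_reply_gap` and the sample text are made up for illustration and are not taken from the attached dumps.

```python
import re

# Only successful replies count; error lines such as
# "Destination Host Unreachable" also carry icmp_seq and must be skipped.
REPLY_RE = re.compile(r"bytes from .*icmp_seq=(\d+)")


def longest_reply_gap(ping_output: str) -> int:
    """Return the largest number of consecutive missing replies.

    With the default 1 s ping interval this is roughly the outage in seconds.
    """
    seqs = [int(m.group(1)) for m in REPLY_RE.finditer(ping_output)]
    longest = 0
    for prev, cur in zip(seqs, seqs[1:]):
        # Sequence numbers between two received replies were lost.
        longest = max(longest, cur - prev - 1)
    return longest


# Synthetic sample (hypothetical, for illustration only).
SAMPLE = """\
64 bytes from 1.0.0.1: icmp_seq=1 ttl=64 time=0.31 ms
64 bytes from 1.0.0.1: icmp_seq=2 ttl=64 time=0.29 ms
From 1.0.0.2 icmp_seq=30 Destination Host Unreachable
64 bytes from 1.0.0.1: icmp_seq=63 ttl=64 time=0.30 ms
64 bytes from 1.0.0.1: icmp_seq=64 ttl=64 time=0.28 ms
"""

gap = longest_reply_gap(SAMPLE)
```

A gap close to 60 on the first removal and close to 0 on later removals of the same interface would match the behavior described above.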

Describe the results you received:

Traffic is dropped for around 60 seconds

Describe the results you expected:

No traffic loss should be observed when one port is removed from the LAG

Output of show version:

SONiC-OS-202012_9_RC_1.3-23b38b2a7_Internal

Output of show techsupport:

sonic_dump_SW2.tar.gz
sonic_dump_SW1.tar.gz

Additional information you deem important (e.g. issue happens only occasionally):

saiarcot895 commented Mar 31, 2023

Can you add a test case for this in sonic-mgmt, or open an issue in sonic-mgmt to add a test case for this?
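One possible shape for such a test case is sketched below. It is not sonic-mgmt code: `check_member_removal`, `run_cmd`, `measure_outage_seconds`, and the one-second threshold are all hypothetical names and values chosen for illustration, and the real test would use whatever fixtures sonic-mgmt provides. Only the two `config portchannel member` commands come from the issue.

```python
from typing import Callable


def check_member_removal(
    run_cmd: Callable[[str], None],              # hypothetical: runs a CLI command on the DUT
    measure_outage_seconds: Callable[[], float],  # hypothetical: returns observed traffic loss
    portchannel: str,
    member: str,
    max_outage_seconds: float = 1.0,              # assumed threshold, not from the issue
) -> float:
    """Remove one LAG member under traffic and fail if the outage is too long."""
    run_cmd(f"config portchannel member del {portchannel} {member}")
    try:
        outage = measure_outage_seconds()
    finally:
        # Always restore the member so later tests start from a clean LAG.
        run_cmd(f"config portchannel member add {portchannel} {member}")
    if outage > max_outage_seconds:
        raise AssertionError(
            f"traffic lost for {outage:.1f}s after removing {member} from {portchannel}"
        )
    return outage
```

Because the issue reproduces only on the first removal of each interface, a test built this way would likely need to run once per member of a freshly created portchannel.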

qiluo-msft pushed a commit that referenced this issue Apr 7, 2023
… LAG (#14002)

#### Why I did it
When a port is removed from a LAG while traffic is running through the LAG, there is a traffic disruption of about 60 seconds.
Fixes issue #14381.

#### How I did it
The patch introduces a "port_removing" op and calls it right before the kernel is asked to remove the port.
The op is implemented in the LACP runner to disable the port, which causes a proper LACPDU to be sent.
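The ordering described here can be illustrated with a toy model. This is a sketch in Python, not the actual patch: the classes `Team` and `LacpRunner`, the event log, and the event names are hypothetical, and only the op name `port_removing` comes from the commit message.

```python
class LacpRunner:
    """Toy stand-in for an LACP runner (hypothetical, not the real implementation)."""

    def __init__(self, log):
        self.log = log
        self.enabled = {}

    def port_added(self, port):
        self.enabled[port] = True

    def port_removing(self, port):
        # Disable the port while it can still transmit, so the partner
        # learns about the change from an LACPDU instead of a timeout.
        self.enabled[port] = False
        self.log.append(("lacpdu_sent", port))


class Team:
    """Toy stand-in for the daemon that owns the LAG and its runner."""

    def __init__(self, runner, log):
        self.runner = runner
        self.log = log
        self.ports = set()

    def add_port(self, port):
        self.ports.add(port)
        self.runner.port_added(port)

    def remove_port(self, port):
        # The new op is optional: runners that do not implement it are skipped.
        hook = getattr(self.runner, "port_removing", None)
        if hook is not None:
            hook(port)
        # Only after the hook has run is the "kernel" asked to drop the port.
        self.log.append(("kernel_remove", port))
        self.ports.discard(port)


log = []
team = Team(LacpRunner(log), log)
team.add_port("Ethernet32")
team.add_port("Ethernet40")
team.remove_port("Ethernet32")
```

In this model the log always records the LACPDU event before the kernel removal, which is the ordering the commit message describes.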

#### How to verify it
Set up a LAG between two switches.
Configure the LAGs as router ports and assign IP addresses.
On switch A, ping the IP address of the LAG on switch B.
On switch B, while the ping is running, remove a port from the LAG.
Verify that the ping does not stop.
vmittal-msft added the Triaged, MSFT, and NVIDIA labels and removed the MSFT label on Apr 12, 2023
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this issue Apr 16, 2023
… LAG (sonic-net#14002)

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this issue Apr 19, 2023
… LAG (sonic-net#14002)

mssonicbld pushed a commit that referenced this issue Apr 19, 2023
… LAG (#14002)
