Static Lag Support Libteam fixes for lb mode load-balance #12360

Open
wants to merge 2 commits into
base: master

@skannan-sonic (Contributor) commented Oct 11, 2022

Why I did it

SONiC does not currently support static port channels. This pull request makes the libteam changes needed to support a static port channel in load-balance mode.
Related: sonic-net/SONiC#1039
Dependent pull requests:

https://github.com/sonic-net/sonic-buildimage/pull/12360
https://github.com/sonic-net/sonic-swss/pull/2486
https://github.com/sonic-net/sonic-utilities/pull/2436

How I did it

teamd (libteam) goes to 100% CPU when a static LAG is configured in load-balance mode. The cause is the following upstream change:
jpirko/libteam@deadb5b
That change was later reverted upstream by:
jpirko/libteam@61efd6d
SONiC, however, still downloads the older libteam source that contains the change, so the CPU spike appears whenever teamd is configured with the loadbalance option for a static LAG.

Reverting the change fixes the issue. @madhukar-kamarapu from Broadcom provided the above history.
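
For context, the kind of teamd configuration that triggers this code path can be sketched as below. This is a minimal illustration, not the configuration SONiC actually generates: the helper name `build_static_lag_config` is hypothetical, and the `tx_hash` values are an assumption. Only the `device` and `runner.name` keys and the `loadbalance` runner name come from the standard teamd configuration format.

```python
import json


def build_static_lag_config(device, tx_hash=("eth", "ipv4", "ipv6")):
    """Return a teamd JSON config string for a static LAG.

    Hypothetical helper for illustration only. The tx_hash default is an
    assumption, not a value taken from this pull request.
    """
    config = {
        "device": device,
        "runner": {
            # The loadbalance runner does no LACP negotiation, which is
            # what makes the port channel "static".
            "name": "loadbalance",
            "tx_hash": list(tx_hash),
        },
    }
    return json.dumps(config, indent=4)


if __name__ == "__main__":
    print(build_static_lag_config("PortChannel1"))
```

The resulting JSON can be compared against the output of `teamdctl <device> config dump` on a device running this change.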

How to verify it

1. Create a static port channel with the static flag: pass
2. Verify static has option flag true or false: pass
3. Add a static member and see that the port channel is up: pass
4. Verify teamd is created with the loadbalance option by default: pass
5. Remove the last port channel member and check that the port channel is down: pass
6. Remove a port channel member and check that the port channel is still up: pass
7. Verify teamdctl config dump: pass
8. Verify teamdctl state dump: pass
9. Shut down the port channel and check the kernel state: pass
10. No-shutdown the port channel and check the kernel state: pass
11. Check that the show output matches the review comment: pass

    root@sonic:~# show inter port
    Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available,
           S - selected, D - deselected, * - not synced
      No.  Team Dev      Protocol     Ports
    -----  ------------  -----------  ------------
        1  PortChannel1  NONE(A)(Up)  Ethernet0(S)
        2  PortChannel2  NONE(A)(Up)  Ethernet8(S)
        4  PortChannel4  NONE(A)(Dw)

12. teamnl is set to loadbalance: pass
13. Save and reload, then verify that the port channel is up: pass
14. docker restart teamd: pass

    teamd stopped
    swss stopped
    syncd stopped

    swss started
    syncd started
    teamd started

15. Verify that teamd settles and does not hog the CPU at 100% usage: pass
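
The last check (teamd not pinned at 100% CPU) could be scripted roughly as follows. This is a sketch under stated assumptions, not the procedure used in this PR: it assumes a Linux `/proc` filesystem, that the processes are named `teamd`, and that the script runs somewhere those processes are visible. All function names are hypothetical.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime (clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before reading the numeric fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the state (field 3); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """Convert a tick delta over interval_s seconds into percent of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def find_pids(name):
    """Return PIDs whose /proc/<pid>/comm equals name (empty if no /proc)."""
    if not os.path.isdir("/proc"):
        return []
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while scanning
    return pids


def sample(pid, interval_s=5.0):
    """Measure CPU usage of pid over interval_s seconds."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        before = cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(f"/proc/{pid}/stat") as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)


if __name__ == "__main__":
    for pid in find_pids("teamd"):
        print(f"teamd pid {pid}: {sample(pid):.1f}% CPU")
```

With the faulty libteam change in place, a teamd instance using the loadbalance runner would be expected to report close to 100%; after the revert it should settle near zero once the port channel is up.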

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205


@madhukar-kamarapu

Hi,

Since the libteam GitHub repository already has a fix for this issue, we had better not apply a SONiC patch for it (100% CPU for the loadbalance runner). Instead, we can use the latest libteam version in SONiC.
Note: Maintaining additional patches is always an overhead.

Currently, SONiC uses the 1.30-1 debian version of libteam: https://github.com/sonic-net/sonic-buildimage/blob/master/src/libteam/Makefile#L28

I checked the source of libteam 1.30-1; the teamd_per_port.c routine teamd_port_check_enable() has the faulty code:
https://launchpad.net/ubuntu/+source/libteam/1.30-1
https://launchpad.net/ubuntu/+archive/primary/+sourcefiles/libteam/1.30-1/libteam_1.30.orig.tar.xz

@skannan-sonic
Contributor Author

Hi Madhukar, thanks for the comment, but I think it would be risky to move to a newer libteam now, this close to the 202211 release.
