Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dynamic buffer][Mellanox] Fix issue: accumulative headroom can exceed limit in rare scenario #2020

Merged
merged 1 commit into from
Nov 16, 2021

Conversation

stephenxs
Copy link
Collaborator

@stephenxs stephenxs commented Nov 10, 2021

What I did
The egress mirror size is not taken into account when checking the accumulative headroom size, which causes the estimated accumulate headroom size to be greater than the real one and unable to be applied.

Why I did it
Bug fix

How I verified it
Run regression test.

Details if related
An example to explain the logic .
The current accumulative headroom is 90k, egress mirror headroom is 10k, the ASIC limitation of the port is 100k.
Now a user wants to change the port configuration which will make the accumulative headroom to be increased by 10k.
Now the accumulative headroom estimated by buffer manager is 100k, which doesn’t exceed the limitation. However in the ASIC the real accumulative headroom is 110k which does.

The egress mirror size is not accurate,
causing the estimated accumulative headroom size to be greater
than the real one

Signed-off-by: Stephen Sun <stephens@nvidia.com>
@stephenxs
Copy link
Collaborator Author

/azpw run

@mssonicbld
Copy link
Collaborator

/AzurePipelines run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@stephenxs
Copy link
Collaborator Author

stephenxs commented Nov 11, 2021

Just checked, it can be cherry-picked to 202106 and 202012 branches smoothly. So tags added.
It was tested based on the following commits
202106

commit a801b287eab9925200c1ad672eac057c2cf627ef
Author: Stephen Sun <stephens@nvidia.com>
Date:   Wed Nov 10 00:45:38 2021 +0000

    Fix issue: accumulative headroom can exceed limit in rare scenario
    
    The egress mirror size is not accurate,
    causing the estimated accumulative headroom size to be greater
    than the real one
    
    Signed-off-by: Stephen Sun <stephens@nvidia.com>

commit 15074ac4727d5a844155ca056ace785418450054
Author: Alpesh Patel <alps.oss@gmail.com>
Date:   Tue Nov 9 14:51:26 2021 -0500

    [sonic-swss]:enable unconfiguring PFC on last TC on a port (#1962)
    
    * enable unconfiguring PFC on last TC on a port
    Signed-off-by: Alpesh S Patel <alpesh@cisco.com>

202012

commit b97c5c6d2dd43bd4b9114d599b2db27af8ef34ec
Author: Stephen Sun <stephens@nvidia.com>
Date:   Wed Nov 10 00:45:38 2021 +0000

    Fix issue: accumulative headroom can exceed limit in rare scenario
    
    The egress mirror size is not accurate,
    causing the estimated accumulative headroom size to be greater
    than the real one
    
    Signed-off-by: Stephen Sun <stephens@nvidia.com>

commit 85230fe33994651de44833111446a6022c46cdf6
Author: Junchao-Mellanox <57339448+Junchao-Mellanox@users.noreply.github.com>
Date:   Thu Nov 4 22:12:36 2021 +0800

    [orchagent] Fix group name of port-buffer-drop in flexcounterorch.cpp (#1967)
    
    *Fix group name of port-buffer-drop in flexcounterorch.cpp

@liat-grozovik
Copy link
Collaborator

@neethajohn could you please help to sign off this PR?

@liat-grozovik liat-grozovik merged commit 24a615b into sonic-net:master Nov 16, 2021
@keboliu
Copy link
Collaborator

keboliu commented Nov 17, 2021

@judyjoseph would you please help to cherry-pick?

@stephenxs stephenxs deleted the fix-egress-mirror branch November 17, 2021 02:23
arlakshm pushed a commit that referenced this pull request Nov 18, 2021
…2020)

- What I did
The egress mirror size is not taken into account when checking the accumulative headroom size, which causes the estimated accumulate headroom size to be greater than the real one and unable to be applied.

- Why I did it
Bug fix

- How I verified it
Run regression test.

- Details if related
An example to explain the logic .
The current accumulative headroom is 90k, egress mirror headroom is 10k, the ASIC limitation of the port is 100k.
Now a user wants to change the port configuration which will make the accumulative headroom to be increased by 10k.
Now the accumulative headroom estimated by buffer manager is 100k, which doesn’t exceed the limitation. However in the ASIC the real accumulative headroom is 110k which does.
qiluo-msft pushed a commit that referenced this pull request Nov 20, 2021
…2020)

- What I did
The egress mirror size is not taken into account when checking the accumulative headroom size, which causes the estimated accumulate headroom size to be greater than the real one and unable to be applied.

- Why I did it
Bug fix

- How I verified it
Run regression test.

- Details if related
An example to explain the logic .
The current accumulative headroom is 90k, egress mirror headroom is 10k, the ASIC limitation of the port is 100k.
Now a user wants to change the port configuration which will make the accumulative headroom to be increased by 10k.
Now the accumulative headroom estimated by buffer manager is 100k, which doesn’t exceed the limitation. However in the ASIC the real accumulative headroom is 110k which does.
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
What I did
During config update, update of certain tables do demand service restart.
With multiple related updates are not grouped together, this might result in too many service restarts, which could fail with "hitting start limit". When that happens, call reset-failed, try to restart. If it fails again, take a pause and try to restart again.

How I did it
When service restart fails, call reset-failed, try, pause and then call service restart again.
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
…-net#2028)

What I did
Missed update from review comments in PR sonic-net#2020
s/os.system("sleep 10s")/time.sleep(10)/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants