[Dynamic Buffer Calc] Enhance the logic to check maximum headroom exceeding to cover corner scenarios #2763

stephenxs · 2023-05-05T09:22:27Z

What I did
Enhance the logic to check maximum headroom exceeding to cover corner scenarios

Currently, the check for exceeding the maximum headroom works well when a user changes any buffer configuration in the dynamic buffer model: it prevents problematic configurations from being applied to the ASIC.
However, it can fail when a problematic configuration is already present in config_db.json and config reload is executed. To cover this scenario, the following changes are needed:

1. Take the pending PG keys and buffer profiles into account when calculating the maximum headroom.
   - Existing buffer PGs and buffer profiles can still be in the pending queue, because a large number of notifications must be handled during system initialization, which takes time.
2. Take lossy PGs into account when calculating the maximum headroom.
   - A non-default lossy PG can be added by the user in config_db.json.
3. Pass the PG to the Lua plugin when refreshing PGs for a port.
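
A rough sketch of the accounting described above, written in Python for illustration only. It is not the code in this PR (the real check lives in the buffer manager and the Lua plugin); `PortBufferState`, `accumulated_headroom`, `headroom_within_limit`, and the PG keys and sizes in the usage note are all hypothetical. It only shows the idea: pending PGs and profiles count together with applied ones, lossy PGs are not skipped, and the PG being added is passed in explicitly.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PortBufferState:
    """Hypothetical per-port view of buffer PGs (not the buffer manager's real types)."""
    max_headroom: int
    # PG key -> profile name, already applied to the ASIC
    applied_pgs: Dict[str, str] = field(default_factory=dict)
    # PG key -> profile name, received but still waiting in the pending queue
    pending_pgs: Dict[str, str] = field(default_factory=dict)


def accumulated_headroom(
    port: PortBufferState,
    applied_profiles: Dict[str, int],
    pending_profiles: Dict[str, int],
    new_pg: Optional[str] = None,
    new_profile: Optional[str] = None,
) -> int:
    """Sum the profile sizes of every PG on the port, lossless or lossy."""
    # Pending profiles are visible to the check as well as applied ones.
    sizes = {**applied_profiles, **pending_profiles}
    # Pending PGs are counted together with applied PGs.
    pgs = {**port.applied_pgs, **port.pending_pgs}
    # The PG under evaluation is passed in explicitly and overrides any
    # existing entry with the same key.
    if new_pg is not None and new_profile is not None:
        pgs[new_pg] = new_profile
    return sum(sizes[profile] for profile in pgs.values())


def headroom_within_limit(
    port: PortBufferState,
    applied_profiles: Dict[str, int],
    pending_profiles: Dict[str, int],
    new_pg: Optional[str] = None,
    new_profile: Optional[str] = None,
) -> bool:
    """Return True if the accumulated headroom does not exceed the port limit."""
    total = accumulated_headroom(
        port, applied_profiles, pending_profiles, new_pg, new_profile
    )
    return total <= port.max_headroom
```

For example, with a limit of 100, an applied lossless PG of size 60, and a pending lossy PG of size 30, the port is still within the limit, but adding one more PG with the size-30 profile is rejected because the total would be 120.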

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Why I did it

Cover corner scenarios.

How I verified it

Manual test, vs test, regression test (dynamic buffer)

stephenxs · 2023-05-08T09:03:44Z

All test cases failed due to the following error, which should be fixed by sonic-net/sonic-utilities#2830

2023-05-08T08:09:14.5589099Z         # conn.request() calls http.client.*.request, not the method in
2023-05-08T08:09:14.5589452Z         # urllib3.request. It also calls makefile (recv) on the socket.
2023-05-08T08:09:14.5589655Z         try:
2023-05-08T08:09:14.5589756Z >           conn.request(
2023-05-08T08:09:14.5589871Z                 method,
2023-05-08T08:09:14.5590134Z                 url,
2023-05-08T08:09:14.5590232Z                 body=body,
2023-05-08T08:09:14.5590350Z                 headers=headers,
2023-05-08T08:09:14.5590462Z                 chunked=chunked,
2023-05-08T08:09:14.5590591Z                 preload_content=preload_content,
2023-05-08T08:09:14.5590721Z                 decode_content=decode_content,
2023-05-08T08:09:14.5590869Z                 enforce_content_length=enforce_content_length,
2023-05-08T08:09:14.5590986Z             )
2023-05-08T08:09:14.5591229Z E           TypeError: request() got an unexpected keyword argument 'chunked'

stephenxs · 2023-05-08T09:10:49Z

Cherry-pickable to 202211 based on the latest commit a484ab8684c45d56340bbb3a704ae6ca039e9282

liat-grozovik · 2023-05-08T18:29:32Z

/azp run Azure.sonic-swss

azure-pipelines · 2023-05-08T18:29:41Z

Azure Pipelines successfully started running 1 pipeline(s).

stephenxs · 2023-05-09T07:11:03Z

The same error as before

2023-05-09T04:05:18.8163888Z         # conn.request() calls http.client.*.request, not the method in
2023-05-09T04:05:18.8164229Z         # urllib3.request. It also calls makefile (recv) on the socket.
2023-05-09T04:05:18.8164485Z         try:
2023-05-09T04:05:18.8164690Z >           conn.request(
2023-05-09T04:05:18.8164887Z                 method,
2023-05-09T04:05:18.8165086Z                 url,
2023-05-09T04:05:18.8165277Z                 body=body,
2023-05-09T04:05:18.8165491Z                 headers=headers,
2023-05-09T04:05:18.8165720Z                 chunked=chunked,
2023-05-09T04:05:18.8165963Z                 preload_content=preload_content,
2023-05-09T04:05:18.8166233Z                 decode_content=decode_content,
2023-05-09T04:05:18.8166525Z                 enforce_content_length=enforce_content_length,
2023-05-09T04:05:18.8166774Z             )
2023-05-09T04:05:18.8167228Z E           TypeError: request() got an unexpected keyword argument 'chunked'
2023-05-09T04:05:18.8170916Z 
2023-05-09T04:05:18.8171542Z /usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py:496: TypeError

keboliu · 2023-05-10T05:28:06Z

@stephenxs maybe this PR can help to unblock the checker? #2767

stephenxs · 2023-05-10T05:32:03Z

> @stephenxs maybe this PR can help to unblock the checker? #2767

Thanks, @keboliu, but I already did that this morning and the test still failed in my case. Checking.

stephenxs · 2023-05-10T14:17:34Z

/azpw run

mssonicbld · 2023-05-10T14:17:36Z

/AzurePipelines run

azure-pipelines · 2023-05-10T14:17:46Z

Azure Pipelines successfully started running 1 pipeline(s).

stephenxs · 2023-05-11T09:04:12Z

It turns out that the test failed on the Azure pipeline because the config_db updates were received there in a different order from the order in which they were received on my local vs testbed.
Added checks to guarantee the order.
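
A minimal sketch of the kind of check meant here, assuming the test can poll a database entry through a callable. `wait_for_fields` and `read_entry` are hypothetical names used only for illustration; this is not the vs test helper itself. The idea is to block until the previous update is visible before issuing the next one, so the test no longer depends on how quickly notifications are processed.

```python
import time
from typing import Callable, Dict


def wait_for_fields(
    read_entry: Callable[[], Dict[str, str]],
    expected: Dict[str, str],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll read_entry() until every expected field matches or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        entry = read_entry()
        # Succeed as soon as all expected fields carry the expected values.
        if all(entry.get(key) == value for key, value in expected.items()):
            return True
        # Give up once the deadline has passed.
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would apply one config_db update, call the helper on the resulting entry, and only then apply the next update.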

stephenxs · 2023-05-12T06:37:37Z

/azpw run

mssonicbld · 2023-05-12T06:37:39Z

/AzurePipelines run

azure-pipelines · 2023-05-12T06:37:49Z

Azure Pipelines successfully started running 1 pipeline(s).

stephenxs · 2023-05-12T06:42:48Z

Failed due to unrelated test cases. Retriggered.

2023-05-12T05:33:32.4509401Z test_virtual_chassis.py::TestVirtualChassis::test_chassis_add_remove_ports FAILED [ 65%]
2023-05-12T05:35:34.6375172Z test_vlan.py::TestVlan::test_VlanMemberLinkDown FAILED                   [ 80%]

stephenxs · 2023-05-25T00:54:00Z

@StormLiangMS, could you please cherry-pick this PR to 202211? Thanks.

stephenxs changed the title ~~Fix issue in maximum headroom checking~~ [QoS] Enhance the logic to check maximum headroom exceeding to cover corner scenarios May 8, 2023

stephenxs added the Request for 202211 Branch label May 8, 2023

keboliu mentioned this pull request May 8, 2023

[build] Fix base OS compilation issue caused by incompatibility of urllib with requests. sonic-net/sonic-utilities#2830

Merged

stephenxs marked this pull request as ready for review May 8, 2023 13:31

stephenxs requested review from neethajohn and prsunny as code owners May 8, 2023 13:31

stephenxs added 3 commits May 10, 2023 09:38

Fix issue in maximum headroom checking

87d3060

1. Take the pending PG keys into consideration when calculating the maximum headroom 2. Pass the PG to the Lua plugin when refreshing PGs for a port Signed-off-by: Stephen Sun <stephens@nvidia.com>

Check maximum headroom for lossy PG

a88a8bc

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Mock test to cover the maximum headroom exceeding checking enhancement

f1b589e

Signed-off-by: Stephen Sun <stephens@nvidia.com>

stephenxs force-pushed the fix-headroom-check branch from 5b9dd0f to f1b589e Compare May 10, 2023 01:38

Stablize the test using wait_for_field_match which can wait for more …

c726bbe

…seconds Signed-off-by: Stephen Sun <stephens@nvidia.com>

stephenxs added 3 commits May 11, 2023 03:50

Fix typo

8a053c8

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Add debug info

4feca89

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Detailed log

c12a9e5

Signed-off-by: Stephen Sun <stephens@nvidia.com>

stephenxs added 5 commits May 11, 2023 12:04

Guarantee the order

c7779a7

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Fix issue: wrong profile is take if no new_pg is provided

6fcd0b2

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Fix typo

4f6236f

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Add test case to cover the latest commit

6bad0d3

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Revert "Detailed log"

c36bd2a

This reverts commit c12a9e5.

neethajohn reviewed May 17, 2023

cfgmgr/buffer_check_headroom_mellanox.lua (resolved)

cfgmgr/buffer_check_headroom_mellanox.lua (outdated, resolved)

Remove redundant code

0c63232

Signed-off-by: Stephen Sun <stephens@nvidia.com>

neethajohn approved these changes May 18, 2023

stephenxs changed the title ~~[QoS] Enhance the logic to check maximum headroom exceeding to cover corner scenarios~~ [Dynamic Buffer Calc] Enhance the logic to check maximum headroom exceeding to cover corner scenarios May 19, 2023

neethajohn merged commit fe8c395 into sonic-net:master May 22, 2023

stephenxs deleted the fix-headroom-check branch May 22, 2023 20:29

StormLiangMS added the Included in 202211 Branch label May 25, 2023
