Avoid attaching lossless buffer profiles for internal ports #97

Closed
wants to merge 10 commits

Conversation

vivekrnv
Owner

@vivekrnv vivekrnv commented Apr 26, 2024

What I did

  1. Internal ports of a smartswitch will not carry RDMA traffic, so we need not apply lossless buffer profiles or enable PFC on them. Refer to 6accbf5 for a better visualization of the changes in buffer_default_objects.j2 (an illustrative sketch of the guard follows the port details below).

  2. Update the Mellanox-SN4700-O28 SKU buffer profiles to match the following config. The topology is inferred from the t1-28-lag topology (https://github.com/sonic-net/sonic-mgmt/pull/9837).

Port configuration            Value
Breakout mode for each port   Defined in port mapping
Speed of the port             Defined in port mapping
Internal ports                Defined in port mapping

Buffer configuration                              Value
Shared headroom                                   Enabled
Shared headroom pool factor                       2
Dynamic buffer                                    Disabled
Uplinks/downlinks in the static buffer scenario   8 1x400G uplinks and 20 1x400G downlinks

Port Mapping

Ports   Mode
1-8     1x400G (uplinks)
9-28    1x400G (downlinks)
29-32   1x200G (internal ports connected to the DPUs)

Number of Uplinks / Downlinks:

T1 topology:
Length of downlink: 40m
Length of uplink: 300m
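
For illustration, here is a minimal sketch of the kind of guard item 1 adds when generating the default buffer objects. This is not the literal diff (see commit 6accbf5 for that); the PORT_ACTIVE loop, the 'Dpc' role check, and the cable_length variable are assumptions for illustration only.

{# Hedged sketch, not the actual template change: generate lossless PGs   #}
{# only for external ports; internal NPU-DPU ports fall back to the lossy #}
{# profile so no PFC headroom is reserved for them. Names are assumed.    #}
"BUFFER_PG": {
{% for port in PORT_ACTIVE %}
{%     if PORT[port].get('role') == 'Dpc' %}
    "{{ port }}|3-4": { "profile": "ingress_lossy_profile" }{{ "," if not loop.last }}
{%     else %}
    "{{ port }}|3-4": { "profile": "pg_lossless_{{ PORT[port].speed }}_{{ cable_length }}_profile" }{{ "," if not loop.last }}
{%     endif %}
{% endfor %}
}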

Work item tracking
  • Microsoft ADO (number only):

How I did it

How to verify it

  1. Unit Tests

  2. Verify IP traffic between DPU and NPU through internal ports after applying buffer profiles

root@r-leopard-79:/home/admin# show int status
      Interface                            Lanes    Speed    MTU    FEC    Alias             Vlan    Oper    Admin                                             Type    Asym PFC
---------------  -------------------------------  -------  -----  -----  -------  ---------------  ------  -------  -----------------------------------------------  ----------
    Ethernet224  224,225,226,227,228,229,230,231     200G   9100    N/A    etp29           routed      up       up                                DPU-NPU Data Port         off
    Ethernet232  232,233,234,235,236,237,238,239     200G   9100    N/A    etp30           routed      up       up                                DPU-NPU Data Port         off
    Ethernet240  240,241,242,243,244,245,246,247     200G   9100    N/A    etp31           routed      up       up                                DPU-NPU Data Port         off
    Ethernet248  248,249,250,251,252,253,254,255     200G   9100    N/A    etp32           routed      up       up                                DPU-NPU Data Port         off

Run config qos reload and verify Buffer tables:

"BUFFER_PG": {
       ....................
        "Ethernet224|3-4": {
            "profile": "ingress_lossy_profile"
        },
        "Ethernet232|3-4": {
            "profile": "ingress_lossy_profile"
        },
        "Ethernet240|3-4": {
            "profile": "ingress_lossy_profile"
        },
        "Ethernet248|3-4": {
            "profile": "ingress_lossy_profile"
        }
    },

"BUFFER_QUEUE": {
       "Ethernet224|3-4": {
            "profile": "q_lossy_profile"
        },
        "Ethernet232|3-4": {
            "profile": "q_lossy_profile"
        },
        "Ethernet240|3-4": {
            "profile": "q_lossy_profile"
        },
        "Ethernet248|3-4": {
            "profile": "q_lossy_profile"
        }
    },

"BUFFER_PORT_INGRESS_PROFILE_LIST": {
        "Ethernet224": {
            "profile_list": "ingress_lossy_profile"
        },
        "Ethernet232": {
            "profile_list": "ingress_lossy_profile"
        },
        "Ethernet240": {
            "profile_list": "ingress_lossy_profile"
        },
        "Ethernet248": {
            "profile_list": "ingress_lossy_profile"
        }
    },

"BUFFER_PORT_EGRESS_PROFILE_LIST": {
        "Ethernet224": {
            "profile_list": "egress_lossy_profile"
        },
        "Ethernet232": {
            "profile_list": "egress_lossy_profile"
        },
        "Ethernet240": {
            "profile_list": "egress_lossy_profile"
        },
        "Ethernet248": {
            "profile_list": "egress_lossy_profile"
        }
}

"PORT_QOS_MAP": {
       "Ethernet224": {
            "dscp_to_tc_map": "AZURE",
            "pfc_to_pg_map": "AZURE",
            "pfc_to_queue_map": "AZURE",
            "tc_to_pg_map": "AZURE",
            "tc_to_queue_map": "AZURE"
        }
}

"QUEUE": {
        "Ethernet224|3": {
            "scheduler": "scheduler.0"
        },
        "Ethernet224|4": {
            "scheduler": "scheduler.0"
        }
}
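
As an additional cross-check (not part of the original test plan), the same entries can be read back with sonic-db-cli; the key separators below follow the usual SONiC convention ('|' in CONFIG_DB, ':' in APPL_DB), and the exact output format may differ by image version:

# Confirm the internal port carries only the lossy PG profile in CONFIG_DB
sonic-db-cli CONFIG_DB HGETALL "BUFFER_PG|Ethernet224|3-4"
# Expected: {'profile': 'ingress_lossy_profile'}

# Confirm buffermgrd propagated the entry to APPL_DB
sonic-db-cli APPL_DB HGETALL "BUFFER_PG_TABLE:Ethernet224:3-4"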

swss.rec output for the traditional buffer model:

2024-04-30.21:51:29.193211|BUFFER_POOL_TABLE:egress_lossless_pool|SET|mode:dynamic|size:60817392|type:egress                                                                                                                           
2024-04-30.21:51:29.193238|BUFFER_POOL_TABLE:egress_lossy_pool|SET|mode:dynamic|size:50270208|type:egress                                                                                                                              
2024-04-30.21:51:29.193258|BUFFER_POOL_TABLE:ingress_lossless_pool|SET|mode:dynamic|size:50270208|type:ingress|xoff:5611520                                                                                                            
2024-04-30.21:51:29.223414|BUFFER_PROFILE_TABLE:pg_lossless_400000_300m_profile|SET|dynamic_th:0|pool:ingress_lossless_pool|size:38912|xoff:358400|xon:38912                                                                           
2024-04-30.21:51:29.223454|BUFFER_PROFILE_TABLE:egress_lossless_profile|SET|dynamic_th:7|pool:egress_lossless_pool|size:0                                                                                                              
2024-04-30.21:51:29.223474|BUFFER_PROFILE_TABLE:pg_lossless_400000_40m_profile|SET|dynamic_th:0|pool:ingress_lossless_pool|size:38912|xoff:144384|xon:38912                                                                            
2024-04-30.21:51:29.223490|BUFFER_PROFILE_TABLE:ingress_lossless_profile|SET|dynamic_th:7|pool:ingress_lossless_pool|size:0                                                                                                            
2024-04-30.21:51:29.223524|BUFFER_PROFILE_TABLE:ingress_lossy_profile|SET|dynamic_th:3|pool:ingress_lossless_pool|size:0                                                                                                               
2024-04-30.21:51:29.223546|BUFFER_PROFILE_TABLE:egress_lossy_profile|SET|dynamic_th:7|pool:egress_lossy_pool|size:9216                                                                                                                 
2024-04-30.21:51:29.223564|BUFFER_PROFILE_TABLE:q_lossy_profile|SET|dynamic_th:3|pool:egress_lossy_pool|size:0 
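
As a rough sanity check of the shared headroom pool factor, a back-of-the-envelope sketch assuming the pool xoff is approximately the total per-PG xoff divided by the over-subscription factor; the deployed value of 5611520 also reflects cell-size rounding and related adjustments, so it will not match exactly:

# Hypothetical back-of-the-envelope check; port counts and xoff values are
# taken from the tables and the swss.rec output above.
uplinks, downlinks, pgs_per_port = 8, 20, 2    # lossless PGs 3-4 per port
xoff_300m, xoff_40m = 358400, 144384           # from the 400G lossless profiles

total_xoff = pgs_per_port * (uplinks * xoff_300m + downlinks * xoff_40m)
shared_headroom_pool_factor = 2
print(total_xoff // shared_headroom_pool_factor)  # 5754880, in the same
                                                  # ballpark as xoff:5611520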

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@vivekrnv vivekrnv marked this pull request as ready for review April 27, 2024 00:34
@vivekrnv
Owner Author

@stephenxs please review

@vivekrnv vivekrnv removed the request for review from dgsudharsan April 27, 2024 00:37
@vivekrnv vivekrnv marked this pull request as draft April 27, 2024 00:52
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
@vivekrnv vivekrnv marked this pull request as ready for review April 30, 2024 22:44
@stephenxs

stephenxs commented May 7, 2024

In addition,

  1. We should consider the DPC ports, since they consume buffers for lossy PGs, queues, management PGs, etc. Currently, such a scenario is not completely supported by the Excel calculator; I modified it and shared it in the review meeting chat.
  2. I suggest adding a unit test to cover the change in the templates. See "[Mellanox] Support DSCP remapping in dual ToR topo on T0 switch" (sonic-net/sonic-buildimage#12605) as a reference.

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
@vivekrnv
Owner Author

Added UTs and updated the values per the above comments. Please check.
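
For reference, a hedged sketch of what such a template unit test could look like; the fixture path and template name here are hypothetical, not the actual test added in this PR:

import json
import subprocess

def test_internal_ports_get_lossy_profiles():
    # Hypothetical: render the buffer template with sonic-cfggen against a
    # smartswitch sample config and inspect the generated BUFFER_PG entries.
    out = subprocess.check_output([
        'sonic-cfggen',
        '-j', 'sample_configs/smartswitch_t1.json',  # hypothetical fixture
        '-t', 'buffers_defaults_t1.j2',              # hypothetical template name
    ])
    buffer_pg = json.loads(out)['BUFFER_PG']
    for port in ('Ethernet224', 'Ethernet232', 'Ethernet240', 'Ethernet248'):
        assert buffer_pg[port + '|3-4']['profile'] == 'ingress_lossy_profile'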

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
@vivekrnv vivekrnv requested a review from stephenxs May 10, 2024 20:57
@vivekrnv vivekrnv closed this May 16, 2024