-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Mellanox] Avoid attaching lossless buffer profiles for internal ports #18978
Conversation
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
/azpw ms_conflict |
1 similar comment
/azpw ms_conflict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this cannot be unified with ACS-MSN2700?
it is hard to maintain a separate file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for buffers_defaults_t0/t1 j2.
hard to maintain a separate file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already maintain a few different buffer_defaults_objects.j2 files for each scenario, like the file under 2700/D48C8 supports shared headroom and extra queues
the one under 2700/ACS-2700 does not support shared headroom.
So, we already maintain more than one file based on use case. Thus i'd prefer not coupling SmartSwitch changes with existing ones. It easy to maintain this way
buffers_defaults_t0/t1 j2. is anyway specific per SKU so it'll be different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lguohan
The main logic of buffer templates is implemented in buffer_defaults_objects.j2
.
we have had 2 different buffer_defaults_objects.j2
s. One for single ingress pool mode + shared headroom and the other for double ingress pool without shared headroom pool.
We do it in this way because it will make the template very difficult to understand and maintain if we combine them into one.
Now it's a similar scenario and we have another buffer_defaults_objects.j2
for the smart switch.
Putting them all together we have only 3 buffer_defaults_objects.j2
s.
As for the buffer_defaults_t0/t1.j2, they are very simple.
The main logic is to define the pool sizes and to invoke macros to define other buffer objects, like PGs, queues, etc.
So, it doesn't look like a challenge to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @lguohan, let me know what you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @lguohan, A gentle reminder
@lguohan let me know if you are ok to to merge following Vivek feedback. |
@vivekrnv is this needed for 202311 and should be considered as bug fix? |
hi @liat-grozovik , not required for 202311 |
@liat-grozovik , i really concerns on duplicating the templates, this makes the maintaining future maintaining efforts very challenging. |
@lguohan to signoff |
"pfcwd_sw_enable" : "3,4" | ||
"pfcwd_sw_enable" : "3,4", | ||
{% endif %} | ||
"tc_to_pg_map" : "AZURE", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tc_to_pg_map and pfc_to_queue_map is also not required for internal ports. Is this excluded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tc_to_pg_map is required for internal ports.
You mean pfc_to_pg_map
and pfc_to_queue_map
? Then yes, these two are not required but however having them wouldn't cause any issues IMO. We are configuring the map without enabling pfc on any of the pg's. Let me know if you think i should remove these two as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @vivekrnv. It's harmless to keep these mappings the same between internal ports and normal ports but the logic will be much simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vivekrnv @stephenxs lossless buffer profiles are not attached to internal ports, so lossless functionality is not possible on these internal ports.
If it is to keep the config uniform across all the ports, this is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the idea is to keep the maps uniform across all the ports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @kperumalbfn, Can we sign this off?
LGTM |
sonic-net#18978) [Mellanox] Avoid attaching lossless buffer profiles for internal ports (sonic-net#18978) Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Cherry-pick PR to 202405: #19768 |
What I did
Internal ports of smartswitch will not have RDMA traffic and so we need not apply lossless buffer profiles and need not enable PFC. Refer vivekrnv@6accbf5 for a better visualization of the changes in buffer_default_objects.j2
Update the Mellanox-SN4700-O28 SKU buffer profiles to match the following config. Topology inferred from t1-28-lag topology (Add t1-28-lag topology sonic-mgmt#9837) .
Port Mapping
Number of Uplinks / Downlinks:
T1 topology:
Length of downlink: 40m
Length of uplink: 300m
Work item tracking
How I did it
How to verify it
Unit Tests
Verify IP traffic between DPU and NPU through internal ports after applying buffer profiles
Run config qos reload and verify Buffer tables:
swss.rec output for traditional mode
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)