Fix issue: sometimes PFC WD unable to create zero buffer pool (#2164)
What I did

Fix issue: sometimes PFC WD is unable to create the zero buffer pool.

On some platforms, an ingress/egress zero buffer profile is applied to the PG and queue that are under PFC storm. The zero buffer profile is created based on a zero buffer pool. However, creating the zero buffer pool sometimes fails because too many buffer pools already exist in the system. In some cases a zero buffer pool already exists on the system for reclaiming buffer, and PFC WD can reuse it to create its zero buffer profile.

Why I did it

Fix the issue by sharing the zero buffer pool between PFC WD and buffer orchagent.

How I verified it

- Manual test
- Ran the PFC WD test and the PFC WD warm reboot test
- Ran the unit tests

Details if related

The detailed flow is as follows.

PFC storm detected:
- If there is a zero pool in PFC WD's cache, create the zero buffer profile based on it.
- Otherwise, fetch the zero pool from buffer orchagent.
  - If one is returned, create the zero buffer profile based on it.
  - Otherwise, create a zero buffer pool and notify buffer orch about it.
  - In both cases, PFC WD should notify buffer orch to increase the reference count of the zero buffer pool.

Buffer orchagent:
- When creating the zero buffer pool, check whether one already exists. If so, skip the SAI API create_buffer_pool and only increase the reference count.
- Before removing the zero buffer pool, decrease and check the reference count. If it is not zero after the decrement, skip the SAI API destroy_buffer_pool.
- When PFC WD decreases the reference count, remove the zero buffer pool if the count becomes zero.

Notes

We do not use the object_reference_map infrastructure to track the dependency because:
- It assumes the dependency will eventually be removed if an object is removed. That is NOT true in this scenario, because a PFC storm can last for a relatively long time and even persist across a warm reboot.
- The interfaces differ.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
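
The reference-counting flow described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the orchagent code: the class and method names (`SaiStub`, `BufferOrchModel`, `PfcWdModel`, `lock_zero_pool`, and so on) are hypothetical, the SAI calls are simulated by a stub that records call names, and the model assumes PFC WD takes a reference only when it first obtains the pool (the fetched and the newly created cases), not when it reuses its cache.

```python
class SaiStub:
    """Hypothetical stand-in for SAI; it only records which calls were made."""

    def __init__(self):
        self.calls = []

    def create_buffer_pool(self):
        self.calls.append("create_buffer_pool")
        return object()  # opaque pool handle

    def destroy_buffer_pool(self, pool):
        self.calls.append("destroy_buffer_pool")


class BufferOrchModel:
    """Toy model of buffer orchagent's bookkeeping for the zero pool."""

    def __init__(self, sai):
        self.sai = sai
        self.zero_pool = None
        self.ref_count = 0

    def get_zero_pool(self):
        return self.zero_pool

    def set_zero_pool(self, pool):
        # PFC WD created the pool itself and tells buffer orch about it.
        self.zero_pool = pool

    def lock_zero_pool(self):
        self.ref_count += 1

    def unlock_zero_pool(self):
        # Destroy the pool only when the last reference goes away.
        self.ref_count -= 1
        if self.ref_count == 0:
            self.sai.destroy_buffer_pool(self.zero_pool)
            self.zero_pool = None

    def create_zero_pool(self):
        # Skip the SAI create call if a zero pool already exists.
        if self.zero_pool is None:
            self.zero_pool = self.sai.create_buffer_pool()
        self.lock_zero_pool()

    def remove_zero_pool(self):
        # The SAI destroy call is skipped while references remain.
        self.unlock_zero_pool()


class PfcWdModel:
    """Toy model of PFC WD's handling of the zero pool."""

    def __init__(self, sai, buffer_orch):
        self.sai = sai
        self.buffer_orch = buffer_orch
        self.cached_pool = None

    def on_storm_detected(self):
        if self.cached_pool is None:
            pool = self.buffer_orch.get_zero_pool()
            if pool is None:
                pool = self.sai.create_buffer_pool()
                self.buffer_orch.set_zero_pool(pool)
            # Fetched or newly created: take a reference either way.
            self.buffer_orch.lock_zero_pool()
            self.cached_pool = pool
        return self.cached_pool  # the zero profile would be built on this

    def release_zero_pool(self):
        if self.cached_pool is not None:
            self.cached_pool = None
            self.buffer_orch.unlock_zero_pool()


sai = SaiStub()
orch = BufferOrchModel(sai)
wd = PfcWdModel(sai, orch)
orch.create_zero_pool()   # buffer orch creates the pool
wd.on_storm_detected()    # PFC WD reuses it, no second SAI create
orch.remove_zero_pool()   # still referenced by PFC WD, so not destroyed
wd.release_zero_pool()    # last reference dropped, pool destroyed
```

In this model the pool is created once and destroyed once regardless of which side asks first, which is the property the shared reference count is meant to provide.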
Showing 6 changed files with 531 additions and 52 deletions.