From 0669c622e592101a55f4305c8534092bebbb966d Mon Sep 17 00:00:00 2001
From: Arun Saravanan Balachandran <52521751+ArunSaravananBalachandran@users.noreply.github.com>
Date: Tue, 10 Nov 2020 18:56:37 +0000
Subject: [PATCH] Platform API for PCIe AER stats collection (#702)

Update the PCIed hld to add the platform API definition for PCIe AER stats collection
---
 doc/pcie-mon/pcie-monitoring-services-hld.md | 36 ++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/doc/pcie-mon/pcie-monitoring-services-hld.md b/doc/pcie-mon/pcie-monitoring-services-hld.md
index 83c3b9deba..ea411585fd 100644
--- a/doc/pcie-mon/pcie-monitoring-services-hld.md
+++ b/doc/pcie-mon/pcie-monitoring-services-hld.md
@@ -1,6 +1,6 @@
 # SONiC PCIe Monitoring services HLD #
 
-### Rev 0.3 ###
+### Rev 0.4 ###
 
 ### Revision
  | Rev | Date | Author | Change Description |
@@ -10,6 +10,7 @@
  | | | | Add pcied to PMON for runtime monitoring |
  | 0.3 | | Arun Saravanan Balachandran | Add AER stats update support in pcied |
  | | | | Add command to display AER stats |
+ | 0.4 | | Arun Saravanan Balachandran | Add platform API to collect AER stats |
 
 ## About This Manual ##
 
@@ -127,7 +128,38 @@ For AER supported PCIe device, the AER stats belonging to severities `correctabl
 
 ### 2.2 PCIe AER stats collection in pcied ###
 
-For PCIe devices that pass PcieUtil `get_pcie_check`, the AER stats if available will be retrieved and updated in the STATE_DB periodically every minute by pcied.
+A common platform API `get_pcie_aer_stats` is defined in class `PcieBase` for retrieving AER stats of a PCIe device:
+
+```
+    @abc.abstractmethod
+    def get_pcie_aer_stats(self, domain, bus, dev, fn):
+        """
+        Returns a nested dictionary containing the AER stats belonging to a
+        PCIe device
+
+        Args:
+            domain, bus, dev, fn: Domain, bus, device, function of the PCIe
+            device respectively
+
+        Returns:
+            A nested dictionary where key is severity 'correctable', 'fatal' or
+            'non_fatal', value is a dictionary of key, value pairs in the format:
+                {'AER Error type': Error count}
+
+            Ex. {'correctable': {'BadDLLP': 0, 'BadTLP': 0},
+                 'fatal': {'RxOF': 0, 'MalfTLP': 0},
+                 'non_fatal': {'RxOF': 0, 'MalfTLP': 0}}
+
+            For PCIe devices that do not support AER, the value for each severity
+            key is an empty dictionary.
+        """
+        return {}
+```
+
+A default `get_pcie_aer_stats` is implemented in the `PcieUtil` class at `sonic_platform_base/sonic_pcie/pcie_common.py`.
+It returns the AER stats for a given PCIe device, read from the AER sysfs files under `/sys/bus/pci/devices/::.`
+
+For PCIe devices that pass PcieUtil `get_pcie_check`, pcied retrieves the AER stats using `get_pcie_aer_stats` and updates them in the STATE_DB once every minute.
 
 ### 2.3 STATE_DB keys and value ###
 
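
Section 2.2 of the patch describes a default, sysfs-based `get_pcie_aer_stats`. The sketch below illustrates that behaviour under stated assumptions; it is not the actual `PcieUtil` code. The helper names `parse_aer_counters` and `read_pcie_aer_stats`, the `sysfs_root` parameter, the per-severity file names (`aer_dev_correctable`, `aer_dev_fatal`, `aer_dev_nonfatal`, taken from the Linux AER statistics sysfs ABI), the zero-padded hexadecimal device directory name, and the choice to skip `TOTAL_*` summary rows are all assumptions made for illustration.

```python
import os

# Assumed mapping from the severity keys used in the HLD to the AER sysfs
# file names exposed by the Linux kernel for each PCIe device.
AER_SYSFS_FILES = {
    "correctable": "aer_dev_correctable",
    "fatal": "aer_dev_fatal",
    "non_fatal": "aer_dev_nonfatal",
}


def parse_aer_counters(text):
    """Parse '<error name> <count>' lines into {name: count}.

    The count is taken to be the last whitespace-separated field, so error
    names containing spaces are kept whole. TOTAL_* summary rows are skipped
    (a choice of this sketch, not something the HLD specifies).
    """
    counters = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        name, count = parts
        if name.startswith("TOTAL_") or not count.isdigit():
            continue
        counters[name] = int(count)
    return counters


def read_pcie_aer_stats(domain, bus, dev, fn, sysfs_root="/sys/bus/pci/devices"):
    """Hypothetical helper returning the nested dict shape from the docstring."""
    # Assumed directory naming: 4-digit hex domain, 2-digit hex bus and device.
    dev_dir = os.path.join(sysfs_root, "%04x:%02x:%02x.%d" % (domain, bus, dev, fn))
    stats = {}
    for severity, filename in AER_SYSFS_FILES.items():
        try:
            with open(os.path.join(dev_dir, filename)) as f:
                stats[severity] = parse_aer_counters(f.read())
        except OSError:
            # Missing file (no AER support, or no such device): empty dict,
            # matching the "empty dictionary per severity" rule above.
            stats[severity] = {}
    return stats
```

A caller such as pcied could invoke `read_pcie_aer_stats(0, 1, 0, 0)` for device `0000:01:00.0` and write each severity dictionary to the STATE_DB. Passing a different `sysfs_root` allows the parsing to be exercised against a fixture directory instead of the live sysfs tree.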