[9332][fast-reboot] fast-reboot fails on 9332 due to downtime > 30s #9165

Closed
tjchadaga opened this issue Nov 3, 2021 · 1 comment

Comments

@tjchadaga
Contributor

Description

Fast-reboot fails on the 9332 (DellEMC Z9332f) because dataplane downtime exceeds the 30s limit (~97s observed). The cause is an additional flap on all ports during fast-reboot: port serdes programming is done after the port has already come up.

Steps to reproduce the issue:

  1. Run advanced_reboot/test_advanced_reboot.py::test_fast_reboot

Describe the results you received:

2021-11-02T18:48:28.4187932Z         "FAIL", 
2021-11-02T18:48:28.4190760Z         "", 
2021-11-02T18:48:28.4194127Z         "======================================================================", 
2021-11-02T18:48:28.4197615Z         "FAIL: advanced-reboot.ReloadTest", 
2021-11-02T18:48:28.4200939Z         "----------------------------------------------------------------------", 
2021-11-02T18:48:28.4204293Z         "Traceback (most recent call last):", 
2021-11-02T18:48:28.4207330Z         "  File \"ptftests/advanced-reboot.py\", line 1216, in runTest", 
2021-11-02T18:48:28.4210624Z         "    self.handle_post_reboot_test_reports()", 
2021-11-02T18:48:28.4214087Z         "  File \"ptftests/advanced-reboot.py\", line 1165, in handle_post_reboot_test_reports", 
2021-11-02T18:48:28.4217070Z         "    self.assertTrue(is_good, errors)", 
2021-11-02T18:48:28.4219767Z         "AssertionError: ", 
2021-11-02T18:48:28.4221897Z         "", 
2021-11-02T18:48:28.4224050Z         "Something went wrong. Please check output below:", 
2021-11-02T18:48:28.4226412Z         "", 
2021-11-02T18:48:28.4228889Z         "FAILED:dut:Total downtime period must be less then 0:00:30 seconds. It was 96.4807388783", 
2021-11-02T18:48:28.4231386Z         "", 
2021-11-02T18:48:28.4233442Z         "", 
2021-11-02T18:48:28.4236076Z         "----------------------------------------------------------------------", 
2021-11-02T18:48:28.4238485Z         "Ran 1 test in 683.654s", 
2021-11-02T18:48:28.4240612Z         "", 
2021-11-02T18:48:28.4242667Z         "FAILED (failures=1)"
2021-11-02T18:48:28.4244693Z     ],

Describe the results you expected:

Downtime should be less than 30s and the test should pass.

Output of show version:

SONiC Software Version: SONiC.20201231.41
Distribution: Debian 10.11
Kernel: 4.19.0-12-2-amd64
Build commit: 84eefd6578
Build date: Sat Oct 30 12:17:22 UTC 2021
Built by: cloudtest@3cfd51cec000000

Platform: x86_64-dellemc_z9332f_d1508-r0
HwSKU: DellEMC-Z9332f-O32
ASIC: broadcom
ASIC Count: 1
Serial Number: TH04CN21CET0004K0214
Uptime: 18:35:51 up  1:05,  2 users,  load average: 0.82, 0.70, 0.66

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

Syslog for one port showing transition:

Sep 29 01:43:57.156051 str2-z9332f-05 INFO syncd#syncd: [none] SAI_API_PORT:brcm_sai_remove_port_serdes:6102 Serdes object remove port 26 serdes id 1a005700000000
Sep 29 01:43:57.157773 str2-z9332f-05 INFO syncd#syncd: [none] SAI_API_PORT:brcm_sai_create_port_serdes:6057 Serdes object create port 26 serdes id 1a005700000000
Sep 29 01:43:57.158289 str2-z9332f-05 NOTICE swss#orchagent: :- setPortSerdesAttribute: Created port serdes object 0x57000000000c90 for port 0x1000000000018
Sep 29 01:44:01.524204 str2-z9332f-05 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet12 oper state set from down to up
Sep 29 01:44:03.127634 str2-z9332f-05 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet12 oper state set from up to down
Sep 29 01:44:12.638249 str2-z9332f-05 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet12 oper state set from down to up
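
The extra flap in the syslog above can be measured with a short script. This is a minimal sketch, not part of the test suite: the helper names `oper_transitions` and `flap_durations` are hypothetical, the three input lines are copied from the syslog excerpt, and the year 2021 is assumed (syslog omits it; it is taken from the sairedis timestamps).

```python
import re
from datetime import datetime

# The three updatePortOperStatus lines from the syslog excerpt above.
SYSLOG = """\
Sep 29 01:44:01.524204 str2-z9332f-05 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet12 oper state set from down to up
Sep 29 01:44:03.127634 str2-z9332f-05 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet12 oper state set from up to down
Sep 29 01:44:12.638249 str2-z9332f-05 NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet12 oper state set from down to up
"""

PATTERN = re.compile(
    r"^(\w{3} +\d+ [\d:.]+) .*updatePortOperStatus: Port (\S+) "
    r"oper state set from (\w+) to (\w+)$"
)

ASSUMED_YEAR = 2021  # assumption: syslog has no year field


def oper_transitions(text):
    """Return (timestamp, port, new_state) for every oper state change."""
    events = []
    for line in text.splitlines():
        match = PATTERN.match(line)
        if match:
            stamp = f"{ASSUMED_YEAR} {match.group(1)}"
            ts = datetime.strptime(stamp, "%Y %b %d %H:%M:%S.%f")
            events.append((ts, match.group(2), match.group(4)))
    return events


def flap_durations(events):
    """Seconds each port stayed down between an up->down and the next down->up."""
    went_down = {}
    flaps = []
    for ts, port, new_state in events:
        if new_state == "down":
            went_down[port] = ts
        elif new_state == "up" and port in went_down:
            flaps.append((port, (ts - went_down.pop(port)).total_seconds()))
    return flaps


flaps = flap_durations(oper_transitions(SYSLOG))
for port, seconds in flaps:
    print(f"{port}: down for {seconds:.3f}s")
```

For the excerpt above this reports a single flap on Ethernet12 lasting about 9.5s, between the up-to-down transition at 01:44:03 and the second down-to-up at 01:44:12.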

Sairedis recording showing serdes programming after oper status goes up:

2021-09-29.01:43:54.749598|n|port_state_change|[{"port_id":"oid:0x1000000000018","port_state":"SAI_PORT_OPER_STATUS_UP"}]|
2021-09-29.01:43:57.156574|c|SAI_OBJECT_TYPE_PORT_SERDES:oid:0x57000000000c90|SAI_PORT_SERDES_ATTR_PORT_ID=oid:0x1000000000018|SAI_PORT_SERDES_ATTR_TX_FIR_PRE1=2:4294967268,4294967268|SAI_PORT_SERDES_ATTR_TX_FIR_PRE2=2:4,4|SAI_PORT_SERDES_ATTR_TX_FIR_MAIN=2:136,136|SAI_PORT_SERDES_ATTR_TX_FIR_POST1=2:0,0|SAI_PORT_SERDES_ATTR_TX_FIR_POST2=2:0,0|SAI_PORT_SERDES_ATTR_TX_FIR_POST3=2:0,0
2021-09-29.01:43:57.484810|n|port_state_change|[{"port_id":"oid:0x1000000000018","port_state":"SAI_PORT_OPER_STATUS_DOWN"}]|
2021-09-29.01:43:59.891788|n|port_state_change|[{"port_id":"oid:0x1000000000018","port_state":"SAI_PORT_OPER_STATUS_UP"}]|
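
The ordering problem in the sairedis recording above (serdes object created after the port has already reported oper up) can be checked mechanically. The following is a rough sketch under stated assumptions: `serdes_delay_after_up` and `as_int32` are hypothetical helpers, the serdes record is abbreviated to its first two attributes, and fields are assumed to be separated by `|` as in the excerpt. Reading `4294967268` as a signed 32-bit value is also an assumption about how the tap was recorded, not something the recording states.

```python
import json
from datetime import datetime

# Records copied from the sairedis excerpt above; the serdes create
# record is abbreviated to its first two attributes.
RECORDS = """\
2021-09-29.01:43:54.749598|n|port_state_change|[{"port_id":"oid:0x1000000000018","port_state":"SAI_PORT_OPER_STATUS_UP"}]|
2021-09-29.01:43:57.156574|c|SAI_OBJECT_TYPE_PORT_SERDES:oid:0x57000000000c90|SAI_PORT_SERDES_ATTR_PORT_ID=oid:0x1000000000018|SAI_PORT_SERDES_ATTR_TX_FIR_PRE1=2:4294967268,4294967268
2021-09-29.01:43:57.484810|n|port_state_change|[{"port_id":"oid:0x1000000000018","port_state":"SAI_PORT_OPER_STATUS_DOWN"}]|
"""


def serdes_delay_after_up(text, port_oid):
    """Seconds from the first oper-up notification for port_oid to the
    serdes create for that port, or None if the create did not follow an up."""
    up_at = None
    for line in text.splitlines():
        fields = line.split("|")
        ts = datetime.strptime(fields[0], "%Y-%m-%d.%H:%M:%S.%f")
        op, rest = fields[1], fields[2:]
        if op == "n" and rest[0] == "port_state_change":
            for entry in json.loads(rest[1]):
                is_up = entry["port_state"] == "SAI_PORT_OPER_STATUS_UP"
                if entry["port_id"] == port_oid and is_up and up_at is None:
                    up_at = ts
        elif op == "c" and rest[0].startswith("SAI_OBJECT_TYPE_PORT_SERDES:"):
            targets_port = f"SAI_PORT_SERDES_ATTR_PORT_ID={port_oid}" in rest[1:]
            if targets_port and up_at is not None:
                return (ts - up_at).total_seconds()
    return None


def as_int32(value):
    """Reinterpret an unsigned 32-bit value as two's-complement signed."""
    return value - (1 << 32) if value >= (1 << 31) else value


delay = serdes_delay_after_up(RECORDS, "oid:0x1000000000018")
print(f"serdes created {delay:.3f}s after oper up")
print("TX_FIR_PRE1 as signed:", as_int32(4294967268))
```

A non-None result means the serdes create arrived after the link was already up, which matches the flap described in this issue; here the gap is roughly 2.4s, and the port goes down again about 0.3s after the create.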
@tjchadaga
Contributor Author

Issue is resolved
