PMON xcvrd crash with master image 498 #5994

Closed
yxieca opened this issue Nov 21, 2020 · 4 comments

@yxieca
Contributor

yxieca commented Nov 21, 2020

Description

Steps to reproduce the issue:

  1. Nightly test

Describe the results you received:
PMON crashes (xcvrd exits unexpectedly), causing all tests to fail the sanity check.

Describe the results you expected:
The nightly test runs without PMON crashing.

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**

```

SONiC Software Version: SONiC.master.498-5fbb6ee6
Distribution: Debian 10.6
Kernel: 4.19.0-9-2-amd64
Build commit: 5fbb6ee
Build date: Fri Nov 20 14:36:29 UTC 2020
Built by: johnar@jenkins-worker-3
```

**Attach debug file `sudo generate_dump`:**
Nov 21 18:10:26.955052 str2-7050cx3-acs-07 ERR pmon#ledd[32]: :- initializeGlobalConfig: Sonic database config global file doesn't exist at /var/run/redis/sonic-db/database_global.json
Nov 21 18:10:29.024274 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd Traceback (most recent call last):
Nov 21 18:10:29.024274 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd   File "/usr/local/bin/xcvrd", line 8, in <module>
Nov 21 18:10:29.024274 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd     sys.exit(main())
Nov 21 18:10:29.024274 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/xcvrd/xcvrd.py", line 1318, in main
Nov 21 18:10:29.024738 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd     xcvrd.run()
Nov 21 18:10:29.024801 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/xcvrd/xcvrd.py", line 1268, in run
Nov 21 18:10:29.024842 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd     self.init()
Nov 21 18:10:29.024883 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/xcvrd/xcvrd.py", line 1241, in init
Nov 21 18:10:29.025171 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd     y_cable_helper.init_ports_status_for_y_cable(platform_sfputil, platform_chassis, self.y_cable_presence, self.stop_event)
Nov 21 18:10:29.025324 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/xcvrd/xcvrd_utilities/y_cable_helper.py", line 295, in init_ports_status_for_y_cable
Nov 21 18:10:29.025324 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd     state_db, port_tbl, y_cable_tbl, asic_index, logical_port_name, y_cable_presence)
Nov 21 18:10:29.025344 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/xcvrd/xcvrd_utilities/y_cable_helper.py", line 224, in check_identifier_presence_and_update_mux_table_entry
Nov 21 18:10:29.025344 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd     logical_port_name, y_cable_tbl[asic_index])
Nov 21 18:10:29.025370 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/xcvrd/xcvrd_utilities/y_cable_helper.py", line 169, in update_statedb_port_mux_status_table
Nov 21 18:10:29.025370 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd     read_side = y_cable.check_read_side(physical_port)
Nov 21 18:10:29.025386 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd   File "/usr/local/lib/python2.7/dist-packages/sonic_y_cable/y_cable.py", line 149, in check_read_side
Nov 21 18:10:29.025399 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd     regval_read = struct.unpack(">B", result)
Nov 21 18:10:29.025399 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd struct.error: 
Nov 21 18:10:29.025871 str2-7050cx3-acs-07 INFO pmon#supervisord: xcvrd unpack requires a string argument of length 1
Nov 21 18:10:30.078791 str2-7050cx3-acs-07 WARNING pmon#psud[35]: Power absence warning: psu1 is out of power.
Nov 21 18:10:30.103103 str2-7050cx3-acs-07 INFO pmon#supervisor-proc-exit-listener: Process xcvrd exited unxepectedly. Terminating supervisor...
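
The last frames of the traceback show `struct.unpack(">B", result)` failing because `result` is not exactly one byte long. The following is a minimal sketch of that failure mode in isolation; `try_unpack` is a hypothetical helper written for illustration and is not part of `sonic_y_cable`.

```python
import struct


def try_unpack(result):
    """Unpack one unsigned byte, or return the error text if unpacking fails.

    Hypothetical helper: it only mirrors the failing call in the traceback.
    """
    try:
        # ">B" expects a buffer of exactly one byte.
        return struct.unpack(">B", result)[0]
    except struct.error as exc:
        return "struct.error: {}".format(exc)


print(try_unpack(b"\x01"))  # a one-byte read unpacks normally
print(try_unpack(b""))      # an empty read triggers struct.error
```

An EEPROM read that returns an empty or short buffer, as might happen on a port with no Y-cable attached, would produce this kind of error.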
@jleveque
Contributor

@vdahiya12: This seems to be related to your recent changes. Can you please investigate?

@vdahiya12
Contributor

vdahiya12 commented Nov 22, 2020

The mux_cable tag seems to have been introduced for non-y_cable ports. This should not happen if the port is not on a Credo y_cable.
admin@str2-7050cx3-acs-07:~$ redis-cli -n 4 hgetall "PORT|Ethernet112"

 1) "index"
 2) "29"
 3) "lanes"
 4) "117,118,119,120"
 5) "fec"
 6) "rs"
 7) "admin_status"
 8) "up"
 9) "mtu"
10) "9100"
11) "alias"
12) "Ethernet29/1"
13) "pfc_asym"
14) "off"
15) "speed"
16) "100000"
17) "mux_cable"
18) "true"
19) "description"
20) "ARISTA01T1:Ethernet2"
However, this PR should fix the crash:
[sonic_y_cable] support for error handling non-expected eeprom_read return values inside sonic_y_cable sonic-platform-common#147
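
As a rough illustration of the kind of error handling the PR title describes, the sketch below validates the read result before unpacking it. This is an assumption about the general approach, not the code from that PR; `unpack_single_byte` is a hypothetical name.

```python
import struct


def unpack_single_byte(result):
    """Return the unsigned byte held in `result`, or None if the read is unusable.

    Hypothetical helper: treats None, wrong types, and wrong lengths as a failed read
    instead of letting struct.unpack raise.
    """
    if not isinstance(result, (bytes, bytearray)) or len(result) != 1:
        return None
    return struct.unpack(">B", bytes(result))[0]


# A caller can then skip the port rather than crash the daemon.
for raw in (b"\x02", b"", None):
    value = unpack_single_byte(raw)
    if value is None:
        print("read failed, skipping port")
    else:
        print("register value:", value)
```

Returning `None` keeps the decision about logging or retrying with the caller, which is one of several reasonable designs here.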

@yxieca
Contributor Author

yxieca commented Nov 25, 2020

Image 502 still has the same crash.

@yxieca
Contributor Author

yxieca commented Nov 25, 2020

The fix is not in build 502. Build 503 has the fix.

@daall daall closed this as completed Dec 2, 2020