xcvrd crash observed during boot #11707
Comments
@anamehra do you mind creating a PR to address the issue, since you already identified it? @vdahiya12 and/or @prgeor to review.
Instead of fixing this crash in xcvrd, we should find out why the 'admin_status' field is missing from the PORT table of CONFIG_DB. xcvrd starts running this code only after PortConfigDone: https://github.com/sonic-net/sonic-platform-daemons/blob/master/sonic-xcvrd/xcvrd/xcvrd.py#L1295
Hi Prince, what I observed while debugging is that, at the point of the crash, the data being used comes from CONFIG_DB. For example, Ethernet31 is configured for a far-end and has this field populated, while Ethernet32, which was not being used, has this field missing.
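To make the observation above concrete, here is a minimal sketch of how one might list the ports whose entry lacks a given field. The `port_table` dictionary and the `ports_missing_field` helper are hypothetical and stand in for data read from CONFIG_DB; the `speed` values are made up for illustration and are not taken from the affected system.

```python
# Illustrative stand-in for the PORT table; values are invented for this sketch.
port_table = {
    "Ethernet31": {"admin_status": "up", "speed": "100000"},
    "Ethernet32": {"speed": "100000"},  # unused port: no admin_status field
}


def ports_missing_field(table, field="admin_status"):
    """Return the sorted names of ports whose entry lacks `field`."""
    return sorted(name for name, info in table.items() if field not in info)


print(ports_missing_field(port_table))
```

A check like this only reports which entries are incomplete; it does not explain why the field was never written for unused ports.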
Description
During boot on line cards, an xcvrd crash is observed, which causes port optics initialization to fail:
Aug 11 17:09:52.008708 sfd-vt2-lc0 INFO pmon#supervisord 2022-08-11 17:09:52,007 INFO success: xcvrd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
Aug 11 17:09:56.747253 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd ERROR: execvpe(/usr/sbin/smartctl) failed
Aug 11 17:09:56.747582 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd : [2] No such file or directory
Aug 11 17:09:56.751125 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd ERROR: command '/usr/sbin/smartctl' failed
Aug 11 17:09:56.751354 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd : [116] Stale file handle
Aug 11 17:10:57.423570 sfd-vt2-lc0 NOTICE pmon#xcvrd[121]: CMIS: Starting...
Aug 11 17:10:57.526003 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd Process Process-1:
Aug 11 17:10:57.526805 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd Traceback (most recent call last):
Aug 11 17:10:57.526805 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
Aug 11 17:10:57.526805 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd self.run()
Aug 11 17:10:57.527188 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
Aug 11 17:10:57.527204 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd self._target(*self._args, **self._kwargs)
Aug 11 17:10:57.527212 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1268, in task_worker
Aug 11 17:10:57.527212 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd self.port_dict[lport]['admin_status'] = self.get_port_admin_status(lport)
Aug 11 17:10:57.527223 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1226, in get_port_admin_status
Aug 11 17:10:57.527223 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd admin_status = dict(port_info)['admin_status']
Aug 11 17:10:57.527236 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd KeyError: 'admin_status'
Aug 11 17:10:57.527274 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd Starting
Aug 11 17:10:57.929726 sfd-vt2-lc0 INFO pmon#supervisord: xcvrd DBG _optics_init_once:OPTICS_INIT_ONCE: start one time optics lib initialization
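The traceback shows the failure is a plain dictionary lookup, `dict(port_info)['admin_status']`, raising `KeyError` when the field is absent. As a rough sketch only, and not the actual xcvrd code or an agreed fix, a tolerant lookup could fall back to a default value. The helper name `get_admin_status` and the default of `"down"` are assumptions made for illustration; `port_info` is assumed to be an iterable of (field, value) pairs, as the `dict(port_info)` call implies.

```python
def get_admin_status(port_info, default="down"):
    """Hypothetical tolerant lookup; not the xcvrd implementation.

    port_info: iterable of (field, value) pairs (assumed shape).
    default:   value returned when 'admin_status' is absent (assumed "down").
    """
    return dict(port_info).get("admin_status", default)


# A port with the field populated, and one without it.
print(get_admin_status([("admin_status", "up"), ("speed", "100000")]))
print(get_admin_status([("speed", "100000")]))
```

Whether a default is appropriate here is a design question: as noted in the comments, it may be preferable to find out why the field is missing from CONFIG_DB rather than mask it in xcvrd.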
Steps to reproduce the issue:
Describe the results you received:
Front panel ports failed to come oper up.
Describe the results you expected:
No xcvrd crash; ports should come oper up.
Output of show version:
sha1 used to build the image:
Azure/sonic-buildimage-msft@b6bfd6a
Output of show techsupport:
Additional information you deem important (e.g. issue happens only occasionally):