[VoQ][config] Multiasic Supervisor card fails to load config_db#.json in chassis when system reboot #10105

Closed
mlok-nokia opened this issue Feb 28, 2022 · 2 comments · Fixed by #10106
Labels
Chassis 🤖 Modular chassis support NOKIA Triaged this issue has been triaged

Comments

@mlok-nokia
Contributor

Description

The Supervisor card fails to load config_db#.json in a chassis when the system reboots. When a Supervisor card with 16 ASICs is rebooted, the first one or two instances fail to load config_db#.json because the /var/run/redis#/sonic-db/database_config.json file of some instances is unavailable. The issue is intermittent.

Feb 28 22:15:16.576986 supervisor INFO database.sh[6624]: Traceback (most recent call last):
Feb 28 22:15:16.577157 supervisor INFO database.sh[6624]:   File "/usr/local/bin/sonic-cfggen", line 445, in <module>
Feb 28 22:15:16.577598 supervisor INFO database.sh[6624]:     main()
Feb 28 22:15:16.577742 supervisor INFO database.sh[6624]:   File "/usr/local/bin/sonic-cfggen", line 430, in main
Feb 28 22:15:16.578163 supervisor INFO database.sh[6624]:     SonicDBConfig.load_sonic_global_db_config(namespace=args.namespace)
Feb 28 22:15:16.578314 supervisor INFO database.sh[6624]:   File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1249, in load_sonic_global_db_config
Feb 28 22:15:16.578786 supervisor INFO database.sh[6624]:     SonicDBConfig.initializeGlobalConfig(global_db_file_path)
Feb 28 22:15:16.578927 supervisor INFO database.sh[6624]:   File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1244, in initializeGlobalConfig
Feb 28 22:15:16.579382 supervisor INFO database.sh[6624]:     return _swsscommon.SonicDBConfig_initializeGlobalConfig(*args, **kwargs)
Feb 28 22:15:16.579524 supervisor INFO database.sh[6624]: RuntimeError: Sonic database config file syntax error >> Sonic database config file syntax error >> parse error - unexpected end of input
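
The "unexpected end of input" parse error above is consistent with a reader opening a database_config.json that is missing, empty, or only partly written. As a minimal sketch of one way to guard against that (not the change merged in #10106, which is not shown here), a caller could poll until the file parses as JSON before using it. The helper name `wait_for_valid_json` and its `timeout` and `interval` parameters are hypothetical and are not part of sonic-cfggen or swsscommon.

```python
import json
import time
from pathlib import Path


def wait_for_valid_json(path, timeout=60.0, interval=0.5):
    """Poll until `path` exists and parses as JSON, then return the parsed object.

    Hypothetical helper for illustration only. Raises TimeoutError if the
    file is still missing or unparsable once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A missing file raises FileNotFoundError; an empty or truncated
            # file raises json.JSONDecodeError. Both mean "not ready yet".
            return json.loads(Path(path).read_text())
        except (FileNotFoundError, json.JSONDecodeError) as exc:
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{path} not ready: {last_error}")
        time.sleep(interval)
```

For example, `wait_for_valid_json("/var/run/redis0/sonic-db/database_config.json", timeout=30)` would block until instance 0's file is complete, assuming `#` in the path stands for the instance number. Polling is the simplest option to illustrate; ordering the services so the file is written before it is read would avoid the wait entirely.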

Steps to reproduce the issue:

  1. Reboot a VoQ multi-ASIC Supervisor card that has all the proper config_db#.json files in place.
  2. Execute the show interfaces status command. The following output is shown:
admin@supervisor:$ show interfaces status
Traceback (most recent call last):
  File "/usr/local/bin/intfutil", line 702, in <module>
    main()
  File "/usr/local/bin/intfutil", line 688, in main
    interface_stat.display_intf_status()
  File "/usr/local/bin/intfutil", line 377, in display_intf_status
    self.get_intf_status()
  File "/usr/local/lib/python3.9/dist-packages/utilities_common/multi_asic.py", line 137, in wrapped_run_on_all_asics
    ns_list = self.multi_asic.get_ns_list_based_on_options()
  File "/usr/local/lib/python3.9/dist-packages/utilities_common/multi_asic.py", line 63, in get_ns_list_based_on_options
    namespaces = multi_asic.get_all_namespaces()
  File "/usr/local/lib/python3.9/dist-packages/sonic_py_common/multi_asic.py", line 243, in get_all_namespaces
    if metadata['localhost']['sub_role'] == FRONTEND_ASIC_SUB_ROLE:
KeyError: 'localhost'
admin@supervisor$ 
  3. The following log messages can be found in the syslog file:
    ...
    Feb 28 22:15:16.576986 supervisor INFO database.sh[6624]: Traceback (most recent call last):
    Feb 28 22:15:16.577157 supervisor INFO database.sh[6624]:   File "/usr/local/bin/sonic-cfggen", line 445, in <module>
    Feb 28 22:15:16.577598 supervisor INFO database.sh[6624]:     main()
    Feb 28 22:15:16.577742 supervisor INFO database.sh[6624]:   File "/usr/local/bin/sonic-cfggen", line 430, in main
    Feb 28 22:15:16.578163 supervisor INFO database.sh[6624]:     SonicDBConfig.load_sonic_global_db_config(namespace=args.namespace)
    Feb 28 22:15:16.578314 supervisor INFO database.sh[6624]:   File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1249, in load_sonic_global_db_config
    Feb 28 22:15:16.578786 supervisor INFO database.sh[6624]:     SonicDBConfig.initializeGlobalConfig(global_db_file_path)
    Feb 28 22:15:16.578927 supervisor INFO database.sh[6624]:   File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1244, in initializeGlobalConfig
    Feb 28 22:15:16.579382 supervisor INFO database.sh[6624]:     return _swsscommon.SonicDBConfig_initializeGlobalConfig(*args, **kwargs)
    Feb 28 22:15:16.579524 supervisor INFO database.sh[6624]: RuntimeError: Sonic database config file syntax error >> Sonic database config file syntax error >> parse error - unexpected end of input
    ...

Describe the results you received:

The output of "show interfaces status" is as follows:
admin@supervisor:$ show interfaces status
Traceback (most recent call last):
  File "/usr/local/bin/intfutil", line 702, in <module>
    main()
  File "/usr/local/bin/intfutil", line 688, in main
    interface_stat.display_intf_status()
  File "/usr/local/bin/intfutil", line 377, in display_intf_status
    self.get_intf_status()
  File "/usr/local/lib/python3.9/dist-packages/utilities_common/multi_asic.py", line 137, in wrapped_run_on_all_asics
    ns_list = self.multi_asic.get_ns_list_based_on_options()
  File "/usr/local/lib/python3.9/dist-packages/utilities_common/multi_asic.py", line 63, in get_ns_list_based_on_options
    namespaces = multi_asic.get_all_namespaces()
  File "/usr/local/lib/python3.9/dist-packages/sonic_py_common/multi_asic.py", line 243, in get_all_namespaces
    if metadata['localhost']['sub_role'] == FRONTEND_ASIC_SUB_ROLE:
KeyError: 'localhost'
admin@supervisor$
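
The `KeyError: 'localhost'` comes from indexing `metadata['localhost']` for a namespace whose metadata has no such entry, which is what one would expect if that instance's config_db#.json was never loaded. As a rough sketch for narrowing down which instances are affected, the check below works on plain dictionaries rather than a live ConfigDB connection. The function name `namespaces_missing_localhost`, the input shape, and the sample data are all assumptions made for illustration and are not part of sonic_py_common.

```python
def namespaces_missing_localhost(metadata_by_namespace):
    """Return the sorted namespaces whose metadata table lacks 'localhost'.

    `metadata_by_namespace` maps a namespace name to that namespace's
    metadata table (a dict). Hypothetical helper: in a real system the
    tables would have to be read from each instance's config database.
    """
    return sorted(
        namespace
        for namespace, metadata in metadata_by_namespace.items()
        if "localhost" not in metadata
    )


# Illustrative data only: asic0 was never populated, asic1 was.
sample = {
    "asic0": {},
    "asic1": {"localhost": {"sub_role": "example"}},
}
print(namespaces_missing_localhost(sample))
```

Reporting the affected namespaces by name, instead of failing on the first lookup, makes it easier to confirm the observation that only the first one or two instances are hit.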

Describe the results you expected:

admin@supervisor:~$ show interfaces status
  Interface    Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin    Type    Asym PFC
-----------  -------  -------  -----  -----  -------  ------  ------  -------  ------  ----------

Output of show version:

The issue still exists in the latest as of 02/28/2021.

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

@mlok-nokia mlok-nokia changed the title [config] Supervisor card fails to load config_db#.json in chassis when system reboot [VoQ][config] Supervisor card fails to load config_db#.json in chassis when system reboot Feb 28, 2022
@mlok-nokia mlok-nokia changed the title [VoQ][config] Supervisor card fails to load config_db#.json in chassis when system reboot [VoQ][config] Multiasic Supervisor card fails to load config_db#.json in chassis when system reboot Feb 28, 2022
@zhangyanzhao
Collaborator

@dflynn-Nokia will find someone in Nokia to take a look. Thanks.

@zhangyanzhao zhangyanzhao added the Triaged this issue has been triaged label Mar 2, 2022
@zhangyanzhao
Collaborator

@abdosi @judyjoseph please help to take a look.

@rlhui rlhui added the Chassis 🤖 Modular chassis support label Mar 2, 2022
judyjoseph pushed a commit that referenced this issue May 9, 2022
… in chassis when system is reboot (#10106)

Supervisor card fails to load config_db#.json in chassis when system reboot. 
This is an intermittent issue, fixes #10105
judyjoseph pushed a commit that referenced this issue May 16, 2022
… in chassis when system is reboot (#10106)

Supervisor card fails to load config_db#.json in chassis when system reboot. 
This is an intermittent issue, fixes #10105