[chassis] monit container checker status fails on supervisor card with not all SFM's present #8520

Closed
sanmalho-git opened this issue Aug 18, 2021 · 2 comments
Labels
Chassis 🤖 Modular chassis support Help Wanted 🆘 Triaged this issue has been triaged

Comments

@sanmalho-git

Description

On a supervisor card in a VoQ chassis, we create syncd, teamd, swss, lldp, etc. dockers for each Switch Fabric card. However, not every chassis has all of the Switch Fabric cards present. In that case, dockers are created only for the Switch Fabric cards that are present.

The monit 'container_checker' check fails in this scenario because it expects dockers for all Switch Fabric cards (possibly based on NUM_ASIC defined in the asic.conf file).
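
To illustrate the suspected mismatch, here is a minimal Python sketch. It is not the actual container_checker code. The function names, the `<service><instance>` naming rule, the value of `num_asic`, and the list of present instances are all assumptions made for illustration; only the service names come from this issue.

```python
# Hypothetical sketch of the suspected behaviour; not the real container_checker.
PER_ASIC_SERVICES = ("syncd", "teamd", "swss", "lldp")  # services named in this issue


def expected_containers(asic_ids):
    """Return the container names expected for the given instance IDs."""
    return {
        f"{service}{asic_id}"
        for service in PER_ASIC_SERVICES
        for asic_id in asic_ids
    }


def missing_containers(asic_ids, running):
    """Return expected containers that are absent from `running`, sorted."""
    return sorted(expected_containers(asic_ids) - set(running))


if __name__ == "__main__":
    num_asic = 4            # assumed value standing in for NUM_ASIC in asic.conf
    present_asics = [0, 1]  # assumed: only two instances are populated
    running = expected_containers(present_asics)

    # Checking every possible instance flags containers for absent cards.
    print(missing_containers(range(num_asic), running))

    # Checking only the instances that are present flags nothing.
    print(missing_containers(present_asics, running))
```

If the checker derives its expected set from the total instance count rather than from the cards actually present, it would report failures like the ones shown in this issue.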

Steps to reproduce the issue:

  1. On a supervisor card in a VoQ chassis, issue the command 'sudo monit status'.
  2. Check the syslog for ERR messages related to monit container_checker.

Describe the results you received:

admin@sonic:~$ sudo monit status
Monit 5.20.0 uptime: 1h 3m
.
.
Program 'container_checker'
  status                       Status failed
  monitoring status            Monitored
  monitoring mode              active
  on reboot                    start
  last exit value              3
  last output                  Expected containers not running: syncd10, swss5, teamd2, syncd13, lldp12, swss12, swss4, teamd9, syncd5, teamd3, syncd3, lldp3, syncd12, swss2, swss13, lldp9, syncd2, lldp11, teamd5, teamd11, teamd4, l
  data collected               Wed, 18 Aug 2021 14:57:06

Error messages seen in syslog look like:

Aug 18 14:58:07.025538 sonic ERR monit[691]: 'container_checker' status failed (3) -- Expected containers not running: teamd13, teamd2, swss2, syncd5, teamd9, teamd4, teamd8, lldp8, teamd12, syncd2, teamd5, lldp11, lldp13, teamd3, swss9, syncd4, syncd8, swss12, swss13, swss8, teamd11, lldp12, lldp4, swss5, lldp3, swss4, swss3, lldp9, syncd9, lldp10, syncd3, syncd11, teamd10, syncd13, lldp2, syncd12, lldp5, swss10, syncd10, swss11
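
Because the list in the error message is long and unordered, it can help to group the missing containers by instance number when comparing against the cards that are actually installed. The following is a small Python sketch for that. The helper name `group_missing_by_instance` is hypothetical, and the parser assumes every per-instance container is named `<service><number>`, as in the message above. Names that do not fit this pattern, including truncated ones, are skipped.

```python
import re
from collections import defaultdict

MESSAGE_PREFIX = "Expected containers not running:"


def group_missing_by_instance(log_line):
    """Map instance number -> sorted service names parsed from a monit error line."""
    _, found, tail = log_line.partition(MESSAGE_PREFIX)
    if not found:
        return {}
    grouped = defaultdict(list)
    for name in tail.split(","):
        # Accept only <service><number>; skip truncated or unnumbered names.
        match = re.fullmatch(r"([a-z]+)(\d+)", name.strip())
        if match:
            grouped[int(match.group(2))].append(match.group(1))
    return {instance: sorted(services) for instance, services in sorted(grouped.items())}


if __name__ == "__main__":
    # Shortened sample based on the syslog line in this issue.
    sample = (
        "sonic ERR monit[691]: 'container_checker' status failed (3) -- "
        "Expected containers not running: teamd13, teamd2, swss2, syncd5"
    )
    for instance, services in group_missing_by_instance(sample).items():
        print(instance, ", ".join(services))
```

Instances for which every service is reported missing are likely the ones whose Switch Fabric card is absent.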

Describe the results you expected:


@zhangyanzhao zhangyanzhao added the Triaged this issue has been triaged label Mar 30, 2022
@rlhui rlhui added the Chassis 🤖 Modular chassis support label May 24, 2022
@rlhui rlhui assigned SuvarnaMeenakshi and unassigned rlhui May 24, 2022
@rlhui
Contributor

rlhui commented May 25, 2022

@anamehra - would you be able to help address this? Thanks.

@sanmalho-git
Author

Closing this as this should be addressed by #10170
