[VOQ] syncd and swss didn't come up on sup #11892

Closed
ysmanman opened this issue Aug 30, 2022 · 10 comments
Labels: Chassis 🤖 (Modular chassis support), MSFT, P0 (Priority of the issue), Triaged (this issue has been triaged)

Comments

@ysmanman
Contributor

Description

When testing recent master and 202205 images, we found that syncd and swss did not come up on the supervisor (sup).

Our debugging indicates that both syncd and swss were stuck in asic_status.py (https://github.com/sonic-net/sonic-buildimage/blob/master/files/scripts/asic_status.py). Taking asic3 as an example:

$ systemctl status syncd@3
syncd@3.service - syncd service
Loaded: loaded (/lib/systemd/system/syncd@.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/syncd@3.service.d
└─auto_restart.conf
Active: active (running) since Tue 2022-08-30 04:43:23 UTC; 13min ago
Main PID: 11848 (syncd.sh)
Tasks: 2 (limit: 77116)
Memory: 12.0M
CGroup: /system.slice/system-syncd.slice/syncd@3.service
├─11848 /bin/bash /usr/local/bin/syncd.sh wait 3
└─11859 python3 /usr/local/bin/asic_status.py syncd 3
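
To check many ASIC instances at once, the `systemctl status` output above can be scanned for a lingering asic_status.py helper. The following is only a minimal sketch: the function name `find_waiting_helpers` and the regular expression are illustrative assumptions based on the process lines shown here, not part of any SONiC tool.

```python
import re

# Matches cgroup process lines such as
# "└─11859 python3 /usr/local/bin/asic_status.py syncd 3".
# The pattern is an assumption derived from the output in this issue.
_WAIT_LINE = re.compile(r"(\d+)\s+python3\s+\S*asic_status\.py\s+(\S+)\s+(\d+)")


def find_waiting_helpers(status_text):
    """Return (pid, service, asic_id) for each asic_status.py process listed."""
    found = []
    for line in status_text.splitlines():
        match = _WAIT_LINE.search(line)
        if match:
            pid, service, asic_id = match.groups()
            found.append((int(pid), service, int(asic_id)))
    return found


# Sample text copied from the status output in this issue.
SAMPLE_STATUS = """\
CGroup: /system.slice/system-syncd.slice/syncd@3.service
├─11848 /bin/bash /usr/local/bin/syncd.sh wait 3
└─11859 python3 /usr/local/bin/asic_status.py syncd 3
"""

waiting = find_waiting_helpers(SAMPLE_STATUS)
```

A non-empty result long after boot suggests the unit is still blocked in the wait step rather than having started the container.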

We confirmed that the ASIC was already published in chassis_db:

$ sonic-db-cli CHASSIS_STATE_DB hgetall "CHASSIS_ASIC_TABLE|asic3"
{'asic_pci_address': '0000:1a:00.0', 'name': 'FABRIC-CARD1', 'asic_id_in_module': '1'}
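
The same check can be scripted, since the text printed by `hgetall` above is a Python-style dict literal. This is a rough sketch: the helper names are hypothetical, and treating these three fields as "required" is an assumption taken from the sample entry, not a documented schema.

```python
import ast


def parse_hgetall(output):
    """Parse the dict-style text printed by `sonic-db-cli ... hgetall`."""
    value = ast.literal_eval(output.strip())
    if not isinstance(value, dict):
        raise ValueError("expected a dict literal")
    return value


def is_asic_published(
    output,
    # Assumption: these are the fields seen in the sample entry in this issue.
    required=("asic_pci_address", "name", "asic_id_in_module"),
):
    """Return True when the entry parses and every required field is non-empty."""
    try:
        entry = parse_hgetall(output)
    except (ValueError, SyntaxError):
        return False
    return all(entry.get(field) for field in required)


# Sample text copied from the command output in this issue.
SAMPLE_ENTRY = (
    "{'asic_pci_address': '0000:1a:00.0', 'name': 'FABRIC-CARD1', "
    "'asic_id_in_module': '1'}"
)
published = is_asic_published(SAMPLE_ENTRY)
```

An empty result (`{}`) or unparsable text is reported as not published, which separates "the entry is missing" from "the entry is present but the service is still waiting".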

syncd3 came up fine after restarting the service (sudo systemctl restart syncd@3).

Steps to reproduce the issue:

  1. Install master or 202205 image
  2. Check if syncd and swss are coming up with 'docker ps'.
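
Step 2 can be automated by comparing the running container names against the ones expected per ASIC. This is a sketch under stated assumptions: it expects the text produced by `docker ps --format '{{.Names}}'`, it assumes per-ASIC containers are named `syncd<N>` and `swss<N>` (the name syncd3 appears in this issue, swss3 is assumed by analogy), and `missing_containers` is a hypothetical helper.

```python
def missing_containers(docker_names_output, asic_ids, services=("syncd", "swss")):
    """Return expected per-ASIC container names absent from the docker output.

    docker_names_output: text with one container name per line.
    asic_ids: iterable of ASIC indices expected on this supervisor.
    """
    running = set(docker_names_output.split())
    expected = [f"{service}{asic}" for asic in asic_ids for service in services]
    return [name for name in expected if name not in running]


# Illustrative input only; not captured from a real system.
example_output = "database\nsyncd0\nswss0\nsyncd3\n"
missing = missing_containers(example_output, [0, 3])
```

Any name in the returned list is a container that has not started for that ASIC.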


@ysmanman
Contributor Author

Adding @arlakshm for visibility.

rlhui added the Chassis 🤖 (Modular chassis support) label on Aug 30, 2022
zhangyanzhao added the Triaged (this issue has been triaged) and MSFT labels on Aug 31, 2022
rlhui added the P0 (Priority of the issue) label on Sep 9, 2022
@ysmanman
Contributor Author

Hi @arlakshm, should #12540 fix this issue too?

@rlhui
Contributor

rlhui commented Nov 20, 2022

@ysmanman - could you please confirm whether the issue is now fixed or still present, since #12540 is merged and in 202205?


@anamehra
Contributor

anamehra commented Dec 3, 2022

I hit this issue again with 202205 5c7c789

@ysmanman
Contributor Author

ysmanman commented Dec 3, 2022

> I hit this issue again with 202205 5c7c789

Hi @anamehra #12780 and sonic-net/sonic-platform-daemons#311 may fix the issue. Can you try latest 202205 image?

@anamehra
Contributor

anamehra commented Dec 4, 2022

> > I hit this issue again with 202205 5c7c789
>
> Hi @anamehra #12780 and sonic-net/sonic-platform-daemons#311 may fix the issue. Can you try latest 202205 image?

Thanks!
It looks like the two commits were out of sync. We got the changes for asic_status.py, but this one came in later.

We also need a fix in the multi_asic.py script so that get_asic_presence_list() uses the FABRIC ASIC table to build its result on the sup.

@abdosi
@bmridul
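
As an illustration of the idea above, a presence list can be derived from chassis DB keys. This is a hypothetical sketch only: it is not the code from multi_asic.py or from #12780, the helper name `asic_presence_from_keys` is invented, and the default table name and `asic<N>` key format are assumptions taken from the `CHASSIS_ASIC_TABLE|asic3` key queried earlier in this issue. The table the actual fix reads may be named differently.

```python
import re


def asic_presence_from_keys(keys, table="CHASSIS_ASIC_TABLE"):
    """Return the sorted ASIC ids found in keys shaped like '<table>|asic<N>'.

    The table name and key format are assumptions based on the key shown
    earlier in this issue; pass a different table name if needed.
    """
    pattern = re.compile(r"^%s\|asic(\d+)$" % re.escape(table))
    ids = set()
    for key in keys:
        match = pattern.match(key)
        if match:
            ids.add(int(match.group(1)))
    return sorted(ids)


# Illustrative keys; only the first mirrors a key shown in this issue.
example_keys = [
    "CHASSIS_ASIC_TABLE|asic3",
    "CHASSIS_ASIC_TABLE|asic0",
    "SOME_OTHER_TABLE|asic9",
]
present = asic_presence_from_keys(example_keys)
```

Keys from other tables are ignored, so the result reflects only the ASICs published in the chosen table.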

@arlakshm
Contributor

arlakshm commented Dec 4, 2022

@anamehra
Contributor

anamehra commented Dec 4, 2022

> @anamehra this change is done in #12780 https://github.com/sonic-net/sonic-buildimage/pull/12780/files#diff-d124f90ae4ff343a55a35d310ec376445f5c9d81300bf3a091bbb7bbae995d01

Thanks! I found that. I looked at the files in master earlier, where it's not present.

@arlakshm
Contributor

Fixed with PRs #12780 and sonic-net/sonic-platform-daemons#311
