Syncd and orchagent crash when warm-reboot command is executed via ssh #3238

Open
nazeerhussainf opened this issue Jul 30, 2019 · 3 comments
@nazeerhussainf

Description

When a warm-reboot command is issued via ssh, orchagent and syncd crash and the switch does not perform the warm reboot.
This crash happens on the Dell Z9100 platform.
The show-tech output, config_db.json, and core files are attached (warm-reboot_crash.zip).

johnar@83f6110999f3:~/sonic-mgmt/ansible$ ssh -o StrictHostKeyChecking=no root@10.11.150.14 sudo warm-reboot
Warning: Permanently added '10.11.150.14' (RSA) to the list of known hosts.
root@10.11.150.14's password:

johnar@83f6110999f3:~/sonic-mgmt/ansible$

root@sonic-z9100-02:/var/core# ls -al
total 10840
drwxr-xr-x 1 root root 4096 Jul 30 13:49 .
drwxr-xr-x 1 root root 4096 Jul 30 13:48 ..
-rw-r--r-- 1 root root 201814 Jul 30 13:50 9100-tech-support_30_Jul.txt
-rw-rw-rw- 1 root root 258930 Jul 30 13:47 orchagent.1564494434.77.core.gz
-rw-rw-rw- 1 root root 258544 Jul 30 13:49 orchagent.1564494555.79.core.gz
-rw-rw-rw- 1 root root 10360227 Jul 30 13:45 syncd.1564494331.41.core.gz
root@sonic-z9100-02:/var/core#
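
The core file names above appear to follow a `<process>.<epoch seconds>.<pid>.core.gz` layout. That layout is an assumption inferred from the listing (the decoded times line up with the `ls` timestamps), not something confirmed in this thread. Under that assumption, a small sketch like the one below turns each name into a readable crash time. `decode_core_name` is a hypothetical helper, and `date -d @N` requires GNU date.

```shell
#!/bin/sh
# Hypothetical helper: split "<process>.<epoch>.<pid>.core.gz" (assumed layout)
# and print the process name, PID, and crash time in UTC.
decode_core_name() (
    name=$1
    proc=${name%%.*}      # text before the first dot
    rest=${name#*.}       # drop "<process>."
    epoch=${rest%%.*}     # second field: seconds since the Unix epoch (assumed)
    rest=${rest#*.}       # drop "<epoch>."
    pid=${rest%%.*}       # third field: PID of the crashed process (assumed)
    # GNU date: -u prints UTC, -d "@N" interprets N as epoch seconds
    when=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')
    printf '%s pid=%s crashed_utc=%s\n' "$proc" "$pid" "$when"
)

for f in syncd.1564494331.41.core.gz \
         orchagent.1564494434.77.core.gz \
         orchagent.1564494555.79.core.gz; do
    decode_core_name "$f"
done
```

If the assumption holds, the syncd core is the earliest of the three, followed by two orchagent cores roughly two minutes apart.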

Steps to reproduce the issue:
Execute the warm-reboot command via ssh:
johnar@83f6110999f3:~/sonic-mgmt/ansible$ ssh -o StrictHostKeyChecking=no root@10.11.150.14 sudo warm-reboot
Warning: Permanently added '10.11.150.14' (RSA) to the list of known hosts.
root@10.11.150.14's password:

johnar@83f6110999f3:~/sonic-mgmt/ansible$

Describe the results you received:
Both orchagent and syncd crashed.

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):
warm-reboot_crash.zip

**Output of `show version`:**

root@sonic-z9100-02:/var/core# show version

SONiC Software Version: SONiC.HEAD.48-7271fe59
Distribution: Debian 9.9
Kernel: 4.9.0-9-2-amd64
Build commit: 7271fe5
Build date: Tue Jul 30 08:08:32 UTC 2019
Built by: johnar@jenkins-worker-4

Platform: x86_64-dell_z9100_c2538-r0
HwSKU: Force10-Z9100-C8D48
ASIC: broadcom
Serial Number: CN0GTX3X7793151D0003
Uptime: 13:48:07 up 14 min, 1 user, load average: 3.04, 2.23, 1.25

Docker images:
REPOSITORY                 TAG                IMAGE ID       SIZE
docker-syncd-brcm          HEAD.48-7271fe59   0c572a842328   391MB
docker-syncd-brcm          latest             0c572a842328   391MB
docker-fpm-frr             HEAD.48-7271fe59   241b6c171bd1   319MB
docker-fpm-frr             latest             241b6c171bd1   319MB
docker-lldp-sv2            HEAD.48-7271fe59   cc364572e75f   298MB
docker-lldp-sv2            latest             cc364572e75f   298MB
docker-dhcp-relay          HEAD.48-7271fe59   8902f40c7d3c   289MB
docker-dhcp-relay          latest             8902f40c7d3c   289MB
docker-database            HEAD.48-7271fe59   39ac0d69edbd   281MB
docker-database            latest             39ac0d69edbd   281MB
docker-snmp-sv2            HEAD.48-7271fe59   70c198b1b574   312MB
docker-snmp-sv2            latest             70c198b1b574   312MB
docker-orchagent           HEAD.48-7271fe59   8d72459f0c6d   321MB
docker-orchagent           latest             8d72459f0c6d   321MB
docker-teamd               HEAD.48-7271fe59   27eb84cbe66e   302MB
docker-teamd               latest             27eb84cbe66e   302MB
docker-sonic-telemetry     HEAD.48-7271fe59   53e45bb45e96   304MB
docker-sonic-telemetry     latest             53e45bb45e96   304MB
docker-router-advertiser   HEAD.48-7271fe59   d8e09119e7ee   281MB
docker-router-advertiser   latest             d8e09119e7ee   281MB
docker-platform-monitor    HEAD.48-7271fe59   eaba4fbb3cf3   325MB
docker-platform-monitor    latest             eaba4fbb3cf3   325MB

root@sonic-z9100-02:

@yxieca
Contributor

yxieca commented Sep 12, 2019

@nazeerhussainf SONiC images do not have root account login enabled by default; we use the admin account for access and management. We regularly ssh into SONiC devices and issue "sudo warm-reboot" without any issue.

Can you try to run this test with the admin account instead?
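
For reference, the admin-account variant of the command from the report might look like the sketch below. `warm_reboot_cmd` is a hypothetical helper that only prints the command rather than running it. The `-t` flag (force pseudo-terminal allocation) is not something suggested in this thread; it is included on the assumption that sudo may need a terminal to prompt for a password, and whether a TTY has any bearing on the crash is not established here.

```shell
#!/bin/sh
# Hypothetical helper: print the ssh invocation for a warm reboot as a given user.
# It does not execute anything on the switch.
warm_reboot_cmd() {
    # $1 = login user, $2 = switch address
    # -t is an added assumption (lets sudo prompt on a terminal if it needs to)
    printf 'ssh -t -o StrictHostKeyChecking=no %s@%s sudo warm-reboot\n' "$1" "$2"
}

# Address taken from the original report; "admin" is the account suggested above.
warm_reboot_cmd admin 10.11.150.14
```

Printing the command first makes it easy to compare the root and admin invocations side by side before touching the device.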

@yxieca yxieca self-assigned this Sep 12, 2019
@yxieca
Contributor

yxieca commented Nov 5, 2019

@nazeerhussainf If you can still reproduce the issue, can you attach the syslog instead of the core files?
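
One way to trim the syslog down before attaching it is sketched below. `crash_context` is a hypothetical helper, and the match pattern is an illustrative guess at what is relevant (the two crashing daemons plus the warm-reboot script), not an official list. The full syslog may still be what the maintainers want.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines of a log file that mention the
# crashing daemons or the warm-reboot script. The pattern is an assumption.
crash_context() {
    grep -E 'orchagent|syncd|warm-reboot' "$1"
}

# Possible use on the switch (/var/log/syslog is the Debian default path and
# may need sudo to read):
#   crash_context /var/log/syslog > warm-reboot-syslog.txt
```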

@rlhui
Contributor

rlhui commented May 27, 2020

potential duplicate of #3934

mssonicbld added a commit that referenced this issue Mar 27, 2024
…atically (#18474)

#### Why I did it
src/sonic-utilities
```
* faffd73d - (HEAD -> 202311, origin/202311) Modify "show interface transceiver status" CLI to show SW cmis state (#3238) (4 hours ago) [mihirpat1]
```
mssonicbld added a commit that referenced this issue Mar 28, 2024
…atically (#18240)

#### Why I did it
src/sonic-utilities
```
* bdc57206 - (HEAD -> master, origin/master, origin/HEAD) Revert "Fix for Switch Port Modes and VLAN CLI Enhancement (#3108)" (#3246) (89 minutes ago) [jingwenxie]
* e35452b7 - Modify "show interface transceiver status" CLI to show SW cmis state (#3238) (2 days ago) [mihirpat1]
* 04a33e1f - Add "state" field in CONFIG_DB a toggle of the fabric port monitor feature (#2932) (2 days ago) [jfeng-arista]
* 3c489ba5 - Enhance route-check for multi-asic platforms (#3216) (5 days ago) [Deepak Singhal]
* c149e48b - [chassis] Add chassis support for CLI "config qos reload" (#3233) (6 days ago) [wenyiz2021]
* d8541add - Update port2alias (#3217) (8 days ago) [abdosi]
* d4688a8f - [graceful reboot] Add the pre_reboot_hook script execution, add the watchdog arm before the reboot (#3203) (8 days ago) [Vadym Hlushko]
* 125f36f3 - [ipintutil]Handle exception in show ip interfaces command (#3182) (10 days ago) [Sudharsan Dhamal Gopalarathnam]
* 9d532017 - [chassis][show-runningconfig] Fix the show runningconfiguration all issue on the Supervisor (#3194) (2 weeks ago) [Marty Y. Lok]
* 1a9261ce - [Techsupport]Handle SAI kv pair if present in sai common profile (#3196) (2 weeks ago) [Sudharsan Dhamal Gopalarathnam]
* 7466dc4a - Skip the validation of action in acl-loader if capability table in STATE_DB is empty (#3199) (2 weeks ago) [bingwang-ms]
* b879b658 - [Bug] Fix fw_setenv illegel character issue (#3201) (3 weeks ago) [xumia]
* 0b41a560 - [config] Add YANG alerting for override (#3188) (3 weeks ago) [jingwenxie]
* 24683b0c - [show] multi-asic show running test residue (#3198) (3 weeks ago) [jingwenxie]
* 995a797a - CLI to skip polling for periodic information for a port in DomInfoUpdateTask thread (#3187) (3 weeks ago) [mihirpat1]
* 9aa9eaa5 - [config] Add Table hard dependency check (#3159) (3 weeks ago) [jingwenxie]
* 5f0ffcca - [fast/warm-reboot] Put ERR message in syslog when a failure is seen (#3186) (4 weeks ago) [Vaibhav Hemant Dixit]
* 92220dcf - Fix for Switch Port Modes and VLAN CLI Enhancement (#3108) (4 weeks ago) [Saba Akram]
```
mssonicbld added a commit that referenced this issue Apr 4, 2024
…atically (#18561)

#### Why I did it
src/sonic-utilities
```
* 7c721e92 - (HEAD -> 202305, origin/202305) Modify "show interface transceiver status" CLI to show SW cmis state (#3238) (10 hours ago) [mihirpat1]
```