[mellanox] SONiC fails to start on t0 #4367

Closed
nazariig opened this issue Apr 3, 2020 · 1 comment · Fixed by #4375

nazariig commented Apr 3, 2020

Description

The latest 201911 branch is broken: SONiC fails to start on the t0 topology.
The last working version is: SONiC-OS-HEAD.62-a5a11f6e
The likely culprit is: #3888

Steps to reproduce the issue:

  1. Install latest 201911 image
  2. Deploy t0
  3. Reload config

Describe the results you received:

root@sonic:/home/admin# show ip interfaces
Interface        Master    IPv4 address/mask    Admin/Oper    BGP Neighbor    Neighbor IP
---------------  --------  -------------------  ------------  --------------  -------------
Loopback0                  10.1.0.32/32         up/up         N/A             N/A
PortChannel0001            10.0.0.56/31         up/up         ARISTA01T1      10.0.0.57
PortChannel0002            10.0.0.58/31         up/up         ARISTA02T1      10.0.0.59
PortChannel0003            10.0.0.60/31         up/up         ARISTA03T1      10.0.0.61
PortChannel0004            10.0.0.62/31         up/up         ARISTA04T1      10.0.0.63
docker0                    240.127.1.1/24       up/down       N/A             N/A
eth0                       10.210.25.3/22       up/up         N/A             N/A
lo                         127.0.0.1/8          up/up         N/A             N/A

root@sonic:/home/admin# docker exec -ti swss bash
root@sonic:/# supervisorctl status
arp_update                       RUNNING   pid 149, uptime 0:00:46
buffermgrd                       RUNNING   pid 125, uptime 0:00:52
enable_counters                  RUNNING   pid 132, uptime 0:00:51
intfmgrd                         RUNNING   pid 101, uptime 0:00:54
nbrmgrd                          RUNNING   pid 135, uptime 0:00:50
neighsyncd                       RUNNING   pid 61, uptime 0:00:58
orchagent                        RUNNING   pid 37, uptime 0:01:01
portmgrd                         FATAL     Exited too quickly (process log may have details)
portsyncd                        RUNNING   pid 56, uptime 0:01:00
restore_neighbors                EXITED    Apr 03 07:01 PM
rsyslogd                         RUNNING   pid 32, uptime 0:01:03
start.sh                         EXITED    Apr 03 07:02 PM
supervisor-proc-exit-listener    RUNNING   pid 18, uptime 0:01:04
swssconfig                       EXITED    Apr 03 07:01 PM
vlanmgrd                         RUNNING   pid 86, uptime 0:00:55
vrfmgrd                          RUNNING   pid 73, uptime 0:00:57
vxlanmgrd                        RUNNING   pid 142, uptime 0:00:48
root@sonic:/# supervisorctl status portmgrd
portmgrd                         FATAL     Exited too quickly (process log may have details)
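
For anyone triaging a similar failure, the `supervisorctl status` output above can be scanned for failed processes automatically. The following is a minimal sketch, not part of SONiC: the function name `failed_processes` and the choice of which states count as failures are assumptions made for illustration. The sample lines are an excerpt copied from the output above.

```python
# Excerpt copied from the `supervisorctl status` output in this report.
STATUS_OUTPUT = """\
orchagent                        RUNNING   pid 37, uptime 0:01:01
portmgrd                         FATAL     Exited too quickly (process log may have details)
portsyncd                        RUNNING   pid 56, uptime 0:01:00
swssconfig                       EXITED    Apr 03 07:01 PM
"""

# States treated as failures. Treating EXITED as healthy is an assumption:
# in this report the one-shot helpers (start.sh, swssconfig,
# restore_neighbors) are EXITED on the working image as well.
FAILED_STATES = {"FATAL", "BACKOFF"}


def failed_processes(status_text):
    """Return (name, state, detail) for every process in a failed state."""
    failed = []
    for line in status_text.splitlines():
        parts = line.split(None, 2)  # name, state, free-form detail
        if len(parts) < 2:
            continue
        name, state = parts[0], parts[1]
        detail = parts[2] if len(parts) > 2 else ""
        if state in FAILED_STATES:
            failed.append((name, state, detail))
    return failed


for name, state, detail in failed_processes(STATUS_OUTPUT):
    print(f"{name}: {state} ({detail})")
```

On the excerpt above this reports only `portmgrd`, which matches what is visible by eye in the full listing.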

Describe the results you expected:

root@sonic:/home/admin# show ip interfaces
Interface        Master    IPv4 address/mask    Admin/Oper    BGP Neighbor    Neighbor IP
---------------  --------  -------------------  ------------  --------------  -------------
Loopback0                  10.1.0.32/32         up/up         N/A             N/A
PortChannel0001            10.0.0.56/31         up/up         ARISTA01T1      10.0.0.57
PortChannel0002            10.0.0.58/31         up/up         ARISTA02T1      10.0.0.59
PortChannel0003            10.0.0.60/31         up/up         ARISTA03T1      10.0.0.61
PortChannel0004            10.0.0.62/31         up/up         ARISTA04T1      10.0.0.63
Vlan1000                   192.168.0.1/21       up/up         N/A             N/A
docker0                    240.127.1.1/24       up/down       N/A             N/A
eth0                       10.210.25.3/22       up/up         N/A             N/A
lo                         127.0.0.1/8          up/up         N/A             N/A

root@sonic:/home/admin# docker exec -ti swss bash
root@sonic:/# supervisorctl status
arp_update                       RUNNING   pid 142, uptime 0:02:39
buffermgrd                       RUNNING   pid 126, uptime 0:02:45
enable_counters                  RUNNING   pid 129, uptime 0:02:44
intfmgrd                         RUNNING   pid 104, uptime 0:02:48
nbrmgrd                          RUNNING   pid 132, uptime 0:02:42
neighsyncd                       RUNNING   pid 61, uptime 0:02:52
orchagent                        RUNNING   pid 37, uptime 0:02:55
portmgrd                         RUNNING   pid 123, uptime 0:02:47
portsyncd                        RUNNING   pid 56, uptime 0:02:54
restore_neighbors                EXITED    Apr 03 06:54 PM
rsyslogd                         RUNNING   pid 32, uptime 0:02:57
start.sh                         EXITED    Apr 03 06:54 PM
supervisor-proc-exit-listener    RUNNING   pid 18, uptime 0:02:58
swssconfig                       EXITED    Apr 03 06:54 PM
vlanmgrd                         RUNNING   pid 86, uptime 0:02:49
vrfmgrd                          RUNNING   pid 72, uptime 0:02:50
vxlanmgrd                        RUNNING   pid 135, uptime 0:02:40
root@sonic:/# supervisorctl status portmgrd
portmgrd                         RUNNING   pid 123, uptime 0:03:39
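
The difference between the received and expected `show ip interfaces` tables is easy to miss by eye. A small sketch that compares the interface names in the two tables is shown below; the helper name `interface_names` is hypothetical, and the tables are shortened excerpts of the outputs in this report.

```python
# Shortened excerpts of the two `show ip interfaces` tables in this report.
RECEIVED = """\
Interface        Master    IPv4 address/mask    Admin/Oper    BGP Neighbor    Neighbor IP
---------------  --------  -------------------  ------------  --------------  -------------
Loopback0                  10.1.0.32/32         up/up         N/A             N/A
PortChannel0004            10.0.0.62/31         up/up         ARISTA04T1      10.0.0.63
docker0                    240.127.1.1/24       up/down       N/A             N/A
"""

EXPECTED = """\
Interface        Master    IPv4 address/mask    Admin/Oper    BGP Neighbor    Neighbor IP
---------------  --------  -------------------  ------------  --------------  -------------
Loopback0                  10.1.0.32/32         up/up         N/A             N/A
PortChannel0004            10.0.0.62/31         up/up         ARISTA04T1      10.0.0.63
Vlan1000                   192.168.0.1/21       up/up         N/A             N/A
docker0                    240.127.1.1/24       up/down       N/A             N/A
"""


def interface_names(table_text):
    """Collect the first column of the table, skipping header and separator."""
    lines = table_text.splitlines()
    return {line.split()[0] for line in lines[2:] if line.strip()}


# Interfaces present on the working image but absent on the broken one.
missing = interface_names(EXPECTED) - interface_names(RECEIVED)
print(sorted(missing))
```

The only interface missing from the received output is `Vlan1000`.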

Additional information you deem important (e.g. issue happens only occasionally):

Output of show version:

SONiC Software Version: SONiC.HEAD.63-aa30030f
Distribution: Debian 9.12
Kernel: 4.9.0-11-2-amd64
Build commit: aa30030f
Build date: Fri Apr  3 04:00:37 UTC 2020
Built by: johnar@jenkins-worker-8

Platform: x86_64-mlnx_msn3700c-r0
HwSKU: ACS-MSN3700C
ASIC: mellanox
Uptime: 18:42:01 up 8 min,  1 user,  load average: 3.40, 3.11, 1.68

Docker images:
REPOSITORY                    TAG                 IMAGE ID            SIZE
docker-syncd-mlnx             HEAD.63-aa30030f    1af78a807fb8        382MB
docker-syncd-mlnx             latest              1af78a807fb8        382MB
docker-router-advertiser      HEAD.63-aa30030f    dfd40ee96097        283MB
docker-router-advertiser      latest              dfd40ee96097        283MB
docker-sonic-mgmt-framework   HEAD.63-aa30030f    9b9e7bd7c9c2        420MB
docker-sonic-mgmt-framework   latest              9b9e7bd7c9c2        420MB
docker-platform-monitor       HEAD.63-aa30030f    eb66728b0b6d        628MB
docker-platform-monitor       latest              eb66728b0b6d        628MB
docker-fpm-frr                HEAD.63-aa30030f    56be72464acf        327MB
docker-fpm-frr                latest              56be72464acf        327MB
docker-sflow                  HEAD.63-aa30030f    3c9b99d175c1        307MB
docker-sflow                  latest              3c9b99d175c1        307MB
docker-lldp-sv2               HEAD.63-aa30030f    fd4fdd3e6f73        304MB
docker-lldp-sv2               latest              fd4fdd3e6f73        304MB
docker-dhcp-relay             HEAD.63-aa30030f    f6c7bc2d4a67        293MB
docker-dhcp-relay             latest              f6c7bc2d4a67        293MB
docker-database               HEAD.63-aa30030f    42aa405e949f        283MB
docker-database               latest              42aa405e949f        283MB
docker-teamd                  HEAD.63-aa30030f    0251b0682e04        307MB
docker-teamd                  latest              0251b0682e04        307MB
docker-snmp-sv2               HEAD.63-aa30030f    65741245707e        340MB
docker-snmp-sv2               latest              65741245707e        340MB
docker-orchagent              HEAD.63-aa30030f    440f378fcfe2        325MB
docker-orchagent              latest              440f378fcfe2        325MB
docker-nat                    HEAD.63-aa30030f    7d31d76aa298        309MB
docker-nat                    latest              7d31d76aa298        309MB
docker-sonic-telemetry        HEAD.63-aa30030f    78c16dcf3cc3        344MB
docker-sonic-telemetry        latest              78c16dcf3cc3        344MB

Attach debug file sudo generate_dump:

Apr  3 18:37:15.144592 sonic ERR swss#intfmgrd: :- exec: /sbin/ip address "add" "192.168.0.1/21" broadcast "192.168.7.255" dev "Vlan1000": Success
Apr  3 18:37:15.144643 sonic ERR swss#intfmgrd: :- setIntfIp: Command '/sbin/ip address "add" "192.168.0.1/21" broadcast "192.168.7.255" dev "Vlan1000"' failed with rc 256
Apr  3 18:37:16.412159 sonic ERR swss#portmgrd: :- exec: /sbin/ip link set dev "Ethernet0" mtu "9100": Success
Apr  3 18:37:16.412300 sonic ERR swss#portmgrd: :- main: Runtime error: /sbin/ip link set dev "Ethernet0" mtu "9100" :
Apr  3 18:37:16.703704 sonic INFO swss#supervisord: start.sh portmgrd: ERROR (spawn error)
Apr  3 18:37:17.995162 sonic ERR swss#portmgrd: :- exec: /sbin/ip link set dev "Ethernet0" mtu "9100": Success
Apr  3 18:37:17.995162 sonic ERR swss#portmgrd: :- main: Runtime error: /sbin/ip link set dev "Ethernet0" mtu "9100" :
Apr  3 18:37:20.732584 sonic ERR swss#portmgrd: :- exec: /sbin/ip link set dev "Ethernet0" mtu "9100": Success
Apr  3 18:37:20.732584 sonic ERR swss#portmgrd: :- main: Runtime error: /sbin/ip link set dev "Ethernet0" mtu "9100" :
Apr  3 18:37:24.589801 sonic ERR swss#portmgrd: :- exec: /sbin/ip link set dev "Ethernet0" mtu "9100": Success
Apr  3 18:37:24.589864 sonic ERR swss#portmgrd: :- main: Runtime error: /sbin/ip link set dev "Ethernet0" mtu "9100" :
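
A note on reading `rc 256` in the intfmgrd line above. If that value is a raw wait status of the kind returned by `pclose()` or `system()` (an assumption; the log does not say how the swss exec helper produces it), then it encodes exit code 1 rather than a literal return code of 256. The sketch below decodes it with the standard POSIX wait-status helpers from Python's `os` module.

```python
import os

RC = 256  # value from the intfmgrd "failed with rc 256" log line

# Assumption: RC is a raw wait status. On Linux the exit code is stored in
# bits 8-15 and the terminating signal (if any) in the low 7 bits.
if os.WIFEXITED(RC):
    exit_code = os.WEXITSTATUS(RC)
    print(f"command exited with code {exit_code}")
elif os.WIFSIGNALED(RC):
    print(f"command was killed by signal {os.WTERMSIG(RC)}")
```

Under that assumption the `ip address add` command ran and exited with code 1, which would also fit the trailing `Success` in the `exec:` lines: the command itself failed, but no system-call error was recorded for launching it.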