Z9264/201911-170/Orchagent and python crash with warm-reboot. Kernel crash seen often #5258
Comments
Issue in 201911.
Need more information:
Seems like a HW watchdog issue and Dell specific. cc: @lguohan
There is a hardware watchdog issue; it must be a Dell platform-specific issue in the going-down path.
Description:
While performing warm-reboot and fast-reboot PTF testcases in T0 topology, python core is generated. This is consistently seen in post-August NOS image. The python corefile is generated in both T0 and T1 topologies when a simple ‘warm-reboot -vvv’ command is issued on the CLI.
Platform:
x86_64-delta_evs-a-32q56-r0 with T0 topology loaded.
Build:
root@sonic-k2svk26-dut:~# show ver
SONiC Software Version: SONiC.SONiC201911_Innovium.0-dirty-20200820.233443
Distribution: Debian 9.13
Kernel: 4.9.0-11-2-amd64
Build commit: 427382b
Build date: Fri Aug 21 09:40:12 UTC 2020
Built by: admin@serv706
Platform: x86_64-delta_evs-a-32q56-r0
HwSKU: Delta-evs-a-32q56_32x100
ASIC: innovium
Serial Number: EVS32Q56DG2032006
Uptime: 05:05:50 up 1 min, 2 users, load average: 0.82, 0.29, 0.10
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-invm-rpc latest a17c6050eace 1.24GB
Steps to repro:
With T1 or T0 config loaded in the box, perform a warm-reboot
root@sonic:~# warm-reboot -vvv
Tue Sep 15 05:08:48 PDT 2020 Pausing orchagent ...
Tue Sep 15 05:08:48 PDT 2020 Stopping nat ...
Dumping conntrack entries failed
Error response from daemon: Cannot kill container: nat: No such container: nat
Tue Sep 15 05:08:48 PDT 2020 Stopped nat ...
Tue Sep 15 05:08:48 PDT 2020 Stopping radv ...
Tue Sep 15 05:08:48 PDT 2020 Stopping bgp ...
Tue Sep 15 05:08:49 PDT 2020 Stopped bgp ...
Tue Sep 15 05:08:50 PDT 2020 Initialize pre-shutdown ...
Tue Sep 15 05:08:51 PDT 2020 Requesting pre-shutdown ...
Tue Sep 15 05:08:51 PDT 2020 Waiting for pre-shutdown ...
Tue Sep 15 05:08:51 PDT 2020 Pre-shutdown succeeded ...
Tue Sep 15 05:08:51 PDT 2020 Backing up database ...
Tue Sep 15 05:08:52 PDT 2020 Stopping teamd ...
Tue Sep 15 05:08:52 PDT 2020 Stopped teamd ...
Tue Sep 15 05:08:52 PDT 2020 Stopping syncd ...
Tue Sep 15 05:09:09 PDT 2020 Stopped syncd ...
Tue Sep 15 05:09:09 PDT 2020 Stopping all remaining containers ...
Warning: Stopping telemetry.service, but it can still be activated by:
telemetry.timer
Tue Sep 15 05:09:10 PDT 2020 Stopped all remaining containers ...
Tue Sep 15 05:09:13 PDT 2020 Rebooting with /sbin/kexec -e to SONiC-OS-SONiC201911_Innovium.0-dirty-20200820.233443 ...
root@sonic-k2svk26-dut:
root@sonic-k2svk26-dut:~# [ OK ] Stopped syncd service.
root@sonic-k2svk26-dut:~# ls -lt /var/core/
BACKTRACE
root@sonic-k2svk26-dut:~# gdb /usr/bin/python
The caclmgrd core during warm-reboot is happening on our platform as well.
watchdogutil invoked the 2.0 API, but the 2.0 API was not implemented in the 201911 branch, which caused the crash. #5637 will fix this issue.
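The failure mode described in this comment, a utility importing a module that the branch does not ship, can be illustrated with a guarded import. This is only a sketch of the general pattern and not the change made in #5637; the module name `hypothetical_platform_v2` and the helpers `load_platform_api` and `arm_watchdog` are placeholders invented for illustration.

```python
import importlib


def load_platform_api(module_name):
    """Return the imported module, or None if it is not installed.

    `module_name` is a placeholder; the real package imported by
    watchdogutil is not shown in this issue.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Report the problem instead of letting the traceback abort the caller.
        print("%s - required module not found" % err)
        return None


def arm_watchdog(module_name="hypothetical_platform_v2"):
    """Return True if the watchdog could be armed, False otherwise."""
    api = load_platform_api(module_name)
    if api is None:
        # Skip arming rather than failing in the middle of warm-reboot.
        return False
    # A real implementation would call into `api` here; that interface is
    # outside the scope of this sketch.
    return True
```

With this shape, a caller can check the boolean result and continue the reboot sequence without the watchdog when the module is absent.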
Description:
=============
While performing warm-reboot, python and orchagent cores are generated. This is consistently seen in latest NOv image-HEAD.
Often we see a kernel crash as well. The logs are attached below.
Platform :
===========
DellEMC-Z9264f-C64 with T1 config loaded
Build :
======
root@sonic:~# show ver
SONiC Software Version: SONiC.HEAD.170-c243b8a9
Distribution: Debian 9.13
Kernel: 4.9.0-11-2-amd64
Build commit: c243b8a
Build date: Tue Aug 25 18:59:41 UTC 2020
Built by: johnar@jenkins-worker-7
Platform: x86_64-dellemc_z9264f_c3538-r0
HwSKU: DellEMC-Z9264f-C64
ASIC: broadcom
Serial Number: CN030K8MDND0093E0001
Uptime: 07:09:59 up 4 min, 1 user, load average: 0.96, 1.15, 0.55
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-brcm HEAD.170-c243b8a9 8b26f1bf8730 436MB
Steps to repro:
===============
With T1 config loaded in the box, perform a warm-reboot
root@sonic:~# warm-reboot -vvv
Thu Aug 27 04:22:40 UTC 2020 Pausing orchagent ...
Thu Aug 27 04:22:41 UTC 2020 Stopping nat ...
Error response from daemon: Cannot kill container: nat: No such container: nat
Thu Aug 27 04:22:41 UTC 2020 Stopped nat ...
Thu Aug 27 04:22:41 UTC 2020 Stopping radv ...
Thu Aug 27 04:22:41 UTC 2020 Stopping bgp ...
Thu Aug 27 04:22:42 UTC 2020 Stopped bgp ...
Thu Aug 27 04:22:46 UTC 2020 Initialize pre-shutdown ...
Thu Aug 27 04:22:47 UTC 2020 Requesting pre-shutdown ...
Thu Aug 27 04:22:48 UTC 2020 Waiting for pre-shutdown ...
Thu Aug 27 04:22:54 UTC 2020 Pre-shutdown succeeded ...
Thu Aug 27 04:22:54 UTC 2020 Backing up database ...
Thu Aug 27 04:22:56 UTC 2020 Stopping teamd ...
Thu Aug 27 04:22:56 UTC 2020 Stopped teamd ...
Thu Aug 27 04:22:56 UTC 2020 Stopping syncd ...
Thu Aug 27 04:23:09 UTC 2020 Stopped syncd ...
Thu Aug 27 04:23:09 UTC 2020 Stopping all remaining containers ...
Warning: Stopping telemetry.service, but it can still be activated by:
telemetry.timer
Thu Aug 27 04:23:11 UTC 2020 Stopped all remaining containers ...
Thu Aug 27 04:23:14 UTC 2020 Enabling Watchdog before warm-reboot
Traceback (most recent call last):
File "/usr/bin/watchdogutil", line 8, in <module>
from watchdogutil.main import watchdogutil
File "/usr/lib/python2.7/dist-packages/watchdogutil/main.py", line 15, in <module>
raise ImportError("%s - required module not found" % str(e))
ImportError:
[ 414.310317] kexec_core: Starting new kernel
[ 0.075766] DMAR-IR: Failed to copy IR table for dmar0 from previous kernel
[ 4.880792] DMAR: DRHD: handling fault status reg 2
[ 4.939137] DMAR: [INTR-REMAP] Request device [04:00.0] fault index 49 [fault reason 34] Present field in the IRTE entry is clear
[ 4.987163] rc.local[449]: + sonic-cfggen -y /etc/sonic/sonic_version.yml -v build_version
[ 5.406697] kdump-tools[484]: Starting kdump-tools: no crashkernel= parameter in the kernel cmdline ... failed!
[ 6.379721] rc.local[449]: + SONIC_VERSION=HEAD.170-c243b8a9
[ 6.456690] rc.local[449]: + FIRST_BOOT_FILE=/host/image-HEAD.170-c243b8a9/platform/firsttime
[ 6.574341] rc.local[449]: + logger SONiC version HEAD.170-c243b8a9 starting up...
[ 6.676308] rc.local[449]: + [ ! -e /host/machine.conf ]
[ 6.748245] rc.local[449]: + . /host/machine.conf
[ 6.808361] rc.local[449]: + onie_version=3.42.1.B-6
[ 6.868366] rc.local[449]: + onie_vendor_id=674
[ 6.936113] rc.local[449]: + onie_platform=x86_64-dellemc_z9264f_c3538-r0
[ 7.028385] rc.local[449]: + onie_machine=dellemc_z9264f_c3538
[ 7.104376] rc.local[449]: + onie_machine_rev=0
[ 7.164447] rc.local[449]: + onie_arch=x86_64
[ 7.224271] rc.local[449]: + onie_config_version=1
[ 7.490195] c3xxx 0000:01:00.0: Failed to send admin msg to accelerator
[ 7.569294] c3xxx 0000:01:00.0: Failed to send init message
[ 7.376463] rc.local[449]: + onie_build_date=2019-10-16T18:56-07:00
[ 7.516407] rc.local[449]: + onie_partition_type=gpt
[ 7.576399] rc.local[449]: + onie_kernel_version=4.9.30
[ 7.648386] rc.local[449]: + onie_firmware=auto
[ 7.708397] rc.local[449]: + onie_switch_asic=bcm
[ 7.768401] rc.local[449]: + onie_skip_ethmgmt_macs=no
[ 7.840400] rc.local[449]: + program_console_speed
[ 7.962119] rc.local[449]: + cat /proc/cmdline
[ 8.041394] rc.local[449]: + grep -Eo console=ttyS[0-9]+,[0-9]+
[ 8.120595] rc.local[449]: + cut -d , -f2
[ 8.176915] rc.local[449]: + speed=9600
[ 8.236417] rc.local[449]: + [ -z 9600 ]
[ 8.288459] rc.local[449]: + CONSOLE_SPEED=9600
[ 8.352416] rc.local[449]: + sed -i s|--keep-baud .* %I| 9600 %I|g /lib/systemd/system/serial-getty@.service
[ 8.476404] rc.local[449]: + systemctl daemon-reload
[ 8.536383] rc.local[449]: + [ -f /host/image-HEAD.170-c243b8a9/platform/firsttime ]
[ 8.640423] rc.local[449]: + [ -f /var/log/fsck.log.gz ]
[ 8.714123] rc.local[449]: + gunzip -d -c /var/log/fsck.log.gz
[ 8.788944] rc.local[449]: + logger -t FSCK
[ 8.848795] rc.local[449]: + rm -f /var/log/fsck.log.gz
[ 8.912388] rc.local[449]: + exit 0
Debian GNU/Linux 9 sonic ttyS0
sonic login: [ OK ] Started Update hostname based on configdb.
Stopping System Logging Service...
[ OK ] Stopped System Logging Service.
Starting System Logging Service...
[ OK ] Started System Logging Service.
[ OK ] Started Update rsyslog configuration.
[ OK ] Started BGP container.
[ OK ] Started Platform monitor container.
Stopping ifupdown2 networking initialization...
[ OK ] Stopped ifupdown2 networking initialization.
Starting ifupdown2 networking initialization...
[ OK ] Started ifupdown2 networking initialization.
[ OK ] Started Update interfaces configuration.
Starting switch state service...
[ OK ] Started switch state service.
Starting syncd service...
[ OK ] Started watchdog control service.
Starting TEAMD container...
[ OK ] Started TEAMD container.
[ OK ] Started syncd service.
Starting LLDP container...
Starting DHCP relay container...
Starting Management Framework container...
Starting Router advertiser container...
[ OK ] Started Router advertiser container.
[ OK ] Started Management Framework container.
[ OK ] Started DHCP relay container.
[ OK ] Started LLDP container.
Starting Update NTP configuration...
Stopping LSB: Start NTP daemon...
[ OK ] Stopped LSB: Start NTP daemon.
Starting LSB: Start NTP daemon...
[ OK ] Started LSB: Start NTP daemon.
[ OK ] Started Update NTP configuration.
Debian GNU/Linux 9 sonic ttyS0
sonic login: admin
Password:
Login incorrect
sonic login: admin
Password:
Last login: Thu Aug 27 04:20:20 UTC 2020 on ttyS0
Linux sonic 4.9.0-11-2-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64
You are on
  ____   ___  _   _ _  ____
 / ___| / _ \| \ | (_)/ ___|
 \___ \| | | |  \| | | |
  ___) | |_| | |\  | | |___
 |____/ \___/|_| \_|_|\____|
-- Software for Open Networking in the Cloud --
Unauthorized access and/or use are prohibited.
All access and/or use are subject to monitoring.
Help: http://azure.github.io/SONiC/
admin@sonic:~$
admin@sonic:~$
admin@sonic:~$
admin@sonic:~$ sudo -i
root@sonic:~#
root@sonic:~#
root@sonic:~#
root@sonic:~#
root@sonic:~#
root@sonic:~# ls -l /var/core
total 7908
-rw-rw-rw- 1 root root 8094020 Aug 27 04:23 python.1598502191.1255.core.gz
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
28bc2fb500f3 docker-sonic-telemetry:latest "/usr/bin/supervisord" 16 minutes ago Up 8 seconds telemetry
d63e7415c56a docker-lldp-sv2:latest "/usr/bin/docker-lld…" 20 minutes ago Up 3 minutes lldp
96a4a65bfc34 docker-router-advertiser:latest "/usr/bin/docker-ini…" 20 minutes ago Up 3 minutes radv
0f07e9f6aadc docker-sonic-mgmt-framework:latest "/usr/bin/supervisord" 20 minutes ago Up About a minute mgmt-framework
955fdf680ddf docker-dhcp-relay:latest "/usr/bin/docker_ini…" 20 minutes ago Up 3 minutes dhcp_relay
797ba772beec docker-syncd-brcm:latest "/usr/bin/supervisord" 20 minutes ago Up 3 minutes syncd
b0387b9a1d2b docker-teamd:latest "/usr/bin/supervisord" 20 minutes ago Up 3 minutes teamd
c9076d2c96da docker-fpm-frr:latest "/usr/bin/supervisord" 20 minutes ago Up 3 minutes bgp
28cb8500068a docker-platform-monitor:latest "/usr/bin/docker_ini…" 20 minutes ago Up 3 minutes pmon
3195be5b90ce docker-database:latest "/usr/local/bin/dock…" 20 minutes ago Up 3 minutes database
[ OK ] Stopped Delays snmp container until SONiC has started.
root@sonic:/var/core# ls -l
total 43816
-rw-rw-rw- 1 root root 277011 Aug 27 04:25 orchagent.1598502300.47.core.gz
-rw-rw-rw- 1 root root 276711 Aug 27 04:26 orchagent.1598502410.50.core.gz
-rw-rw-rw- 1 root root 277774 Aug 27 04:28 orchagent.1598502521.50.core.gz
-rw-rw-rw- 1 root root 276775 Aug 27 04:30 orchagent.1598502631.47.core.gz
-rw-rw-rw- 1 root root 43753472 Aug 27 04:23 python.1598502191.1255.core
root@sonic:/var/core# gdb python python.1598502191.1255.core
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...(no debugging symbols found)...done.
warning: core file may not match specified executable file.
[New LWP 1255]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `python /usr/bin/caclmgrd'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fd2a6f2afff in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007fd2a6f2afff in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fd2a6f2c42a in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fd2a31740ad in __gnu_cxx::__verbose_terminate_handler() ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007fd2a3172066 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007fd2a31720b1 in std::terminate() ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fd2a31722c9 in __cxa_throw ()
from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fd2a452e4ea in swss::SubscriberStateTable::readData() ()
from /usr/lib/x86_64-linux-gnu/libswsscommon.so.0
#7 0x00007fd2a451e387 in swss::Select::poll_descriptors(swss::Selectable**, unsigned int) () from /usr/lib/x86_64-linux-gnu/libswsscommon.so.0
#8 0x00007fd2a451e54b in swss::Select::select(swss::Selectable**, int) ()
from /usr/lib/x86_64-linux-gnu/libswsscommon.so.0
#9 0x00007fd2a4766c6b in _wrap_Select_select__SWIG_0 (args=)
at swsscommon_wrap.cpp:13075
#10 _wrap_Select_select (self=, args=)
at swsscommon_wrap.cpp:13162
#11 0x000055e19cf59a6f in PyEval_EvalFrameEx ()
#12 0x000055e19cf57c35 in PyEval_EvalCodeEx ()
#13 0x000055e19cf5f3e9 in PyEval_EvalFrameEx ()
#14 0x000055e19cf5f0df in PyEval_EvalFrameEx ()
---Type <return> to continue, or q <return> to quit---
#15 0x000055e19cf5f0df in PyEval_EvalFrameEx ()
#16 0x000055e19cf57c35 in PyEval_EvalCodeEx ()
#17 0x000055e19cf579d9 in PyEval_EvalCode ()
#18 0x000055e19cf8844f in ?? ()
#19 0x000055e19cf83262 in PyRun_FileExFlags ()
#20 0x000055e19cf82dae in PyRun_SimpleFileExFlags ()
#21 0x000055e19cf335f1 in Py_Main ()
#22 0x00007fd2a6f182e1 in __libc_start_main ()
from /lib/x86_64-linux-gnu/libc.so.6
#23 0x000055e19cf32e1a in _start ()
(gdb)
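The core file names listed before the gdb session appear to encode the program name, a Unix timestamp, and a PID. The short sketch below decodes them to line the crashes up with the warm-reboot log. The `<program>.<timestamp>.<pid>.core[.gz]` layout is an assumption inferred from the names in this report, and `parse_core_name` is a hypothetical helper, not part of SONiC.

```python
import re
from datetime import datetime, timezone

# Assumed layout: <program>.<unix timestamp>.<pid>.core[.gz]
CORE_NAME = re.compile(r"^(?P<prog>.+)\.(?P<ts>\d+)\.(?P<pid>\d+)\.core(?:\.gz)?$")


def parse_core_name(name):
    """Split a core file name into (program, UTC datetime, pid)."""
    match = CORE_NAME.match(name)
    if match is None:
        raise ValueError("unrecognized core file name: %s" % name)
    when = datetime.fromtimestamp(int(match.group("ts")), tz=timezone.utc)
    return match.group("prog"), when, int(match.group("pid"))


# File names copied from the `ls -l` output in this report.
names = [
    "python.1598502191.1255.core",
    "orchagent.1598502300.47.core.gz",
    "orchagent.1598502410.50.core.gz",
    "orchagent.1598502521.50.core.gz",
    "orchagent.1598502631.47.core.gz",
]

parsed = [parse_core_name(n) for n in names]
for prog, when, pid in parsed:
    print(prog, when.strftime("%Y-%m-%d %H:%M:%S"), pid)

# Seconds between consecutive orchagent cores.
orch_times = [when for prog, when, _ in parsed if prog == "orchagent"]
gaps = [int((b - a).total_seconds()) for a, b in zip(orch_times, orch_times[1:])]
print(gaps)
```

Under that assumption, the python core decodes to 04:23:11 UTC with PID 1255, which matches both the "Stopped all remaining containers" log line and the `[New LWP 1255]` line in gdb. The orchagent cores are spaced roughly 110 seconds apart, which suggests orchagent is restarting and crashing repeatedly after the reboot.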
Kernel crash logs
====================
root@sonic:~# warm-reboot -vvv
Thu Aug 27 04:13:00 UTC 2020 Pausing orchagent ...
Thu Aug 27 04:13:00 UTC 2020 Stopping nat ...
Error response from daemon: Cannot kill container: nat: No such container: nat
Thu Aug 27 04:13:01 UTC 2020 Stopped nat ...
Thu Aug 27 04:13:01 UTC 2020 Stopping radv ...
Thu Aug 27 04:13:01 UTC 2020 Stopping bgp ...
Thu Aug 27 04:13:02 UTC 2020 Stopped bgp ...
Thu Aug 27 04:13:06 UTC 2020 Initialize pre-shutdown ...
Thu Aug 27 04:13:07 UTC 2020 Requesting pre-shutdown ...
Thu Aug 27 04:13:08 UTC 2020 Waiting for pre-shutdown ...
Thu Aug 27 04:13:08 UTC 2020 Pre-shutdown succeeded ...
Thu Aug 27 04:13:08 UTC 2020 Backing up database ...
Thu Aug 27 04:13:10 UTC 2020 Stopping teamd ...
Thu Aug 27 04:13:10 UTC 2020 Stopped teamd ...
Thu Aug 27 04:13:10 UTC 2020 Stopping syncd ...
[ 436.866859] BUG: unable to handle kernel paging request at ffff8e240a077ffe
[ 436.950307] IP: [] free_block+0x107/0x1c0
[ 437.017073] PGD 3eb334067 [ 437.047337] PUD 0
[ 437.071360]
[ 437.089145] Oops: 0002 [#1] SMP
[ 437.126688] Modules linked in: dummy xt_hl nf_conntrack_ipv6 nf_defrag_ipv6 xt_TCPMSS xt_tcpudp iptable_mangle optoe at24 nvmem_core dell_z9264f_fpga_ocores(O) i2c_ocores ipmi_devintf i2c_mux_pca954x i2c_mux i2c_dev nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo qat_c3xxx intel_qat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel bonding linux_knet_cb(O) linux_bcm_knet(O) dh_generic psample(O) linux_user_bde(O) linux_kernel_bde(O) evdev dcdbas kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel efi_pstore sg efivars button rsa_generic asn1_decoder mpi pcspkr ipmi_si iTCO_wdt ipmi_msghandler iTCO_vendor_support authenc shpchp ip6table_filter ip6_tables xt_conntrack ebt_vlan iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 ebtable_broute bridge xt_addrtype stp llc iptable_nat nf_conntrack_ipv4 ebtable_nat nf_defrag_ipv4 nf_nat_ipv4 nf_nat ebtable_filter nf_conntrack ebtables efivarfs ip_tables x_tables autofs4 loop ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache nls_utf8 nls_cp437 nls_ascii vfat fat overlay squashfs sd_mod mmc_block crc32c_intel i2c_ismt ahci libahci aesni_intel sdhci_pci xhci_pci sdhci xhci_hcd aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd usbcore usb_common i2c_i801 i2c_smbus mmc_core libata scsi_mod igb i2c_algo_bit dca ptp pps_core thermal
[ 438.528288] CPU: 2 PID: 61 Comm: kworker/2:1 Tainted: G W O 4.9.0-11-2-amd64 #1 Debian 4.9.189-3+deb9u2
[ 438.652153] Hardware name: Dell EMC Z9264F-ON/Z9264F-ON-CPU, BIOS 3.42.0.B-4b 04/16/2020
[ 438.748977] Workqueue: events cache_reap
[ 438.795978] task: ffff8e236d267000 task.stack: ffffaf6c81d2c000
[ 438.866805] RIP: 0010:[] [] free_block+0x107/0x1c0
[ 438.962692] RSP: 0018:ffffaf6c81d2fd18 EFLAGS: 00010086
[ 439.026236] RAX: 00000000ffffffff RBX: ffff8e237fd1bca0 RCX: 00000000fffffffe
[ 439.111619] RDX: 000000000000281e RSI: 0000000000000000 RDI: ffff8e230a078000
[ 439.197003] RBP: ffff8e236f0009c0 R08: 0000000000000000 R09: 0000000000018644
[ 439.282385] R10: 0000000000000002 R11: 0000000000000001 R12: ffff8e230a078000
[ 439.367769] R13: fffffb0f90281e20 R14: fffffb0f90281e00 R15: ffff8e236f001180
[ 439.453153] FS: 0000000000000000(0000) GS:ffff8e237fd00000(0000) knlGS:0000000000000000
[ 439.549977] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 439.618718] CR2: ffff8e240a077ffe CR3: 00000003eb008000 CR4: 0000000000340670
[ 439.704099] Stack:
[ 439.728123] ffffaf6c81d2fd98 ffff8e237fd1bca8 ffff8e236f001188 ffff8e236f0011a8
[ 439.817043] ffff8e237fd1bc90 0000000000000001 0000000000000001 ffff8e237fd1bca0
[ 439.905962] 0000000000000000 ffff8e237fd16d88 ffffffffa99e72af ffff8e237fd1bc90
[ 439.994880] Call Trace:
[ 440.024106] [] ? drain_array_locked+0x5f/0x90
[ 440.095971] [] ? drain_array+0x85/0xb0
[ 440.160553] [] ? cache_reap+0x7c/0x210
[ 440.225137] [] ? process_one_work+0x18a/0x430
[ 440.297000] [] ? worker_thread+0x4d/0x490
[ 440.364705] [] ? process_one_work+0x430/0x430
[ 440.436568] [] ? kthread+0xd9/0xf0
[ 440.496991] [] ? __switch_to_asm+0x41/0x70
[ 440.565734] [] ? kthread_park+0x60/0x60
[ 440.631358] [] ? ret_from_fork+0x57/0x70
[ 440.698019] Code: 0f af f9 0f b6 4d 1c 48 c1 ef 20 29 fa d3 ea 0f b6 4d 1d 01 fa 49 8b 7e 10 d3 ea 8d 48 ff 48 85 ff 41 89 4e 18 0f 84 94 00 00 00 <88> 14 0f 41 8b 46 18 85 c0 0f 84 46 ff ff ff 49 8b 77 10 48 8b
[ 440.930560] RIP [] free_block+0x107/0x1c0
[ 440.998369] RSP
[ 441.040071] CR2: ffff8e240a077ffe
[ 441.079698] ---[ end trace ef4c826a61da8907 ]---
[ 441.139688] general protection fault: 0000 [#2] SMP
[ 441.198048] Modules linked in: dummy xt_hl nf_conntrack_ipv6 nf_defrag_ipv6 xt_TCPMSS xt_tcpudp iptable_mangle optoe at24 nvmem_core dell_z9264f_fpga_ocores(O) i2c_ocores ipmi_devintf i2c_mux_pca954x i2c_mux i2c_dev nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo qat_c3xxx intel_qat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel bonding linux_knet_cb(O) linux_bcm_knet(O) dh_generic psample(O) linux_user_bde(O) linux_kernel_bde(O) evdev dcdbas kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel efi_pstore sg efivars button rsa_generic asn1_decoder mpi pcspkr ipmi_si iTCO_wdt ipmi_msghandler iTCO_vendor_support authenc shpchp ip6table_filter ip6_tables xt_conntrack ebt_vlan iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 ebtable_broute bridge xt_addrtype stp llc iptable_nat nf_conntrack_ipv4 ebtable_nat nf_defrag_ipv4 nf_nat_ipv4 nf_nat ebtable_filter nf_conntrack ebtables efivarfs ip_tables x_tables autofs4 loop ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache nls_utf8 nls_cp437 nls_ascii vfat fat overlay squashfs sd_mod mmc_block crc32c_intel i2c_ismt ahci libahci aesni_intel sdhci_pci xhci_pci sdhci xhci_hcd aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd usbcore usb_common i2c_i801 i2c_smbus mmc_core libata scsi_mod igb i2c_algo_bit dca ptp pps_core thermal
[ 442.599642] CPU: 2 PID: 61 Comm: kworker/2:1 Tainted: G D W O 4.9.0-11-2-amd64 #1 Debian 4.9.189-3+deb9u2
[ 442.723509] Hardware name: Dell EMC Z9264F-ON/Z9264F-ON-CPU, BIOS 3.42.0.B-4b 04/16/2020
[ 442.820343] task: ffff8e236d267000 task.stack: ffffaf6c81d2c000
[ 442.891151] RIP: 0010:[] [] __wake_up_common+0x28/0x90
[ 442.991203] RSP: 0018:ffffaf6c81d2fe70 EFLAGS: 00010002
[ 443.054744] RAX: ad0ff1a9a62fa7a5 RBX: ffffaf6c81d2ff10 RCX: 0000000000000000
[ 443.140126] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffaf6c81d2ff10
[ 443.225511] RBP: ffffaf6c81d2ff18 R08: 0000000000000000 R09: 000000007deec019
[ 443.310892] R10: 0000000000000018 R11: 0000000000000000 R12: 0000000000000282
[ 443.396275] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000046
[ 443.481660] FS: 0000000000000000(0000) GS:ffff8e237fd00000(0000) knlGS:0000000000000000
[ 443.578486] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 443.647226] CR2: 0000000000000028 CR3: 00000003eb008000 CR4: 0000000000340670
[ 443.732609] Stack:
[ 443.756633] 0000000100000000 ffffaf6c81d2ff10 ffffaf6c81d2ff08 0000000000000282
[ 443.845552] 0000000000000000 0000000000000001 0000000000000046 ffffffffa98be3f1
[ 443.934470] ffff8e236d267720 ffff8e236d267000 0000000000000000 ffffffffa98780c0
[ 444.023390] Call Trace:
[ 444.052617] [] ? complete+0x31/0x40
[ 444.114081] [] ? mm_release+0xb0/0x130
[ 444.178663] [] ? do_exit+0x14f/0xb60
[ 444.241167] [] ? rewind_stack_do_exit+0x17/0x20
[ 444.315107] Code: 00 00 00 0f 1f 44 00 00 41 57 41 56 41 55 41 54 41 89 cd 55 53 48 89 fd 48 83 c5 08 48 83 ec 08 48 8b 47 08 89 54 24 04 48 39 c5 <48> 8b 08 74 46 48 8d 78 e8 4c 8d 79 e8 41 89 f6 4d 89 c4 8b 1f
[ 444.547648] RIP [] __wake_up_common+0x28/0x90
[ 444.619617] RSP
[ 444.661320] ---[ end trace ef4c826a61da8908 ]---
[ 444.721299] Fixing recursive fault but reboot is needed!
[ 444.784777] BUG: unable to handle kernel paging request at ffffffffffffffd8
[ 444.868278] IP: [] kthread_data+0xc/0x20
[ 444.934007] PGD 3eb00d067 [ 444.964268] PUD 3eb00f067
PMD 0 [ 445.002953]
[ 445.020742] Oops: 0000 [#3] SMP
[ 445.058283] Modules linked in: dummy xt_hl nf_conntrack_ipv6 nf_defrag_ipv6 xt_TCPMSS xt_tcpudp iptable_mangle optoe at24 nvmem_core dell_z9264f_fpga_ocores(O) i2c_ocores ipmi_devintf i2c_mux_pca954x i2c_mux i2c_dev nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo qat_c3xxx intel_qat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel bonding linux_knet_cb(O) linux_bcm_knet(O) dh_generic psample(O) linux_user_bde(O) linux_kernel_bde(O) evdev dcdbas kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel efi_pstore sg efivars button rsa_generic asn1_decoder mpi pcspkr ipmi_si iTCO_wdt ipmi_msghandler iTCO_vendor_support authenc shpchp ip6table_filter ip6_tables xt_conntrack ebt_vlan iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 ebtable_broute bridge xt_addrtype stp llc iptable_nat nf_conntrack_ipv4 ebtable_nat nf_defrag_ipv4 nf_nat_ipv4 nf_nat ebtable_filter nf_conntrack ebtables efivarfs ip_tables x_tables autofs4 loop ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache nls_utf8 nls_cp437 nls_ascii vfat fat overlay squashfs sd_mod mmc_block crc32c_intel i2c_ismt ahci libahci aesni_intel sdhci_pci xhci_pci sdhci xhci_hcd aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd usbcore usb_common i2c_i801 i2c_smbus mmc_core libata scsi_mod igb i2c_algo_bit dca ptp pps_core thermal
[ 446.459882] CPU: 2 PID: 61 Comm: kworker/2:1 Tainted: G D W O 4.9.0-11-2-amd64 #1 Debian 4.9.189-3+deb9u2
[ 446.583748] Hardware name: Dell EMC Z9264F-ON/Z9264F-ON-CPU, BIOS 3.42.0.B-4b 04/16/2020
[ 446.680581] task: ffff8e236d267000 task.stack: ffffaf6c81d2c000
[ 446.751389] RIP: 0010:[] [] kthread_data+0xc/0x20
[ 446.846239] RSP: 0018:ffffaf6c81d2fe70 EFLAGS: 00010002
[ 446.909780] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 446.995165] RDX: 00000299830798e6 RSI: ffff8e236d267080 RDI: ffff8e236d267000
[ 447.080547] RBP: ffffaf6c81d2fec8 R08: ffff8e236d2670a8 R09: 00000000000a0c00
[ 447.165932] R10: 0000000000000000 R11: ffff8e236d267080 R12: ffff8e237fd18980
[ 447.251316] R13: ffff8e236d267000 R14: ffff8e236d2675c0 R15: 0000000000000046
[ 447.336699] FS: 0000000000000000(0000) GS:ffff8e237fd00000(0000) knlGS:0000000000000000
[ 447.433524] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 447.502264] CR2: 0000000000000028 CR3: 00000003eb008000 CR4: 0000000000340670
[ 447.587646] Stack:
[ 447.611669] ffffffffa9895f7a ffffffffa9e1892c ffffaf6c00000008 00ffaf6c81d2fee8
[ 447.700589] ffff8e237fd18980 ad0ff1a9a62fa7a5 ffff8e236d267000 ffffaf6c81d2fdc8
[ 447.789508] 0000000000000000 0000000000000001 0000000000000046 000000000000000b
[ 447.878425] Call Trace:
[ 447.907652] [] ? wq_worker_sleeping+0xa/0x80
[ 447.978477] [] ? __schedule+0x4ac/0x6f0
[ 448.044101] [] ? schedule+0x32/0x80
[ 448.105563] [] ? do_exit+0x8b6/0xb60
[ 448.168069] [] ? rewind_stack_do_exit+0x17/0x20
[ 448.242009] Code: 0b ab 04 00 84 c0 0f 85 50 ff ff ff eb ab e8 bc ac 04 00 e9 9e fe ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 48 8b 87 60 05 00 00 <48> 8b 40 d8 c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f
[ 448.474550] RIP [] kthread_data+0xc/0x20
[ 448.541319] RSP
[ 448.583022] CR2: ffffffffffffffd8
[ 448.622644] ---[ end trace ef4c826a61da8909 ]---
[ 448.682618] Fixing recursive fault but reboot is needed!
Attachment
=========
Please find the syslog attached.
Thanks
Mini