Accton AS7712-32x kernel panic #2201

Closed
mslocrian opened this issue Oct 26, 2018 · 4 comments
@mslocrian
Contributor

Description
I was having issues with some interfaces (the transceivers were not receiving light). I installed an alternative NOS as a sanity check and saw light. I then reinstalled SONiC and started getting the following kernel panic on boot.

[  OK  ] Started LSB: Start NTP daemon.
[  OK  ] Started Update NTP configuration.
[  OK  ] Started Platform monitor container.

Debian GNU/Linux 9 sonic ttyS1

sonic login: [   43.931211] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[   43.939982] IP: [<ffffffffc0896809>] _find_lm75_device+0x99/0x130 [accton_as7712_32x_fan]
[   43.949132] PGD 800000042cd00067 [   43.952636] PUD 42ccff067 
PMD 0 [   43.956239] 
[   43.957901] Oops: 0000 [#1] SMP
[   43.961406] Modules linked in: lm75 at24 nvmem_core accton_as7712_32x_psu(O) leds_accton_as7712_32x(O) optoe accton_as7712_32x_fan(O) ym2651y(O) accton_i2c_cpld(O) i2c_mux_pca954x i2c_mux i2c_dev br_netfilter bridge stp llc linux_bcm_knet(O) linux_user_bde(O) linux_kernel_bde(O) iTCO_wdt gpio_ich iTCO_vendor_support intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul evdev sg ghash_clmulni_intel serio_raw intel_cstate pcspkr shpchp button lpc_ich mfd_core acpi_cpufreq ip6table_filter ip6_tables ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_addrtype iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter iptable_mangle ip_tables x_tables autofs4 loop ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache nls_utf8 nls_cp437 nls_ascii vfat fat overlay squashfs sd_mod crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd i2c_i801 i2c_smbus ahci libahci ehci_pci ehci_hcd libata usbcore igb scsi_mod i2c_algo_bit usb_common dca ptp pps_core i2c_ismt
[   44.064477] CPU: 0 PID: 1954 Comm: fancontrol Tainted: G        W  O    4.9.0-7-amd64 #1 Debian 4.9.110-3+deb9u2
[   44.075855] Hardware name: Accton AS7712-32X/AS7712-32X, BIOS 5.6.5 12/26/2016
[   44.083929] task: ffffa03bad44aec0 task.stack: ffffb4d242094000
[   44.090543] RIP: 0010:[<ffffffffc0896809>]  [<ffffffffc0896809>] _find_lm75_device+0x99/0x130 [accton_as7712_32x_fan]
[   44.102416] RSP: 0018:ffffb4d242097da0  EFLAGS: 00010286
[   44.108351] RAX: ffffa03be8c78c20 RBX: 0000000000000000 RCX: 0000000000000000
[   44.116325] RDX: 0000000000000048 RSI: 0000000000000000 RDI: ffffa03be9d63800
[   44.124299] RBP: ffffa03bebc0ad00 R08: 00000000ffff05ef R09: ffffa03beba54540
[   44.132274] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc0896770
[   44.140249] R13: 0000000000000080 R14: 0000000000000001 R15: ffffa03be9d12240
[   44.148224] FS:  00007fab93aa0700(0000) GS:ffffa03bffc00000(0000) knlGS:0000000000000000
[   44.157269] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   44.163688] CR2: 0000000000000050 CR3: 000000042cd12000 CR4: 0000000000100670
[   44.171661] Stack:
[   44.173905]  00000000ebc0ad00 514d7179888e8efa 0000000000000000 ffffa03bebc0ad00
[   44.182193]  ffffffff8587cc69 ffffa03bed1f42a8 ffffa03be9ddfc68 514d7179888e8efa
[   44.190479]  ffffa03bebc0ad00 ffffffffc0896770 ffffa03be8e7b000 ffffffff858bfebd
[   44.198755] Call Trace:
[   44.201493]  [<ffffffff8587cc69>] ? bus_for_each_dev+0x69/0xb0
[   44.208016]  [<ffffffffc0896770>] ? read_devfile_temp1_input+0x1f0/0x1f0 [accton_as7712_32x_fan]
[   44.217841]  [<ffffffff858bfebd>] ? i2c_for_each_dev+0x2d/0x40
[   44.224361]  [<ffffffffc0896174>] ? get_sys_temp+0x34/0x90 [accton_as7712_32x_fan]
[   44.232826]  [<ffffffff8587a36f>] ? dev_attr_show+0x1f/0x50
[   44.239045]  [<ffffffff85687e80>] ? kernfs_seq_start+0x30/0x90
[   44.245563]  [<ffffffff856894c8>] ? sysfs_kf_seq_show+0xb8/0x130
[   44.252277]  [<ffffffff8562cee6>] ? seq_read+0x106/0x400
[   44.258214]  [<ffffffff85606bf1>] ? vfs_read+0x91/0x130
[   44.264052]  [<ffffffff856080c2>] ? SyS_read+0x52/0xc0
[   44.269794]  [<ffffffff85605ca0>] ? generic_file_llseek_size+0xe0/0xe0
[   44.277091]  [<ffffffff85403b7d>] ? do_syscall_64+0x8d/0xf0
[   44.283319]  [<ffffffff85a13c4e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[   44.291100] Code: 00 00 66 83 fa 48 74 0c 66 83 fa 49 74 06 66 83 fa 4a 75 4b 48 85 ff 0f 84 8c 00 00 00 48 8b b0 a0 00 00 00 4c 8b 05 f7 e7 76 c5 <48> 8b 4e 50 48 03 4e 48 4c 39 c1 78 40 0f b6 56 41 0f bf 46 58 
[   44.312519] RIP  [<ffffffffc0896809>] _find_lm75_device+0x99/0x130 [accton_as7712_32x_fan]
[   44.321764]  RSP <ffffb4d242097da0>
[   44.325658] CR2: 0000000000000050
[   44.329483] ---[ end trace 3242afafc1cd6912 ]---

It looks like something may be going on with the fan? I can get that checked out, but I also wanted to raise this here to ask whether others have seen it and to bring it to your attention.
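
For comparing this trace against other reports, the faulting symbol, offset, module, and fault address can be pulled out mechanically. Below is a minimal Python sketch, not part of SONiC; the function name `parse_oops` and the regular expressions are illustrative assumptions that only target the 4.9-style line formats shown in the trace above.

```python
import re

# Patterns for 4.9-style oops lines like the ones in the trace above.
# These are illustrative assumptions, not a complete oops parser.
IP_RE = re.compile(
    r"IP: \[<(?P<addr>[0-9a-f]+)>\] "
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)
FAULT_RE = re.compile(r"NULL pointer dereference at (?P<fault>[0-9a-f]+)")


def parse_oops(text):
    """Return the faulting symbol, offset, module and address, or None."""
    ip = IP_RE.search(text)
    fault = FAULT_RE.search(text)
    if ip is None or fault is None:
        return None
    return {
        "symbol": ip.group("symbol"),
        "offset": int(ip.group("offset"), 16),
        "size": int(ip.group("size"), 16),
        "module": ip.group("module"),
        "fault_address": int(fault.group("fault"), 16),
    }


# Two lines copied from the console output above.
SAMPLE = """\
[   43.931211] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[   43.939982] IP: [<ffffffffc0896809>] _find_lm75_device+0x99/0x130 [accton_as7712_32x_fan]
"""

info = parse_oops(SAMPLE)
print(info)
```

On the two sample lines this reports `_find_lm75_device` in module `accton_as7712_32x_fan`, with a fault address of 0x50.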

The build is:

[   24.320771] rc.local[586]: + [ -d /host/image-master.0-dirty-20181026.080027/platform/x86_64-accton_as7712_32x-r0 ]
[   24.344745] rc.local[586]: + dpkg -i /host/image-master.0-dirty-20181026.080027/platform/x86_64-accton_as7712_32x-r0/sonic-platform-accton-as7712-32x_1.1_amd64.deb
[   24.379307] rc.local[586]: Selecting previously unselected package sonic-platform-accton-as7712-32x.
[   24.400726] rc.local[586]: (Reading database ... 23896 files and directories currently installed.)
[   24.424752] rc.local[586]: Preparing to unpack .../sonic-platform-accton-as7712-32x_1.1_amd64.deb ...

I don't have access to proper show version output.

Has anyone else experienced this, or can anyone shed some light on it?

thanks!
stegen

@lguohan
Collaborator

lguohan commented Oct 27, 2018

Looks like an Accton driver issue. @roylee123, can you take a look at this?
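
The trace seems consistent with a NULL pointer inside the fan module: the bytes marked `<48> 8b 4e 50` in the `Code:` line start the faulting instruction, and they decode to a load from `0x50(%rsi)`, while the register dump shows RSI as 0 and CR2 as 0x50. A minimal Python sketch of that check follows. `decode_mov_load` is a hypothetical helper that handles only this one instruction encoding; it is not a general disassembler.

```python
# 64-bit register names indexed by their 3-bit encoding
# (valid only when no REX.R / REX.B extension bits are set).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load(code):
    """Decode `REX.W 8B /r` with an 8-bit displacement: mov disp8(base), dest.

    Only this single encoding is handled; anything else raises ValueError.
    """
    rex, opcode, modrm, disp = code[:4]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a plain REX.W mov r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [base + disp8] without a SIB byte is handled")
    if disp >= 0x80:
        disp -= 0x100  # disp8 is a signed byte
    return REGS[reg], REGS[rm], disp


faulting = bytes.fromhex("488b4e50")  # "<48> 8b 4e 50" from the Code: line
dest, base, disp = decode_mov_load(faulting)

rsi = 0x0  # RSI value copied from the register dump
fault = rsi + disp
print(f"mov {disp:#x}(%{base}),%{dest} -> fault address {fault:#x}")
```

The computed address, 0x50, matches CR2 in the trace, which suggests the code read a field at offset 0x50 through a pointer that was still NULL.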

@mslocrian
Contributor Author

@roylee123, if you need anything from me, please let me know. I'll be happy to work with you to get whatever debugging information you need.

thanks!

@roylee123
Collaborator

It is supposed to be fixed by #2197.
I confirmed it's OK with https://sonic-jenkins.westus2.cloudapp.azure.com/job/broadcom/job/buildimage-brcm-all/755.
There are still some error messages during kernel startup, but no panic anymore.

@mslocrian
Contributor Author

Confirmed fixed on:

ethos@sonic:~$ show version
SONiC Software Version: SONiC.master.0-dirty-20181030.084359
Distribution: Debian 9.5
Kernel: 4.9.0-7-amd64
Build commit: f1947bd
Build date: Tue Oct 30 19:56:59 UTC 2018
Built by: stegen@hobbes

Thanks @roylee123!

prsunny pushed a commit that referenced this issue Mar 30, 2022
* [202012] - sonic-swss submodule update to include following commits:

fca407a (HEAD) [VNET]Fixing nexthop group delete during route change (#2198)
a9b6b47 [vxlan] Remove tunnel map objects on VNET tunnel removal (#2208)
74e9b9f [FdbOrch] SAI_FDB_EVENT_MOVE generates update with empty update.entry.port_name (#2201)
0a99445 [202012][BFD]Registering BFD state change callback during session creation (#2203)
aebe4a1 [VS test] skip dpb flaky test (#2195) (#2207)