[Unipi] Avoid hang on reboot after updating EEPROM firmware #1127

Open
wants to merge 2 commits into
base: master

@acostach acostach commented May 24, 2024

This PR attempts to solve a hang on UniPi4:

  • The hang is not always reproducible on the bench.
  • It manifests as the device hanging after the SPI driver has been unbound and re-bound to allow access to the EEPROM.
  • When it is reproducible, it is caused by rfkill trying to unregister all LED triggers, even if rfkill is not set as an active trigger.
  • This appears to be caused by the unipi module, which contains multiple drivers. Some of them, such as the LED drivers, are registered multiple times when the SPI driver is re-bound.
  • We attempt to solve this by triggering de-initialization of the unipi module, both to avoid any interference with the EEPROM read/write process and to avoid creating multiple devices for the same LEDs.
  • If the unipi module that is currently running does not support de-initialization, we avoid accessing the EEPROM.
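
The last two points amount to a guard around the EEPROM update. Below is a minimal sketch of that guard, not the code in this PR. It assumes de-initialization is requested by writing to a control file; the `deinit_knob` path and the `flash_eeprom` callback are hypothetical names used only for illustration.

```python
from pathlib import Path
from typing import Callable


def update_eeprom_safely(deinit_knob: Path, flash_eeprom: Callable[[], None]) -> bool:
    """Run flash_eeprom only after the unipi drivers were asked to de-initialize.

    deinit_knob is a HYPOTHETICAL control file standing in for whatever
    interface the running unipi module exposes; it is an assumption.
    Returns True if the EEPROM update ran, False if it was skipped.
    """
    if not deinit_knob.exists():
        # The running module does not support de-initialization:
        # leave the EEPROM alone rather than risk the hang on reboot.
        return False

    # Ask the module to tear down its drivers (LEDs included) first, so that
    # nothing gets registered a second time while the EEPROM is accessed.
    deinit_knob.write_text("1\n")
    flash_eeprom()
    return True
```

Skipping the update when the control file is missing mirrors the fallback in the last point: an outdated EEPROM is preferred over a device that hangs on reboot.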

Example failure logs:

May 24 10:25:20 bbe6adf kernel: Unable to handle kernel paging request at virtual address ffffffe2bba62c80
May 24 10:25:20 bbe6adf kernel: Mem abort info:
May 24 10:25:20 bbe6adf kernel:   ESR = 0x0000000096000046
May 24 10:25:20 bbe6adf kernel:   EC = 0x25: DABT (current EL), IL = 32 bits
May 24 10:25:20 bbe6adf kernel:   SET = 0, FnV = 0
May 24 10:25:22 bbe6adf kernel:   EA = 0, S1PTW = 0
May 24 10:25:22 bbe6adf kernel:   FSC = 0x06: level 2 translation fault
May 24 10:25:23 bbe6adf kernel: Data abort info:
May 24 10:25:23 bbe6adf kernel:   ISV = 0, ISS = 0x00000046
May 24 10:25:23 bbe6adf kernel:   CM = 0, WnR = 1

May 24 10:25:24 bbe6adf kernel: swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000010ab000
May 24 10:25:24 bbe6adf kernel: [ffffffe2bba62c80] pgd=100000007ffff003, p4d=100000007ffff003, pud=100000007ffff003, pmd=0000000000000000
May 24 10:25:24 bbe6adf kernel: Internal error: Oops: 96000046 [#1] PREEMPT SMP
May 24 10:25:24 bbe6adf kernel: Modules linked in: ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip6table_filter ip6_tables xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo br_netfilter c>
May 24 10:25:24 bbe6adf kernel: CPU: 1 PID: 2345 Comm: hciattach Tainted: G         C O      5.15.92-v8 #1
May 24 10:25:24 bbe6adf kernel: Hardware name: Raspberry Pi 4 Model B Rev 1.5 (DT)
May 24 10:25:24 bbe6adf kernel: pstate: 800000c5 (Nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
May 24 10:25:24 bbe6adf kernel: pc : queued_spin_lock_slowpath+0x1e8/0x2c8
May 24 10:25:24 bbe6adf kernel: lr : queued_spin_lock_slowpath+0xa8/0x2c8
May 24 10:25:24 bbe6adf kernel: sp : ffffffc008e8ba60
May 24 10:25:24 bbe6adf kernel: x29: ffffffc008e8ba60 x28: ffffff8042088000 x27: ffffffe2bbd27000
May 24 10:25:24 bbe6adf kernel: x26: 0000000000000000 x25: 0000000000000000 x24: ffffff807fb69c88
May 24 10:25:24 bbe6adf kernel: x23: ffffff807fb69c80 x22: 0000000000080000 x21: ffffff807fb69c80
May 24 10:25:24 bbe6adf kernel: x20: ffffffe2bba62c80 x19: ffffff80412e87dc x18: 0000000000000000
May 24 10:25:24 bbe6adf kernel: x17: 0000000000000005 x16: ffffffe2b98044fc x15: 0000000600000006
May 24 10:25:24 bbe6adf kernel: x14: 0000000000000299 x13: 0000000000000000 x12: 0000000000000005
May 24 10:25:24 bbe6adf kernel: x11: ffffffe2bbd44af8 x10: 0000000000000006 x9 : ffffffe2b90cb7ac
May 24 10:25:24 bbe6adf kernel: x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000001
May 24 10:25:24 bbe6adf kernel: x5 : ffffff807fb690b0 x4 : ffffffe2b9e20890 x3 : ffffff80412e87de
May 24 10:25:24 bbe6adf kernel: x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffffe2bba62c80
May 24 10:25:24 bbe6adf kernel: Call trace:
May 24 10:25:24 bbe6adf kernel:  queued_spin_lock_slowpath+0x1e8/0x2c8
May 24 10:25:24 bbe6adf kernel:  do_raw_spin_lock+0x30/0x40
May 24 10:25:24 bbe6adf kernel:  _raw_spin_lock_irq+0x40/0x50
May 24 10:25:24 bbe6adf kernel:  __down_write_common+0x484/0x4dc
May 24 10:25:24 bbe6adf kernel:  down_write+0x1c/0x28
May 24 10:25:24 bbe6adf kernel:  led_trigger_unregister+0xc4/0xf0
May 24 10:25:24 bbe6adf kernel:  rfkill_unregister+0xa0/0xb0 [rfkill]
May 24 10:25:24 bbe6adf kernel:  hci_unregister_dev+0x118/0x148 [bluetooth]
May 24 10:25:24 bbe6adf kernel:  hci_uart_tty_close+0x90/0xd8 [hci_uart]
May 24 10:25:24 bbe6adf kernel:  tty_ldisc_close+0x4c/0x5c
May 24 10:25:24 bbe6adf kernel:  tty_set_ldisc+0xb4/0x1cc
May 24 10:25:24 bbe6adf kernel:  tty_ioctl+0x460/0x758
May 24 10:25:24 bbe6adf kernel:  vfs_ioctl+0x30/0x50
May 24 10:25:24 bbe6adf kernel:  __arm64_sys_ioctl+0x80/0xb4
May 24 10:25:24 bbe6adf kernel:  invoke_syscall+0x84/0x11c
May 24 10:25:24 bbe6adf kernel:  el0_svc_common.constprop.0+0xcc/0x100
May 24 10:25:24 bbe6adf kernel:  do_el0_svc+0x50/0x90
May 24 10:25:24 bbe6adf kernel:  el0_svc+0x24/0x54
May 24 10:25:24 bbe6adf kernel:  el0t_64_sync_handler+0xb4/0x134
May 24 10:25:24 bbe6adf kernel:  el0t_64_sync+0x1a0/0x1a4
May 24 10:25:24 bbe6adf kernel: Code: 51000421 8b000280 f861db01 910022f8 (f8216817) 
May 24 10:25:24 bbe6adf kernel: ---[ end trace 716c568e2115c0b0 ]---
May 24 10:25:25 bbe6adf kernel: note: hciattach[2345] exited with preempt_count 1
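
For readers unfamiliar with the `Mem abort info` lines, here is a small sketch that decodes the ESR value from the log into the fields the kernel prints. The bit positions follow the AArch64 ESR_ELx layout for data aborts. The function name is made up for this example and is not part of the PR.

```python
def decode_data_abort_esr(esr: int) -> dict:
    """Split an AArch64 ESR value into the fields printed for a data abort."""
    iss = esr & 0x1FFFFFF              # instruction specific syndrome, bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 means a 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 0x1,      # syndrome-valid bit
        "SET": (iss >> 11) & 0x3,
        "FnV": (iss >> 10) & 0x1,
        "EA": (iss >> 9) & 0x1,
        "CM": (iss >> 8) & 0x1,
        "S1PTW": (iss >> 7) & 0x1,
        "WnR": (iss >> 6) & 0x1,       # 1 means the faulting access was a write
        "FSC": iss & 0x3F,             # fault status code, bits [5:0]
    }


fields = decode_data_abort_esr(0x96000046)
print({name: hex(value) for name, value in fields.items()})
```

For `0x96000046` this gives `EC = 0x25`, `ISS = 0x46`, `WnR = 1` and `FSC = 0x06`, matching the log: a write from the current exception level hit a level 2 translation fault while taking the spinlock in `led_trigger_unregister`.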

Upstream ticket: UniPiTechnology/unipi-kernel-modules-v1#1

in the unipi module. This module cannot be unloaded at runtime,
because it contains multiple drivers running in different threads.

Force unloading it will cause a system hang, similar to the one
observed if it registers LED drivers twice and rfkill attempts
to unregister its triggers during reboot from the devices created
by them.

Changelog-entry: unipi-kernel-modules: Allow forced deinitialization of drivers
Signed-off-by: Alexandru Costache <alexandru@balena.io>
to avoid any interference with flashrom as well as with rfkill
unregister, and prevent reboot from hanging.

Signed-off-by: Alexandru Costache <alexandru@balena.io>
@acostach acostach requested review from alexgg and floion May 24, 2024 14:33
@flowzone-app flowzone-app bot enabled auto-merge May 24, 2024 14:35