uefi: patch eventual crash that stops device from booting #1734

Open: wants to merge 2 commits into base branch kirkstone

brt-sarah-newman

With r35.5.0, after several hundred reboots, the EFI variable data was observed to become corrupted and never recover.

The patch

fix: reset the meas buffer after computing the first measurement

which is already applied in 35.6.0, combined with the two patches in this pull request, led to the device booting successfully without clearing the EFI variables.

Signed-off-by: Sarah Newman <sarah.newman@bluerivertech.com>
@brt-sarah-newman

This is completely untested with Jetpack 5.1.4.

brt-sarah-newman commented Nov 15, 2024

Before patch 1:

Ftw: Init.. find first record not SpareCompleted, abort()
E/TC:?? 00 
E/TC:?? 00 User mode data-abort at address 0x40 (translation fault)
E/TC:?? 00  esr 0x92000005  ttbr0 0x200103c1b3000   ttbr1 0x00000000   cidr 0x0
E/TC:?? 00  cpu #0          cpsr 0x60000000
E/TC:?? 00  x0  0000000000000000 x1  0000000000000021
E/TC:?? 00  x2  0000000000000021 x3  0000000040707c78
E/TC:?? 00  x4  0000000040707c77 x5  00000000406e9e70
E/TC:?? 00  x6  00000000000000fc x7  00000000000000fc
E/TC:?? 00  x8  00000000000000fc x9  0000000000000000
E/TC:?? 00  x10 0000000000000000 x11 0000000000000000
E/TC:?? 00  x12 0000000000000002 x13 0000000000000002
E/TC:?? 00  x14 0000000000000001 x15 00000000000000ff
E/TC:?? 00  x16 0000000040614118 x17 00000000000000f8
E/TC:?? 00  x18 0000000000000000 x19 0000000040701120
E/TC:?? 00  x20 0000000000000020 x21 0000000040707c78
E/TC:?? 00  x22 000000000000003f x23 0000000040707c77
E/TC:?? 00  x24 0000000000000000 x25 0000000000001000
E/TC:?? 00  x26 00000000406f2000 x27 000000000003ffe0
E/TC:?? 00  x28 000000004253464e x29 0000000040707bd0
E/TC:?? 00  x30 00000000406120d8 elr 00000000406e9ef0
E/TC:?? 00  sp_el0 0000000040707bd0
E/TC:?? 00  region  0: va 0x0000000040000000 pa 0x000000103c042000 size 0x002000 flags ---R-X
E/TC:?? 00  region  1: va 0x0000000040002000 pa 0x000000103c189000 size 0x001000 flags ---RW-
E/TC:?? 00  region  2: va 0x0000000040004000 pa 0x000000103c240000 size 0x03f000 flags r-xR--
E/TC:?? 00  region  3: va 0x0000000040043000 pa 0x000000103c27f000 size 0x001000 flags rw----
E/TC:?? 00  region  4: va 0x0000000040044000 pa 0x000000103c280000 size 0x00b000 flags r-x---
E/TC:?? 00  region  5: va 0x000000004004f000 pa 0x000000103c28b000 size 0x001000 flags rw----
E/TC:?? 00  region  6: va 0x0000000040050000 pa 0x000000103c28c000 size 0x001000 flags r-x---
E/TC:?? 00  region  7: va 0x0000000040051000 pa 0x000000103c28d000 size 0x2b3000 flags r-xR--
E/TC:?? 00  region  8: va 0x0000000040304000 pa 0x000000103c540000 size 0x30e000 flags rw-RW-
E/TC:?? 00  region  9: va 0x0000000040612000 pa 0x000000103c84e000 size 0x008000 flags r-x---
E/TC:?? 00  region 10: va 0x000000004061a000 pa 0x000000103c856000 size 0x001000 flags rw-RW-
E/TC:?? 00  region 11: va 0x000000004061b000 pa 0x000000103c857000 size 0x001000 flags r-x---
E/TC:?? 00  region 12: va 0x000000004061c000 pa 0x000000103c858000 size 0x002000 flags rw-RW-
E/TC:?? 00  region 13: va 0x000000004061e000 pa 0x000000103c85a000 size 0x005000 flags r-x---
E/TC:?? 00  region 14: va 0x0000000040623000 pa 0x000000103c85f000 size 0x001000 flags rw-RW-
E/TC:?? 00  region 15: va 0x0000000040624000 pa 0x000000103c860000 size 0x001000 flags r-x---
E/TC:?? 00  region 16: va 0x0000000040625000 pa 0x000000103c861000 size 0x0c3000 flags rw-RW-
E/TC:?? 00  region 17: va 0x00000000406e8000 pa 0x000000103c924000 size 0x00a000 flags r-x---
E/TC:?? 00  region 18: va 0x00000000406f2000 pa 0x000000103c92e000 size 0x001000 flags rw-RW-
E/TC:?? 00  region 19: va 0x00000000406f3000 pa 0x000000103c92f000 size 0x001000 flags r-x---
E/TC:?? 00  region 20: va 0x00000000406f4000 pa 0x000000103c930000 size 0x002000 flags rw-RW-
E/TC:?? 00  region 21: va 0x00000000406f6000 pa 0x000000103c932000 size 0x006000 flags r-x---
E/TC:?? 00  region 22: va 0x00000000406fc000 pa 0x000000103c938000 size 0x001000 flags rw-RW-
E/TC:?? 00  region 23: va 0x00000000406fd000 pa 0x000000103c939000 size 0x001000 flags r-x---
E/TC:?? 00  region 24: va 0x00000000406fe000 pa 0x000000103c93a000 size 0x01f000 flags rw-RW-
E/TC:?? 00  region 25: va 0x000000004071d000 pa 0x000000103c959000 size 0x015000 flags rw-RW-
E/TC:?? 00  region 26: va 0x0000000040732000 pa 0x000000000c198000 size 0x001000 flags rw----
E/TC:?? 00  region 27: va 0x0000000040733000 pa 0x0000000003270000 size 0x010000 flags rw----
E/TC:?? 00  region 28: va 0x0000000040743000 pa 0x000000000c390000 size 0x002000 flags rw----
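
To make the dump above easier to read, here is a minimal Python sketch that decodes the `esr` value and looks up which mapped region a few of the register values fall into. The helper names (`parse_regions`, `find_region`, `decode_esr`) are made up for illustration, and only three of the region lines are pasted in; the field layout assumed for `esr` is the standard AArch64 one (EC in bits 31:26, WnR in bit 6, DFSC in bits 5:0).

```python
import re

# Values copied from the abort dump above.
ESR = 0x92000005
ADDRESSES = {
    "fault address": 0x40,
    "elr": 0x406E9EF0,
    "x30": 0x406120D8,
    "sp_el0": 0x40707BD0,
}

# Three region lines copied from the dump; paste the full list in practice.
REGION_LINES = """\
E/TC:?? 00  region  9: va 0x0000000040612000 pa 0x000000103c84e000 size 0x008000 flags r-x---
E/TC:?? 00  region 17: va 0x00000000406e8000 pa 0x000000103c924000 size 0x00a000 flags r-x---
E/TC:?? 00  region 24: va 0x00000000406fe000 pa 0x000000103c93a000 size 0x01f000 flags rw-RW-
"""

REGION_RE = re.compile(
    r"region\s+(\d+): va (0x[0-9a-f]+) pa 0x[0-9a-f]+ size (0x[0-9a-f]+) flags (\S+)"
)


def parse_regions(text):
    """Return (index, va, size, flags) tuples for every region line in text."""
    regions = []
    for m in REGION_RE.finditer(text):
        idx, va, size, flags = m.groups()
        regions.append((int(idx), int(va, 16), int(size, 16), flags))
    return regions


def find_region(regions, addr):
    """Return (index, flags) of the region containing addr, or None if unmapped."""
    for idx, va, size, flags in regions:
        if va <= addr < va + size:
            return idx, flags
    return None


def decode_esr(esr):
    """Split an AArch64 ESR value into the fields relevant to a data abort."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class
        "wnr": (esr >> 6) & 1,      # 1 = write, 0 = read
        "dfsc": esr & 0x3F,         # data fault status code
    }


regions = parse_regions(REGION_LINES)
for name, addr in ADDRESSES.items():
    print(f"{name:14s} {addr:#x} -> {find_region(regions, addr)}")
print({k: hex(v) for k, v in decode_esr(ESR).items()})
```

With these inputs, `esr` decodes to EC 0x24 (data abort from a lower exception level), WnR 0 (a read) and DFSC 0x05 (translation fault, level 1), which matches the "translation fault" in the log. The fault address 0x40 is in no mapped region, which is what a read through a null pointer at offset 0x40 would look like. `x30` lands in region 9, the same region as the `FaultTolerantWriteStandaloneMm.efi` entry point 0x406163CC shown in the grep output, while `elr` lands in region 17; the log does not say which driver region 17 belongs to.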

Grep showing module load order versus abort:

$ grep -aE 'GetLastValidMeasurements:|FtwAbort|Booting Linux on|ErrorASSERT|StandaloneMmCpu|FaultTolerant|User mode data-abort at address' minicom-20240909.11.cap 
Loading MM driver at 0x00040611000 EntryPoint=0x000406163CC FaultTolerantWriteStandaloneMm.efi
E/TC:?? 00 User mode data-abort at address 0x40 (translation fault)

The first patch “FvbWrite: Return EFI_NOT_READY when VarInt is null” hides the issue, but it does not seem entirely correct, because everything FvbWrite needs should already be initialized by that stage.

The second patch “Move StandaloneMm.common.fdf.inc before FaultTolerantWriteStandaloneMm.inf” prevents FvbWrite from being called before VarInt is initialized. With it, the device booted to Linux on the first boot but then failed variable store validation on subsequent boots:

$ grep -aE 'GetLastValidMeasurements:|FtwAbort|Booting Linux on|ErrorASSERT|StandaloneMmCpu|FaultTolerant|User mode data-abort at address' minicom-20240910.17.cap 
Loading MM driver at 0x00040615000 EntryPoint=0x00040617A98 StandaloneMmCpu.efi
Loading MM driver at 0x000404E0000 EntryPoint=0x000404E53CC FaultTolerantWriteStandaloneMm.efi
GetLastValidMeasurements: More than 2 Valid measurements found FE
GetLastValidMeasurements: More than 2 Valid measurements found FE
GetLastValidMeasurements: More than 2 Valid measurements found FE
GetLastValidMeasurements: More than 2 Valid measurements found FE
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd421]
Loading MM driver at 0x00040615000 EntryPoint=0x00040617A98 StandaloneMmCpu.efi
GetLastValidMeasurements: More than 2 Valid measurements found FE
MmFvbSmmVarReady:Var Store Validation failed Device ErrorASSERT [FvbNorFlashStandaloneMm] /workdir/standalone-mm-optee-tegra/35.5.0-r0/edk2-tegra/edk2-nvidia/Silicon/NVIDIA/Drivers/FvbNorFlashDxe/FvbNorFlashStandaloneMm.c(873): ((BOOLEAN)(0==1))
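
The same comparison of the two captures can be scripted. This is a rough Python sketch, not part of the patches: the event names and the helpers `classify` and `cpu_before_ftw` are invented here, and the sample lines are copied from the two grep outputs (with repeats and the tail of the ASSERT line trimmed).

```python
import re

# Patterns for the log lines of interest; the first match wins.
PATTERNS = [
    ("load", re.compile(r"Loading MM driver at \S+ EntryPoint=\S+ (\S+\.efi)")),
    ("abort", re.compile(r"User mode data-abort at address (\S+)")),
    ("too_many_measurements",
     re.compile(r"GetLastValidMeasurements: More than 2 Valid measurements")),
    ("linux", re.compile(r"Booting Linux on")),
    ("validation_failed", re.compile(r"Var Store Validation failed")),
]


def classify(lines):
    """Turn raw log lines into an ordered list of (kind, detail) events."""
    events = []
    for line in lines:
        for kind, pattern in PATTERNS:
            m = pattern.search(line)
            if m:
                events.append((kind, m.group(1) if m.groups() else None))
                break
    return events


def first_index(events, kind, detail=None):
    """Index of the first event of this kind (and detail, if given), or None."""
    for i, (k, d) in enumerate(events):
        if k == kind and (detail is None or d == detail):
            return i
    return None


def cpu_before_ftw(events):
    """True if StandaloneMmCpu.efi was loaded before the FTW driver."""
    cpu = first_index(events, "load", "StandaloneMmCpu.efi")
    ftw = first_index(events, "load", "FaultTolerantWriteStandaloneMm.efi")
    return cpu is not None and ftw is not None and cpu < ftw


before_patches = [
    "Loading MM driver at 0x00040611000 EntryPoint=0x000406163CC FaultTolerantWriteStandaloneMm.efi",
    "E/TC:?? 00 User mode data-abort at address 0x40 (translation fault)",
]
with_second_patch = [
    "Loading MM driver at 0x00040615000 EntryPoint=0x00040617A98 StandaloneMmCpu.efi",
    "Loading MM driver at 0x000404E0000 EntryPoint=0x000404E53CC FaultTolerantWriteStandaloneMm.efi",
    "GetLastValidMeasurements: More than 2 Valid measurements found FE",
    "[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd421]",
    "Loading MM driver at 0x00040615000 EntryPoint=0x00040617A98 StandaloneMmCpu.efi",
    "GetLastValidMeasurements: More than 2 Valid measurements found FE",
    "MmFvbSmmVarReady:Var Store Validation failed Device Error",
]

for name, capture in [("before", before_patches), ("second patch", with_second_patch)]:
    events = classify(capture)
    print(name, "cpu_before_ftw =", cpu_before_ftw(events), [k for k, _ in events])
```

To run it against a real minicom capture, read the file with `open(path, errors="replace")`, since the captures contain non-text bytes (hence the `-a` in the grep commands).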

“fix: reset the meas buffer after computing the first measurement” from Girish Mahadevan <gmahadevan@nvidia.com> fixes the “More than 2 Valid measurements found” error when applied to Jetpack 5.1.3. That patch is already included in Jetpack 5.1.4 and later.
