[BUG] Signature issue on TGL UP Extreme i11 - openssl3 #99

Closed
plbossart opened this issue Jun 14, 2022 · 59 comments
Labels: bug, Signing

@plbossart
Member

plbossart commented Jun 14, 2022

Describe the bug

The firmware generated locally on my NUC does not boot on UpExtreme 11. The daily-build firmware and the 2.1 version boot fine, so it's not a hardware/CSME issue but rather something undocumented or missing in the build environment that the signature needs in order to work.

Standard Ubuntu 22.04 on my side. The same scripts work fine on Up Extreme (WHL).

To Reproduce
Steps to reproduce the behavior:

scripts/xtensa-build-all.sh -d tgl (debug version)
scripts/xtensa-build-all.sh tgl

Reproduction Rate

100%

Expected behavior

no errors on boot

Impact

showstopper

Environment

  1. Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).

    • Kernel: 9eaf67f8c40cc (sof/topic/sof-dev) soundwire: intel: set dev_num_ida_min

    • SOF: e9442e04c (HEAD -> main, origin/main) module_adpater: change the buffer size of waves_codec

  2. Name of the topology file

    • Topology: irrelevant
  3. Name of the platform(s) on which the bug is observed.

    • Platform: TGL UpExtreme i11

Screenshots or console output

[    7.008951] snd_sof_intel_hda_common:hda_cl_copy_fw: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x80000]=0x80000012 timedout
[    7.008957] sof-audio-pci-intel-tgl 0000:00:1f.3: hda_cl_copy_fw: timeout with rom_status_reg (0x80000) read
[    7.008969] snd_sof_intel_hda_common:hda_dsp_stream_trigger: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x140000 successful
[    7.008973] sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[    7.008974] sof-audio-pci-intel-tgl 0000:00:1f.3: Firmware download failed
[    7.008975] sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_IN_PROGRESS (2)
[    7.008978] snd_sof_intel_hda_common:hda_dsp_get_status: sof-audio-pci-intel-tgl 0000:00:1f.3: unknown ROM status value 80000012
[    7.008991] sof-audio-pci-intel-tgl 0000:00:1f.3: extended rom status:  0x80000012 0x2c 0x0 0x0 0x0 0x0 0x2510113 0x0
[    7.008992] sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------
[    7.009016] sof-audio-pci-intel-tgl 0000:00:1f.3: Failed to start DSP
[    7.009017] sof-audio-pci-intel-tgl 0000:00:1f.3: error: failed to boot DSP firmware -110
[    7.009020] snd_sof:sof_set_fw_state: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state change: 2 -> 3
[    7.061361] snd_sof_intel_hda_common:hda_dsp_core_reset_enter: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x4]=0x1d003c timedout
[    7.061368] sof-audio-pci-intel-tgl 0000:00:1f.3: error: hda_dsp_core_reset_enter: timeout on HDA_DSP_REG_ADSPCS read
[    7.061370] sof-audio-pci-intel-tgl 0000:00:1f.3: error: dsp core reset failed: core_mask 1
[    7.062148] snd_sof:sof_set_fw_state: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state change: 3 -> 0
[    7.062173] sof-audio-pci-intel-tgl 0000:00:1f.3: error: sof_probe_work failed err: -110

@plbossart added the bug label Jun 14, 2022

@lgirdwood
Member

@plbossart can you attach your FW? It would be nice to diff its headers against v2.1.
BTW, I assume you are signing with the community key? Do both the v2.1 community and Intel keys work for you?

@lgirdwood
Member

One more thing: is your rimage submodule up to date (and in $PATH)?

@plbossart
Member Author

git log --oneline
b4886bebb (HEAD -> main, origin/main) module_adapter: add zephyr logging support

cd rimage/
git log --oneline
9d45332 (HEAD) Write firmware file micro version to manifest for cAVS platforms

rimage is not in $PATH. It was not needed before, so it must be set up by the scripts.

@plbossart
Member Author

compilation log
log.txt

firmware
sof-tgl.ri.gz

@marc-hb
Contributor

marc-hb commented Jun 14, 2022

This reminds me of this bug fix: 95d887251ee40397300

@fredoh9

fredoh9 commented Jun 14, 2022

Interesting, I compared with mine

  1. SHA1 => same
  2. build log => not so much different
  3. fw binaries => a few bytes are different after "--erase_vars" but not sure that is crucial (thanks @marc-hb )

./sof_ri_info.py ~/Downloads/sof-tgl.ri --erase_vars ~/Downloads/sof-tgl.ri_no_vars

@marc-hb
Contributor

marc-hb commented Jun 14, 2022

fw binaries => a few bytes are different after "--erase_vars" but not sure that is crucial (thanks @marc-hb )

I implemented --erase_vars so that the image after --erase_vars is 100% identical when using the same toolchain (which you obviously did, otherwise the differences would be much bigger). So I do find these few differing bytes worrying. Can you please share the hexdiff?

Also, please share the manifest differences as shown by diff -u <(sof/tools/sof_ri_info.py image1.ri) <(sof/tools/sof_ri_info.py image2.ri). Some randomness is expected because rimage uses a salt (that's why --erase_vars exists), other differences are not.

BTW, --erase_vars is used in CI on every PR to catch any __TIMESTAMP__ (in addition to checkpatch).
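
For readers without a dedicated hexdiff tool, the byte-level comparison requested above can be approximated with a few lines of Python. This is only a sketch: it assumes both images have already been passed through --erase_vars, and the function name differing_offsets and the sample bytes are made up for illustration (they are not part of sof_ri_info.py).

```python
def differing_offsets(a: bytes, b: bytes) -> list:
    """Return the offsets at which two byte strings differ.

    Only the common prefix length is compared; a size mismatch is
    reported separately by the caller.
    """
    common = min(len(a), len(b))
    return [i for i in range(common) if a[i] != b[i]]


# Toy stand-ins for two firmware images. Real use would read the files
# with open(path, "rb").read() instead.
image1 = bytes.fromhex("00112233445566")
image2 = bytes.fromhex("0011ff3344aa66")

for off in differing_offsets(image1, image2):
    print(f"0x{off:06x}: {image1[off]:02x} != {image2[off]:02x}")
if len(image1) != len(image2):
    print("sizes differ:", len(image1), len(image2))
```

A handful of isolated offsets would point at leftover variable fields, while long runs of differences would suggest that code or data actually changed.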

@fredoh9

fredoh9 commented Jun 14, 2022

Don't know the root cause, but from the build log, Source content hash is different.

This is mine,

-- GIT_TAG / GIT_LOG_HASH : v2.0-rc1-997-gb4886bebbe49 / b4886bebbe49
-- Source content hash: 1c511e77. Note: by design, source hash is broken by config changes. See thesofproject/sof#3890.

This is Pierre's,

-- GIT_TAG / GIT_LOG_HASH : v2.0-rc1-997-gb4886bebbe49 / b4886bebb
-- Source content hash: 67531a8c. Note: by design, source hash is broken by config changes. See thesofproject/sof#3890.

@plbossart
Member Author

@keqiaozhang reported a similar issue worked-around with the "scripts/xtensa-build-all.sh -d tgl" option. In my case the debug option doesn't solve anything.

@marc-hb
Contributor

marc-hb commented Jun 14, 2022

Don't know the root cause, but from the build log, Source content hash is different.

This is the very first difference and the one we must focus on first. All other differences could be impacted by this. The source hash is not affected by any of these:

So there is really, absolutely no reason for @plbossart's source hash to be different. I have the same 1c511e77 source hash as @fredoh9 with any toolchain.

@fredoh9

fredoh9 commented Jun 14, 2022

@keqiaozhang reported a similar issue worked-around with the "scripts/xtensa-build-all.sh -d tgl" option. In my case the debug option doesn't solve anything.

If -d makes a difference, this is a more serious bug.
The good thing is that both builds work for me and neither works for Pierre. At least the results are consistent.

@marc-hb
Contributor

marc-hb commented Jun 14, 2022

@plbossart can you please first make sure that git status --ignored is clean (sorry for asking this but we're grasping at straws now), then run the following commands and report which ones don't match

wc build_tgl_?cc/source_hash/*

  1408   1408  57728 build_tgl_gcc/source_hash/tracked_file_hash_list
  1408   1408  61688 build_tgl_gcc/source_hash/tracked_file_list
  2816   2816 119416 total

md5sum build_tgl_?cc/source_hash/*

0c24b8d8a641391d053cba84ec86f20f  build_tgl_gcc/source_hash/tracked_file_hash_list
72028df5e905505089a10e92e4d94781  build_tgl_gcc/source_hash/tracked_file_list

git ls-files src/ scripts/ | md5sum

72028df5e905505089a10e92e4d94781  -

git hash-object src/probe/probe.c

265c6fe9fea227847ba9094acd663e0a421f565a

@fredoh9

fredoh9 commented Jun 14, 2022

@marc-hb, I built without -d and my output is 100% identical to yours above.

@plbossart
Member Author

plbossart commented Jun 14, 2022

@marc-hb I removed all ignored files and rebuilt, same SHA1

-- GIT_TAG / GIT_LOG_HASH : v2.0-rc1-997-gb4886bebbe49 / b4886bebb
-- Source content hash: 67531a8c. Note: by design, source hash is broken by config changes. See thesofproject/sof#3890.
wc build_tgl_?cc/source_hash/*
  1408   1408  57728 build_tgl_xcc/source_hash/tracked_file_hash_list << DIFFERENT, I have xcc only?
  1408   1408  61688 build_tgl_xcc/source_hash/tracked_file_list
  2816   2816 119416 total

md5sum build_tgl_?cc/source_hash/*
0498f2ce4d4636c3b5ded04c22bb04f2  build_tgl_xcc/source_hash/tracked_file_hash_list <<< DIFFERENT, xcc only?
72028df5e905505089a10e92e4d94781  build_tgl_xcc/source_hash/tracked_file_list

git ls-files src/ scripts/ | md5sum
72028df5e905505089a10e92e4d94781  - << SAME

git hash-object src/probe/probe.c 
265c6fe9fea227847ba9094acd663e0a421f565a << SAME

@plbossart
Member Author

FWIW, I cloned a clean sof and got the same results, so this difference in source hash is not due to my local setup or to files that might have side effects.

@marc-hb
Contributor

marc-hb commented Jun 14, 2022

Fascinating, so you have the same list of files and same git commit but git hash-object is different for some source files. Let's find which files have a different hash.

First, please run git rev-parse --show-object-format. If it says sha256 then tell us and stop reading (seems unlikely considering the git version is the same)

If it says sha1 then please run paste build_tgl_?cc/source_hash/* and diff -u the output with mine (attached) 5917_source_hashes.txt

Roughly how many files have a different hash? If a small number then which ones?

I have xcc only?

None of this depends on the toolchain, it's pure source and git.
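
If diff -u is not convenient, the "which files have a different hash" question can also be answered with a short script. The sketch below assumes each line has the form hash, TAB, path (the default output of paste for the two list files); the helper names and the sample hashes and paths are placeholders invented for illustration.

```python
def parse_hash_list(text):
    """Map path -> hash for lines of the form '<hash><TAB><path>'."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        digest, path = line.split("\t", 1)
        table[path] = digest
    return table


def changed_paths(mine, theirs):
    """Paths present in both tables whose hashes differ, sorted."""
    return sorted(p for p in mine.keys() & theirs.keys() if mine[p] != theirs[p])


# Placeholder hashes and paths, for illustration only.
mine = parse_hash_list("aaaa\tsrc/a.c\nbbbb\tsrc/b.c\ncccc\tsrc/c.h\n")
theirs = parse_hash_list("aaaa\tsrc/a.c\nffff\tsrc/b.c\ncccc\tsrc/c.h\n")
print(changed_paths(mine, theirs))
```

Keying on the path rather than on line position keeps the comparison meaningful even if the two lists are ordered differently.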

@plbossart
Member Author

diff -u ~/Downloads/5917_source_hashes.txt  plb_hash.txt 
--- /home/pbossart/Downloads/5917_source_hashes.txt	2022-06-14 16:04:19.667677436 -0500
+++ plb_hash.txt	2022-06-14 16:05:22.211575742 -0500
@@ -134,7 +134,7 @@
 cc203436ad67d5fc42f6205a8815f55c7d061418	src/arch/xtensa/hal/mp_asm.S
 bacbfc6ff0c71b0aba8bb31c16ed345feefdfacc	src/arch/xtensa/hal/mpu.c
 a2a544bd354d6cf4c30c8bb326ec1173694bc39c	src/arch/xtensa/hal/mpu_asm.S
-b1b53ed4ab216f6a0c8e7c628d93de627ac370b1	src/arch/xtensa/hal/set_region_translate.c
+27ed6b80a50b1b89f7f8b7653355f25ad0cb9932	src/arch/xtensa/hal/set_region_translate.c
 316ddb4e829827a7b1637415030a1c2c37121e07	src/arch/xtensa/hal/state.c
 108986228584696b4c6a235fecd0760f8b4c2ca7	src/arch/xtensa/hal/state_asm.S
 0716ddca17ff2586d31b94c3ab1d5bca14377355	src/arch/xtensa/hal/syscache_asm.S
@@ -158,7 +158,7 @@
 44874cd946df8d92f28241df54854a73ecdec15c	src/arch/xtensa/include/arch/spinlock.h
 1172cb488b88d4966b5f8a4b57d9deb9c3bcfddc	src/arch/xtensa/include/arch/string.h
 c6b04a250575a0f9616d4c59385db05311b35163	src/arch/xtensa/include/xtensa/board.h
-4b17987ea95c462625f792507f624e241776fa0e	src/arch/xtensa/include/xtensa/c6x-compat.h
+ca91bd7183971221923b6e882796e88d0acfb3cd	src/arch/xtensa/include/xtensa/c6x-compat.h
 9cb2c8fcc6b85f6d21d78ad0755285a8fe5d27f7	src/arch/xtensa/include/xtensa/cacheasm.h
 211803aedbf39318f912fcc291efafad970f78ef	src/arch/xtensa/include/xtensa/cacheattrasm.h
 f5bb44faf2ab30bdb3de131f86cdb420b244cec6	src/arch/xtensa/include/xtensa/config/core.h

Absolutely no idea what this is.

@plbossart
Member Author

One possibility is that I don't have gcc installed for TGL. I never use GCC anyway, even for older hardware.

@plbossart
Member Author

git --version
git version 2.34.1

@marc-hb
Contributor

marc-hb commented Jun 14, 2022

Only 2 files are different, all others the same?

Can you please run these:

git hash-object src/arch/xtensa/hal/set_region_translate.c
b1b53ed4ab216f6a0c8e7c628d93de627ac370b1

md5sum src/arch/xtensa/hal/set_region_translate.c
36cae0b29a2c1b3a65f1e9dbb3bb829b  src/arch/xtensa/hal/set_region_translate.c

git cat-file -p 27ed6b80a50b1b89f7f8b7653355f25ad0cb9932 | md5sum

git cat-file -p b1b53ed4ab216f6a0c8e7c628d93de627ac370b1 | md5sum
36cae0b29a2c1b3a65f1e9dbb3bb829b  -


wget https://raw.githubusercontent.com/thesofproject/sof/b4886bebbe49454850d59f1a49a0460e590db71c/src/arch/xtensa/hal/set_region_translate.c

diff -u set_region_translate.c src/arch/xtensa/hal/set_region_translate.c 

@plbossart
Member Author

git hash-object src/arch/xtensa/hal/set_region_translate.c
27ed6b80a50b1b89f7f8b7653355f25ad0cb9932

md5sum src/arch/xtensa/hal/set_region_translate.c
36cae0b29a2c1b3a65f1e9dbb3bb829b  src/arch/xtensa/hal/set_region_translate.c

git cat-file -p 27ed6b80a50b1b89f7f8b7653355f25ad0cb9932 | md5sum
fatal: Not a valid object name 27ed6b80a50b1b89f7f8b7653355f25ad0cb9932
d41d8cd98f00b204e9800998ecf8427e  -


git cat-file -p b1b53ed4ab216f6a0c8e7c628d93de627ac370b1 | md5sum
36cae0b29a2c1b3a65f1e9dbb3bb829b  -

diff -u set_region_translate.c src/arch/xtensa/hal/set_region_translate.c << no diff
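
A side note on the output above: d41d8cd98f00b204e9800998ecf8427e is the MD5 of empty input. That line therefore only reflects that git cat-file wrote nothing to stdout after the "Not a valid object name" error; it is not the hash of any file content. A quick check in Python:

```python
import hashlib

# MD5 of zero bytes, i.e. what md5sum prints when its stdin is empty.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()
print(EMPTY_MD5)
```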

@plbossart
Member Author

looks like git hash-object provides a different value for the same file?

@plbossart
Member Author

I don't know how this would impact the signature though? The sha1 used for the signature should only work with the binary itself.

@plbossart
Member Author

Bingo!

git hash-object --no-filters src/arch/xtensa/hal/set_region_translate.c
b1b53ed4ab216f6a0c8e7c628d93de627ac370b1

@marc-hb
Contributor

marc-hb commented Jun 14, 2022

I don't know how this would impact the signature though? The sha1 used for the signature should only work with the binary itself.

Agreed, this should not affect the rest of the build. It's still a serious bug though because: 1. it breaks the logger dictionary checksum; 2. it makes troubleshooting other issues much more complicated.

@plbossart
Member Author

after breaking audio since 1997, I just started a new career with crypto. Bitcoin, here I come :-)

@marc-hb
Contributor

marc-hb commented Jun 14, 2022

OK, these two files, and only these two, have Windows line endings:

find * -exec dos2unix {} \; # DONT DO THIS AT HOME
git diff --stat
 src/arch/xtensa/hal/set_region_translate.c  | 1068 +++++++++++++--------------
 src/arch/xtensa/include/xtensa/c6x-compat.h | 3516 +++++++++++++++++++++++++++++++++++++++++++--------------------------------------------
 2 files changed, 2292 insertions(+), 2292 deletions(-)

core.autocrlf is evil, never use it. Use a decent, polyglot editor instead.

Some repos have .sh and .bat files in the same repo, how does core.autocrlf support that? It does not! Don't use it, it's evil.
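
Why would an end-of-line conversion change what git hash-object reports? Git's blob ID is the SHA-1 of a short header ("blob", the size, a NUL byte) followed by the content bytes, so hashing a file as stored (CRLF) and hashing it after a CRLF-to-LF filter give unrelated IDs. The sketch below illustrates this with made-up file content. That a core.autocrlf-style conversion is what ran on the affected machine is inferred from the comments above; no git config dump appears in this thread, so treat that part as an assumption.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """SHA-1 that git assigns to a blob holding exactly these bytes."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def crlf_to_lf(data: bytes) -> bytes:
    """Rough stand-in for an end-of-line 'clean' filter (illustrative only)."""
    return data.replace(b"\r\n", b"\n")


# Made-up content with Windows line endings.
raw = b"int x;\r\nint y;\r\n"

as_stored = git_blob_sha1(raw)                # comparable to --no-filters
as_filtered = git_blob_sha1(crlf_to_lf(raw))  # comparable to filters applied

print("raw bytes:     ", as_stored)
print("after CRLF->LF:", as_filtered)
print("same hash?     ", as_stored == as_filtered)
```

This also matches the earlier observation that md5sum on the working-tree file agreed on both machines: md5sum always reads the raw bytes, whereas git hash-object without --no-filters may transform them first.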

@aiChaoSONG

aiChaoSONG commented Jun 15, 2022

@marc-hb

There is an assertion failure in rimage.

sof_ri_info output for the sof.ri built without -d:
➜  sof git:(main) ✗ ./tools/sof_ri_info/sof_ri_info.py build_tgl_xcc/sof.ri 
SOF Binary build_tgl_xcc/sof.ri size 0x82300

  Extended Manifest ver 1.0.0 length 768

  CSE Manifest ver 0x102 checksum 0x0 partition name ADSP

    ADSP.man (CSS Manifest) type 0x4 file offset 0x35c hdr_len 900 ver 0x21000 date 2022/06/15
      Rsvd0 0x0
      Modulus size (dwords) 96
        6b 75 ed 58 20 08 85 95 ... 55 d1 7d c6 0d 79 12 a9 (Community 3k key)
      Exponent size (dwords) 1
        01 00 01 00
      Signature (file offset 0x560, length 0x180)
        bc 82 30 c5 09 45 2d 3a ... 6e 20 78 e8 7e 30 1a 5e

      Plat Fw Auth Extension type 0xf file offset 0x6e0 length 0x78
       name ADSP vcn 0x0 bitmap 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 svn 0x0

      Other Extension type 0x16 file offset 0x758 length 0x68

    cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
     ver 0x0 base offset 0x30f7628d limit offset 0xf5f479cf
      IMR type 0x3
      Attributes
        d6 6c 05 2d d1 76 5c d0 00 20 00 00 c0 3a 08 00

    cavs0015

  cavs0015 (ADSP Manifest) file offset 0x2300 name ADSPFW build ver 2.0.0.1 feature mask 0xffff image flags 0x0
    HW buffers base address 0x0 length 0x0
    Load offset 0x30000

    BRNGUP    2b79e4f3-4675-f649-89df-3bc194a91aeb
      entry point 0xb0038000 type 0x21 ( loadable LL )
      cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
      .text   0xb0038000 file offset 0x8000 flags 0x1001f ( contents alloc load readonly code type=0 pages=1 )
      .rodata 0xb0039000 file offset 0x9000 flags 0x1012f ( contents alloc load readonly data type=1 pages=1 )
      .bss    0x0 file offset 0x0 flags 0xf00 ( type=15 pages=0 )

    BASEFW    0e398c32-5ade-ba4b-93b1-c50432280ee4
      entry point 0xbe02c400 type 0x21 ( loadable LL )
      cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
      .text   0xbe02c000 file offset 0xa000 flags 0x2f001f ( contents alloc load readonly code type=0 pages=47 )
      .rodata 0xbe05b000 file offset 0x39000 flags 0x49012f ( contents alloc load readonly data type=1 pages=73 )
      .bss    0xbe0a4000 file offset 0x0 flags 0x23c0202 ( alloc type=2 pages=572 )

Memory layout undefined
Checking the rimage signing process with valgrind:
➜  sof git:(main) ✗ valgrind ./build_tgl_xcc/rimage_ep/build/rimage -o sof-tgl.ri -c rimage/config/tgl.toml -s 1344 -k keys/otc_private_key_3k.pem -i 3 -f 0.0.0 -b 0 -e build_tgl_xcc/src/arch/xtensa/bootloader-tgl build_tgl_xcc/sof
==1333350== Memcheck, a memory error detector
==1333350== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1333350== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1333350== Command: ./build_tgl_xcc/rimage_ep/build/rimage -o sof-tgl.ri -c rimage/config/tgl.toml -s 1344 -k keys/otc_private_key_3k.pem -i 3 -f 0.0.0 -b 0 -e build_tgl_xcc/src/arch/xtensa/bootloader-tgl build_tgl_xcc/sof
==1333350== 

Module Reading build_tgl_xcc/src/arch/xtensa/bootloader-tgl
info: ignore .bss section for bootloader module
  Found 18 sections, listing valid sections......
        No      LMA             VMA             End             Size    Type    Name
        0       0x00000000      0x00000000      0x00000000      0x0
        1       0xb0038000      0xb0038000      0xb00380fe      0xfe    TEXT    .boot_entry.text
        2       0xb0038120      0xb0038120      0xb0038140      0x20    TEXT    .boot_entry.literal
        3       0xb0038150      0xb0038150      0xb0038cb9      0xb69   TEXT    .text
        4       0xb0039000      0xb0039000      0xb0039010      0x10    DATA    .rodata

 module: input size 3223 (0xc97) bytes 4 sections
 module: text 3207 (0xc87) bytes
    data 16 (0x10) bytes
    bss  0 (0x0) bytes


Module Reading build_tgl_xcc/sof
  Found 43 sections, listing valid sections......
        No      LMA             VMA             End             Size    Type    Name
        2       0xbe00c000      0xbe00c000      0xbe02c000      0x20000 HEAP    .buffer_hp_heap
        3       0xbe004000      0xbe004000      0xbe006000      0x2000  HEAP    .wnd0
        4       0xbe006000      0xbe006000      0xbe008000      0x2000  HEAP    .wnd1
        5       0xbe008000      0xbe008000      0xbe00a000      0x2000  HEAP    .wnd2
        6       0xbe00a000      0xbe00a000      0xbe00c000      0x2000  HEAP    .wnd3
        7       0xbe02c000      0xbe02c000      0xbe02c16a      0x16a   TEXT    .WindowVectors.text
        8       0xbe02c180      0xbe02c180      0xbe02c186      0x6     TEXT    .Level2InterruptVector.text
        9       0xbe02c240      0xbe02c240      0xbe02c246      0x6     TEXT    .Level5InterruptVector.text
        10      0xbe02c280      0xbe02c280      0xbe02c286      0x6     TEXT    .DebugExceptionVector.text
        11      0xbe02c2c0      0xbe02c2c0      0xbe02c2c3      0x3     TEXT    .NMIExceptionVector.text
        12      0xbe02c300      0xbe02c300      0xbe02c306      0x6     TEXT    .KernelExceptionVector.text
        13      0xbe02c338      0xbe02c338      0xbe02c33c      0x4     TEXT    .UserExceptionVector.literal
        14      0xbe02c340      0xbe02c340      0xbe02c357      0x17    TEXT    .UserExceptionVector.text
        15      0xbe02c3c0      0xbe02c3c0      0xbe02c3c6      0x6     TEXT    .DoubleExceptionVector.text
        16      0xbe02c400      0xbe02c400      0xbe059efc      0x2dafc TEXT    .text
        18      0xbe059f00      0xbe800000      0xbe800120      0x120   TEXT    .AlternateResetVector.text
        19      0xbe05a020      0xbe800180      0xbe800190      0x10    TEXT    .AlternateResetL2IntVector.text
        20      0xbe05a030      0xbe800190      0xbe800270      0xe0    TEXT    .LpsramCode.text
        21      0xbe05b000      0xbe05b000      0xbe077ecc      0x1cecc DATA    .rodata
        22      0xbe077ecc      0xbe077ecc      0xbe077f08      0x3c    DATA    .module_init
        23      0xbe077f40      0xbe077f40      0xbe0a2f40      0x2b000 DATA    .shared_data
        24      0xbe0a2f40      0xbe0a2f40      0xbe0a3ef8      0xfb8   DATA    .data
        25      0xbe0a3ef8      0xbe0a3ef8      0xbe0a3f64      0x6c    DATA    .fw_ready
        26      0xbe0a3f68      0xbe0a3f68      0xbe0a3f90      0x28    DATA    .AltBootManifest
        27      0xbe0a4000      0xbe0a4000      0xbe2e0000      0x23c000        BSS     .bss

 module: input size 486924 (0x76e0c) bytes 27 sections
 module: text 188088 (0x2deb8) bytes
    data 298836 (0x48f54) bytes
    bss  2506752 (0x264000) bytes

Module Write: build_tgl_xcc/src/arch/xtensa/bootloader-tgl
 Manifest module metadata section at index 14
 Entry point 0xb0038000

        Totals  Start           End             Size
        TEXT    0xb0038000      0xb0038cb9      0xcb9
        DATA    0xb0039000      0xb0039010      0x10
        BSS     0x00000000      0x00000000      0x0

        No      Address         Size            File    Type
        1       0xb0038000      0xfe            0x8000  TEXT
        2       0xb0038120      0x20            0x8120  TEXT
        3       0xb0038150      0xb69           0x8150  TEXT
        4       0xb0039000      0x10            0x9000  DATA

 Total pages text 1 data 1 bss 0 module file limit: 0xa000

Module Write: build_tgl_xcc/sof
warning: can't find section named '.module' in module build_tgl_xcc/sof
Firmware completing manifest v2.5
 meta: completing ADSP manifest
 meta: limit is 0x1ac0
rimage: /home/chao/work/sof/rimage/src/hash.c:103: ri_sha384: Assertion `(uint64_t)size + offset <= image->adsp->image_size' failed.
==1333350== 
==1333350== Process terminating with default action of signal 6 (SIGABRT)
==1333350==    at 0x4D5CA7C: __pthread_kill_implementation (pthread_kill.c:44)
==1333350==    by 0x4D5CA7C: __pthread_kill_internal (pthread_kill.c:78)
==1333350==    by 0x4D5CA7C: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==1333350==    by 0x4D08475: raise (raise.c:26)
==1333350==    by 0x4CEE7F2: abort (abort.c:79)
==1333350==    by 0x4CEE71A: __assert_fail_base.cold (assert.c:92)
==1333350==    by 0x4CFFE95: __assert_fail (assert.c:101)
==1333350==    by 0x10CB13: ri_sha384 (hash.c:103)
==1333350==    by 0x11158A: man_write_fw_meu_v2_5 (manifest.c:1214)
==1333350==    by 0x114FB2: main (rimage.c:208)
==1333350== 
==1333350== HEAP SUMMARY:
==1333350==     in use at exit: 3,362,975 bytes in 3,035 blocks
==1333350==   total heap usage: 5,071 allocs, 2,036 frees, 3,476,232 bytes allocated
==1333350== 
==1333350== LEAK SUMMARY:
==1333350==    definitely lost: 0 bytes in 0 blocks
==1333350==    indirectly lost: 0 bytes in 0 blocks
==1333350==      possibly lost: 0 bytes in 0 blocks
==1333350==    still reachable: 3,362,975 bytes in 3,035 blocks
==1333350==         suppressed: 0 bytes in 0 blocks
==1333350== Rerun with --leak-check=full to see details of leaked memory
==1333350== 
==1333350== For lists of detected and suppressed errors, rerun with: -s
==1333350== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[1]    1333350 IOT instruction (core dumped)  valgrind ./build_tgl_xcc/rimage_ep/build/rimage -o sof-tgl.ri -c  -s 1344 -k 

@marc-hb
Contributor

marc-hb commented Jun 15, 2022

Thanks @aiChaoSONG , does this assert fail only when running with valgrind? (and only with newer OpenSSL?)

sof/rimage/src/hash.c:103: ri_sha384: Assertion `(uint64_t)size + offset <= image->adsp->image_size' failed.

==1333350==    by 0x10CB13: ri_sha384 (hash.c:103)
==1333350==    by 0x11158A: man_write_fw_meu_v2_5 (manifest.c:1214)
==1333350==    by 0x114FB2: main (rimage.c:208)

@marc-hb
Contributor

marc-hb commented Jun 15, 2022

./tools/sof_ri_info/sof_ri_info.py build_tgl_xcc/sof.ri

Below is the diff -wb -U10 between my output and @aiChaoSONG's. The signature randomness is expected; the other differences most likely are not.

I ran the same command on @fredoh9's build and only his signature differs from mine.

@plbossart can you please run this sof_ri_info.py command and compare?

If there is some memory corruption with rimage+openssl3 then all bets are off, anything can happen.

--- mine	2022-06-15 10:44:00.308980601 -0700
+++ chao	2022-06-15 10:44:16.514391243 -0700
@@ -1,47 +1,47 @@
-SOF Binary build_tgl_xcc/sof.ri size 0x81300
+SOF Binary build_tgl_xcc/sof.ri size 0x82300
 
   Extended Manifest ver 1.0.0 length 768
 
   CSE Manifest ver 0x102 checksum 0x0 partition name ADSP
 
     ADSP.man (CSS Manifest) type 0x4 file offset 0x35c hdr_len 900 ver 0x21000 date 2022/06/15
       Rsvd0 0x0
       Modulus size (dwords) 96
         6b 75 ed 58 20 08 85 95 ... 55 d1 7d c6 0d 79 12 a9 (Community 3k key)
       Exponent size (dwords) 1
         01 00 01 00
       Signature (file offset 0x560, length 0x180)
-        8c 42 36 21 c1 5c 5a e6 ... 02 83 81 7f 4c af 01 5a
+        bc 82 30 c5 09 45 2d 3a ... 6e 20 78 e8 7e 30 1a 5e
 
       Plat Fw Auth Extension type 0xf file offset 0x6e0 length 0x78
        name ADSP vcn 0x0 bitmap 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 svn 0x0
 
       Other Extension type 0x16 file offset 0x758 length 0x68
 
     cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
-     ver 0x0 base offset 0xfe15179d limit offset 0x5b667f21
+     ver 0x0 base offset 0x30f7628d limit offset 0xf5f479cf
       IMR type 0x3
       Attributes
-        9c 06 84 92 54 50 c5 49 00 20 00 00 c0 2a 08 00
+        d6 6c 05 2d d1 76 5c d0 00 20 00 00 c0 3a 08 00
 
     cavs0015
 
   cavs0015 (ADSP Manifest) file offset 0x2300 name ADSPFW build ver 2.0.0.1 feature mask 0xffff image flags 0x0
     HW buffers base address 0x0 length 0x0
     Load offset 0x30000
 
     BRNGUP    2b79e4f3-4675-f649-89df-3bc194a91aeb
       entry point 0xb0038000 type 0x21 ( loadable LL )
       cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
       .text   0xb0038000 file offset 0x8000 flags 0x1001f ( contents alloc load readonly code type=0 pages=1 )
       .rodata 0xb0039000 file offset 0x9000 flags 0x1012f ( contents alloc load readonly data type=1 pages=1 )
       .bss    0x0 file offset 0x0 flags 0xf00 ( type=15 pages=0 )
 
     BASEFW    0e398c32-5ade-ba4b-93b1-c50432280ee4
       entry point 0xbe02c400 type 0x21 ( loadable LL )
       cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
-      .text   0xbe02c000 file offset 0xa000 flags 0x2e001f ( contents alloc load readonly code type=0 pages=46 )
-      .rodata 0xbe05a000 file offset 0x38000 flags 0x49012f ( contents alloc load readonly data type=1 pages=73 )
-      .bss    0xbe0a3000 file offset 0x0 flags 0x23d0202 ( alloc type=2 pages=573 )
+      .text   0xbe02c000 file offset 0xa000 flags 0x2f001f ( contents alloc load readonly code type=0 pages=47 )
+      .rodata 0xbe05b000 file offset 0x39000 flags 0x49012f ( contents alloc load readonly data type=1 pages=73 )
+      .bss    0xbe0a4000 file offset 0x0 flags 0x23c0202 ( alloc type=2 pages=572 )
 
 Memory layout undefined
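
The non-signature differences in this diff are consistent with a one-page shift, which can be checked with a little arithmetic. In the sketch below, the bit layout of the flags word (type in bits 8 to 11, page count from bit 16 upward) and the 0x1000 page size are inferred from the numbers printed above, not taken from the rimage or sof_ri_info.py sources, so both are assumptions.

```python
PAGE_SIZE = 0x1000  # assumption, inferred from the section addresses above


def decode_flags(flags):
    """Split a segment flags word into (type, pages); layout is inferred."""
    return (flags >> 8) & 0xF, flags >> 16


text_base = 0xBE02C000  # BASEFW .text address, identical in both outputs

# (label, .text flags, .rodata flags, image size) copied from the diff above.
for label, text_flags, rodata_flags, size in (
    ("mine", 0x2E001F, 0x49012F, 0x81300),
    ("chao", 0x2F001F, 0x49012F, 0x82300),
):
    _, text_pages = decode_flags(text_flags)
    _, rodata_pages = decode_flags(rodata_flags)
    rodata_base = text_base + text_pages * PAGE_SIZE
    bss_base = rodata_base + rodata_pages * PAGE_SIZE
    print(f"{label}: .rodata 0x{rodata_base:x} .bss 0x{bss_base:x} size 0x{size:x}")
```

Under these assumptions, one extra .text page (47 versus 46) accounts for .rodata and .bss moving up by 0x1000 and for the file growing by the same amount (0x82300 versus 0x81300). That would suggest the two builds differ in code size, not only in signing.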

@marc-hb
Contributor

marc-hb commented Jun 15, 2022

@plbossart please also try this

--- a/src/arch/xtensa/CMakeLists.txt
+++ b/src/arch/xtensa/CMakeLists.txt
@@ -127,7 +127,7 @@ separate_arguments(EXTRA_CFLAGS_AS_LIST  NATIVE_COMMAND  ${EXTRA_CFLAGS})
 # de-duplication "feature"
 target_compile_options(sof_options INTERFACE
        $<$<COMPILE_LANGUAGE:C>:
-               -${optimization_flag} -g
+               -${optimization_flag} -g0
                -Wall -Werror
                -Wl,-EL
                -Wmissing-prototypes
@@ -449,11 +449,11 @@ if(MEU_PATH OR DEFINED MEU_NO_SIGN) # Don't sign with rimage
 
        # Passing -s ${MEU_OFFSET} disables rimage signing and produces
        # one .uns file and one .met file instead of a .ri file.
        add_custom_target(
                run_rimage
-               COMMAND ${PROJECT_BINARY_DIR}/rimage_ep/build/rimage
+               COMMAND valgrind ${PROJECT_BINARY_DIR}/rimage_ep/build/rimage
                        -o sof-${fw_name}.ri
                        -c "${PROJECT_SOURCE_DIR}/rimage/config/${fw_name}.toml"
                        -s ${MEU_OFFSET}
                        -k ${RIMAGE_PRIVATE_KEY}
                        -i ${RIMAGE_IMR_TYPE}
@@ -491,11 +491,11 @@
                )
        endif()
 else() # sign with rimage
        add_custom_target(
                run_rimage
-               COMMAND ${PROJECT_BINARY_DIR}/rimage_ep/build/rimage
+               COMMAND valgrind ${PROJECT_BINARY_DIR}/rimage_ep/build/rimage
                        -o sof-${fw_name}.ri
                        -c "${PROJECT_SOURCE_DIR}/rimage/config/${fw_name}.toml"
                        -k ${RIMAGE_PRIVATE_KEY}
                        -i ${RIMAGE_IMR_TYPE}
                        -f ${SOF_MAJOR}.${SOF_MINOR}.${SOF_MICRO}

EDIT: not using valgrind is a waste of time.

  • When there is no memory corruption, valgrind makes no functional difference; it just makes the build a bit slower.
  • When there is memory corruption (frequent with rimage, check the git log), using valgrind makes a HUGE difference: it makes the build much more deterministic.

Bonus feature: adding valgrind is the easiest way to print the rimage command line whereas a verbose build is crazy verbose.

@juimonen

@aiChaoSONG your last parameter to rimage should be:
build_tgl_xcc/src/arch/xtensa/build_tgl_xcc/sof-tgl

It is currently wrong, so rimage bails out even before signing.

With the last parameter fixed I can sign with openssl3 and valgrind looks clean (I think @marc-hb also found that).

So we really need to look at the signing differences in the image.

@plbossart
Member Author

plbossart commented Jun 15, 2022

@plbossart can you please run this sof_ri_info.py command and compare?

@marc-hb I am afraid I have another signature, not the same as @aiChaoSONG

sof_ri_info
SOF Binary build_tgl_xcc/sof.ri size 0x82300

  Extended Manifest ver 1.0.0 length 768

  CSE Manifest ver 0x102 checksum 0x0 partition name ADSP

    ADSP.man (CSS Manifest) type 0x4 file offset 0x35c hdr_len 900 ver 0x21000 date 2022/06/15
      Rsvd0 0x0
      Modulus size (dwords) 96
        6b 75 ed 58 20 08 85 95 ... 55 d1 7d c6 0d 79 12 a9 (Community 3k key)
      Exponent size (dwords) 1
        01 00 01 00
      Signature (file offset 0x560, length 0x180)
        bc 18 72 5e 87 53 70 06 ... 2c 6c 20 cd c9 43 ab 72

      Plat Fw Auth Extension type 0xf file offset 0x6e0 length 0x78
       name ADSP vcn 0x0 bitmap 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 svn 0x0

      Other Extension type 0x16 file offset 0x758 length 0x68

    cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
     ver 0x0 base offset 0x97e32029 limit offset 0xc60fedb4
      IMR type 0x3
      Attributes
        5a 9c b4 4a 6b e7 62 03 00 20 00 00 c0 3a 08 00

    cavs0015

  cavs0015 (ADSP Manifest) file offset 0x2300 name ADSPFW build ver 2.0.0.1 feature mask 0xffff image flags 0x0
    HW buffers base address 0x0 length 0x0
    Load offset 0x30000

    BRNGUP    2b79e4f3-4675-f649-89df-3bc194a91aeb
      entry point 0xb0038000 type 0x21 ( loadable LL )
      cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
      .text   0xb0038000 file offset 0x8000 flags 0x1001f ( contents alloc load readonly code type=0 pages=1 )
      .rodata 0xb0039000 file offset 0x9000 flags 0x1012f ( contents alloc load readonly data type=1 pages=1 )
      .bss    0x0 file offset 0x0 flags 0xf00 ( type=15 pages=0 )

    BASEFW    0e398c32-5ade-ba4b-93b1-c50432280ee4
      entry point 0xbe02c400 type 0x21 ( loadable LL )
      cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
      .text   0xbe02c000 file offset 0xa000 flags 0x2f001f ( contents alloc load readonly code type=0 pages=47 )
      .rodata 0xbe05b000 file offset 0x39000 flags 0x49012f ( contents alloc load readonly data type=1 pages=73 )
      .bss    0xbe0a4000 file offset 0x0 flags 0x23c0202 ( alloc type=2 pages=572 )

plb.log

--- chao.log	2022-06-15 13:51:02.061698055 -0500
+++ plb.log	2022-06-15 13:51:41.166699970 -0500
@@ -4,32 +4,32 @@
 
   CSE Manifest ver 0x102 checksum 0x0 partition name ADSP
 
     ADSP.man (CSS Manifest) type 0x4 file offset 0x35c hdr_len 900 ver 0x21000 date 2022/06/15
       Rsvd0 0x0
       Modulus size (dwords) 96
         6b 75 ed 58 20 08 85 95 ... 55 d1 7d c6 0d 79 12 a9 (Community 3k key)
       Exponent size (dwords) 1
         01 00 01 00
       Signature (file offset 0x560, length 0x180)
-        bc 82 30 c5 09 45 2d 3a ... 6e 20 78 e8 7e 30 1a 5e
+        bc 18 72 5e 87 53 70 06 ... 2c 6c 20 cd c9 43 ab 72
 
       Plat Fw Auth Extension type 0xf file offset 0x6e0 length 0x78
        name ADSP vcn 0x0 bitmap 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 svn 0x0
 
       Other Extension type 0x16 file offset 0x758 length 0x68
 
     cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
-     ver 0x0 base offset 0x30f7628d limit offset 0xf5f479cf
+     ver 0x0 base offset 0x97e32029 limit offset 0xc60fedb4
       IMR type 0x3
       Attributes
-        d6 6c 05 2d d1 76 5c d0 00 20 00 00 c0 3a 08 00
+        5a 9c b4 4a 6b e7 62 03 00 20 00 00 c0 3a 08 00
 
     cavs0015
 
   cavs0015 (ADSP Manifest) file offset 0x2300 name ADSPFW build ver 2.0.0.1 feature mask 0xffff image flags 0x0
     HW buffers base address 0x0 length 0x0
     Load offset 0x30000
 
     BRNGUP    2b79e4f3-4675-f649-89df-3bc194a91aeb
       entry point 0xb0038000 type 0x21 ( loadable LL )
       cfg offset 0 count 0 affinity 0x3 instance max count 1 stack size 0x1
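
The comparison above boils down to a text diff of two `sof_ri_info.py` dumps. A minimal sketch of that step, assuming both dumps have already been saved as text; the helper name `changed_lines` is hypothetical and not part of the SOF tooling:

```python
import difflib

def changed_lines(old_text, new_text, old_name="chao.log", new_name="plb.log"):
    """Return only the added/removed lines between two sof_ri_info.py dumps."""
    diff = list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile=old_name, tofile=new_name, lineterm="", n=0))
    # The first two entries are the '---' / '+++' file headers; skip them,
    # then keep only real additions and removals (hunk headers start with '@').
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

Feeding it the two logs quoted here would be expected to list the signature, the `cavs0015.met` offsets and the first eight attribute bytes, and nothing in the module sections.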

@plbossart
Member Author

@marc-hb No real issue detected with valgrind + -g0, and no luck - same boot failure.

log.txt

@plbossart
Member Author

plbossart commented Jun 15, 2022

FWIW on my device I have this:

openssl version
OpenSSL 3.0.2 15 Mar 2022 (Library: OpenSSL 3.0.2 15 Mar 2022)

Edit:
ldd build_tgl_xcc/rimage_ep/build/rimage
linux-vdso.so.1 (0x00007ffc7ddf5000)
libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007fbe20223000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbe1fffb000)
/lib64/ld-linux-x86-64.so.2 (0x00007fbe206a0000)
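
The `ldd` check above can be scripted when several build machines need to be compared. A rough sketch, assuming a Linux host with `ldd` on the PATH; the function names are hypothetical, and the version string returned is the shared-library soname suffix (`3` for OpenSSL 3.x, `1.1` for OpenSSL 1.1.x), not the full OpenSSL release:

```python
import re
import subprocess

def ldd_output(binary_path):
    """Run ldd on a binary and return its text output."""
    result = subprocess.run(["ldd", binary_path],
                            capture_output=True, text=True, check=True)
    return result.stdout

def libcrypto_soname_version(ldd_text):
    """Return the libcrypto soname version found in ldd output, or None."""
    match = re.search(r"libcrypto\.so\.(\d+(?:\.\d+)*)", ldd_text)
    return match.group(1) if match else None
```

For example, `libcrypto_soname_version(ldd_output("build_tgl_xcc/rimage_ep/build/rimage"))` would report which libcrypto the locally built rimage is linked against.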

@marc-hb
Contributor

marc-hb commented Jun 15, 2022

Thanks, can you make sure you have the same reproducible.ri sha256sum as everyone else? (after disabling core.autocrlf). Copying them from my comment 13 hours ago:

XCC xtensa-build-all -d tgl

sha256sum reproducible.ri
25f6c86d52dff278c60faf378639277a2d7e22ebb822d042bbd3ae428b4656bc reproducible.ri

without -d

sha256sum reproducible.ri
df01b254e818a7fed8c5c39b1cf37a6e042c09795dcd3d38e39bfc6728623a26 reproducible.ri

If they're the same then we are 100% sure this is a pure signing or manifest issue.
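
The same check can be automated. A minimal sketch, assuming `reproducible.ri` is reachable from the current directory; the helper names are hypothetical, and the two digests are the ones quoted above for XCC builds:

```python
import hashlib

# Digests quoted in this thread for XCC builds of tgl.
EXPECTED = {
    "debug (-d)": "25f6c86d52dff278c60faf378639277a2d7e22ebb822d042bbd3ae428b4656bc",
    "release": "df01b254e818a7fed8c5c39b1cf37a6e042c09795dcd3d38e39bfc6728623a26",
}

def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matching_build(path):
    """Return the name of the build whose digest matches the file, or None."""
    actual = sha256_of(path)
    for name, expected in EXPECTED.items():
        if actual == expected:
            return name
    return None
```

If `matching_build("reproducible.ri")` returns `None`, the unsigned image itself differs and the problem is not limited to signing.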

@marc-hb
Contributor

marc-hb commented Jun 15, 2022

As a workaround please try the Docker build (without XCC)

docker pull thesofproject/sof # get coffee
./scripts/docker-run.sh openssl version
 # OpenSSL 1.1.1f  31 Mar 2020

./scripts/docker-run.sh ./scripts/xtensa-build-all.sh tgl
 

@plbossart
Member Author

yep, same reproducible -> pure signature issue.

with -d
sha256sum reproducible.ri
25f6c86d52dff278c60faf378639277a2d7e22ebb822d042bbd3ae428b4656bc reproducible.ri

without -d
sha256sum reproducible.ri
df01b254e818a7fed8c5c39b1cf37a6e042c09795dcd3d38e39bfc6728623a26 reproducible.ri

@plbossart
Member Author

plbossart commented Jun 15, 2022

> docker pull thesofproject/sof # get coffee

instant coffee with Google Fiber :-)

> ./scripts/docker-run.sh openssl version
>  # OpenSSL 1.1.1f  31 Mar 2020
> 
> ./scripts/docker-run.sh ./scripts/xtensa-build-all.sh tgl

works fine, the firmware boots on Up Extreme11.

@aiChaoSONG

> does this assert fail only when running with valgrind

@marc-hb I think I signed the wrong binary (build_tgl_xcc/sof), which is why I hit the assertion failure. If I use the correct binary (build_tgl_xcc/src/arch/xtensa/sof-tgl), there is no failure at all. Thanks Jaska for correcting me.

@juimonen

@aiChaoSONG & @plbossart can you try: #97

@aiChaoSONG

aiChaoSONG commented Jun 16, 2022

Manually tried #97: the firmware boots on UpExtreme. Filed a SOF test PR to test the rimage PR on the other platforms (except ACE): thesofproject/sof#5928

@plbossart
Member Author

Confirmed fix, works for me as well. Thanks @juimonen

@marc-hb
Contributor

marc-hb commented Jun 16, 2022

@plbossart, @aiChaoSONG and anyone with openssl3, could you run sof_ri_info.py again with the tentative rimage fix and compare the output with openssl1 again?

For openssl1 you can either download it from here or use the Docker container.

lgirdwood referenced this issue in thesofproject/sof Jun 16, 2022
Local filters in ~/.gitconfig, such as

[core]
	autocrlf = input

can impact the result of git hash-object. Make sure no filters are
used so that the hash value remains unmodified across user setups.

BugLink: https://github.com/thesofproject/sof/issues/5917
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
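
To illustrate why such filters matter: a git blob hash is the SHA-1 of a short header followed by the file content, so any line-ending conversion applied before hashing changes the result. A minimal sketch of that computation, not the code used by the SOF build; the function name is hypothetical:

```python
import hashlib

def git_blob_sha1(data: bytes) -> str:
    """Return the hash git assigns to a blob with exactly these bytes."""
    # Git hashes the header "blob <size>\0" followed by the raw content.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()

# The same text with LF and with CRLF line endings yields different hashes,
# which is the kind of per-user variation the commit above wants to avoid.
lf_hash = git_blob_sha1(b"hello\n")
crlf_hash = git_blob_sha1(b"hello\r\n")
```

Here `lf_hash` matches what `git hash-object` reports for a file containing `hello` and a newline when no conversion takes place.
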
@aiChaoSONG

@marc-hb marc-hb changed the title [BUG] Signature issue on TGL UP Extreme i11 [BUG] Signature issue on TGL UP Extreme i11 - openssl3 Jun 17, 2022
@marc-hb
Contributor

marc-hb commented Jun 17, 2022

The difference between

  • Chao's gcc+ssl1 versus Chao's gcc+ssl3+rimage fix
    ... is not bigger than the difference between:
  • Chao's gcc+ssl1 versus my gcc+ssl1

So maybe everything is OK? I wish I knew or remembered why there is so much cavs0015.met (ADSP Metadata File Extension) difference but life is too short... @juimonen , @lgirdwood , any idea?

--- chao gcc_ri_info_with_ssl1.txt	2022-06-17 17:51:58.976044719 +0000
+++ chao gcc_ri_info_with_ssl3_and_rimage_fix.txt	2022-06-17 17:51:58.976044719 +0000
@@ -19,10 +19,10 @@
       Other Extension type 0x16 file offset 0x758 length 0x68
 
     cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
-     ver 0x0 base offset 0x261b8ed7 limit offset 0xf6ab7adf
+     ver 0x0 base offset 0xedd9b8ac limit offset 0x38c0e79e
       IMR type 0x3
       Attributes
-        da e4 ae 75 c7 c7 10 be 00 20 00 00 c0 ca 06 00
+        fa db 8c d0 e8 bd 2d c3 00 20 00 00 c0 ca 06 00
 
     cavs0015
--- chao gcc_ri_info_with_ssl1.txt
+++ marc_gcc_ri_info_with_ssl1.txt
@@ -19,10 +19,10 @@
       Other Extension type 0x16 file offset 0x758 length 0x68
 
     cavs0015.met (ADSP Metadata File Extension) type 0x11 file offset 0x7c0 length 0x70
-     ver 0x0 base offset 0x261b8ed7 limit offset 0xf6ab7adf
+     ver 0x0 base offset 0x8e159352 limit offset 0xcbaeed49
       IMR type 0x3
       Attributes
-        da e4 ae 75 c7 c7 10 be 00 20 00 00 c0 ca 06 00
+        35 d2 98 c9 6a 0a c3 18 00 20 00 00 c0 ca 06 00
 
     cavs0015
 

@lgirdwood
Member

@plbossart rimage fix now merged. Can you try and we can close if good ?

@marc-hb marc-hb mentioned this issue Jun 22, 2022
@marc-hb marc-hb transferred this issue from thesofproject/sof Jun 22, 2022
@aiChaoSONG

#97 fix confirmed

@marc-hb
Contributor

marc-hb commented Jan 11, 2023

> Can we add --no-filters to avoid such user-level variations?

I'm reverting the --no-filters in thesofproject/sof/pull/6920 because it's not compatible with Windows. Long story in thesofproject/sof/pull/6920
