[BUG] rimage + openssl3 is broken in the stable-v2.2 and cavs25 branches, always has been #9340
Comments
So now we have (at least) two obvious options:
I could be wrong but I don't think there would be much value cherry-picking rimage commits AFTER the transfer to sof.git back into the stable-v2.2 branch.
Fixes thesofproject#9340

This adds the following commits from the main rimage branch:

```
commit 02abc5d ("rimage: ace signing functions need openssl 3.0 guards")
commit 73a9d7c ("rimage: fix openssl 3.0 support in ver25 signing")
commit 8ba3d17 ("adsp_config: fix name parsing error in parse_signed_pkg_ace_v1_5")
commit fe4dcaa ("mtl: Add ASRC module to the manifest")
commit af947cb ("config: Add toml config for mtl")
commit c484d99 ("rimage: add ACE V1.5 handling")
commit 1b233f6 ("config: add rmb toml file to support rembrandt build")
```

This is the smallest main branch fast-forward that includes the critical openssl3 fix commit 73a9d7c ("rimage: fix openssl 3.0 support in ver25 signing") _and_ compiles.

This happens to align the rimage version in stable-v2.2 to the version in stable-v2.3. stable-v2.3 is not in use anymore but it was routinely tested in CI for a long time. In fact this stable-v2.2 commit is the same as stable-v2.3 commit 4e1d3ba ("rimage: update to version 02abc5d")

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
rimage is pretty good at being backwards compatible, right? How much risk is there of the fast forward breaking something?
Probably good (compatibility) but we have no evidence of that. There have been some big code changes later and they were just too big for me to audit. On the other hand, I spotted no later fix as important as this one. If anyone volunteers then please feel free to submit a bigger update after this one. The lack of branching and cherry-picks now will make a future upgrade easy from a git perspective.
Sorry I assumed you meant "fast forward all the way" and answered accordingly. Now I see this may or may not be what you meant. #9342 is the smallest (and small) fast forward with the fix.
@marc-hb we can close this now, right?
Re-opening for cavs2.5-001-drop-stable, see random firmware boot failures in https://sof-ci.01.org/sofpr/PR9427/build7557/devicetest/index.html

Still P1 because firmware in cavs2.5 does not... boot. Exactly like what happened in #9336, a random, unrelated topology change is causing a "butterfly effect" in the firmware image and it stops booting.

I thought I could very quickly cherry-pick the same rimage upgrade as I did for stable-v2.2 (in #9342), which has been working well. But that rimage submodule cherry-pick has a conflict because @abonislawski pushed a cavs25-only, "file error handling" rimage fix in thesofproject/rimage@d287016. I don't know why this file error handling fix was only on cavs25 and without any PR or code review. It looks like yet another static analysis "quick fix".

At this stage the best option is probably to create a new branch https://github.com/thesofproject/rimage/tree/cavs2.5-001-drop-stable-2 that points at the rimage commits used in stable-v2.2 + maybe a cherry-pick -x of "file error handling" d287016caac?

EDIT: or much simpler: forget about "rogue", unreviewed and pointless commit d287016caac and just align cavs2.5 with stable-v2.2
Unfortunately this "rogue", unreviewed and pointless commit d287016caac is required for SOF releases from the cavs2.5 branch. I didn't see any conflicts and just added the openssl3 fix to the cavs2.5 branch, no issues in static analysis scans. https://github.com/thesofproject/sof/tree/cavs2.5-001-drop-stable/ @kv2019i not sure who is able to verify it now?
I tested both stable-v2.2 and cavs2.5-001-drop-stable today and both work after the backport. Closing.
Describe the bug
rimage + openssl3 is broken in the stable-v2.2 and cavs-drop-stable branches, always has been.
This was fixed 2 years ago by rimage commit thesofproject/rimage@73a9d7c
However that critical rimage fix never made it to https://github.com/thesofproject/sof/commits/stable-v2.2/rimage. stable-v2.2 fell 6 commits short.
That's because we tested https://github.com/thesofproject/sof/commits/stable-v2.3/rimage for a long time (which does have the openssl3 fix) and then decided to go backwards to stable-v2.2. I don't remember why.
In thesofproject/rimage#97 (review) @lgirdwood wrote "We need to take this for v2.2, ..." but that never happened.
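One way to check whether a given checkout already contains the fix is to ask git whether 73a9d7c is an ancestor of the rimage submodule's HEAD. This is only a sketch: the `has_commit` helper name is made up for illustration, and the `rimage` submodule path is assumed from the stable-v2.2 URL above.

```shell
# Hypothetical helper: succeeds if commit $2 is an ancestor of HEAD
# in the git repository located at $1.
has_commit() {
    git -C "$1" merge-base --is-ancestor "$2" HEAD
}

# Example: run from the top of a sof.git checkout.
# "rimage" is the assumed submodule path.
if [ -d rimage ]; then
    if has_commit rimage 73a9d7c; then
        echo "rimage includes the openssl3 fix"
    else
        echo "rimage is missing the openssl3 fix (or the commit is not fetched)"
    fi
fi
```

`git merge-base --is-ancestor` also fails when the commit is unknown to the local clone, so a negative answer may just mean the submodule needs a `git fetch` first.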
The original investigation happened in 2022 in https://github.com/thesofproject/sof/issues/5917 . I didn't re-open that bug and opened a new one instead because that investigation took an incredibly long time with many red herrings so the original bug is very long and very hard to follow. One of the reasons it took so long: some people were using openssl1 (which worked) and other people were using openssl3 (which failed) but it took a long time to notice that difference.
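Since the openssl1 versus openssl3 difference took so long to notice, a quick version check when comparing results between machines may help. The sketch below uses a made-up helper name (`openssl_major`) and only looks at the `openssl` command-line tool; rimage links against whatever OpenSSL library is found at build time, which may not be the same one.

```shell
# Hypothetical helper: extract the major version number from a line
# shaped like the output of `openssl version`, e.g. "OpenSSL 3.0.2 15 Mar 2022".
openssl_major() {
    printf '%s\n' "$1" | awk '{ split($2, v, "."); print v[1] }'
}

# Report the major version of the openssl CLI, if one is installed.
if command -v openssl >/dev/null 2>&1; then
    echo "OpenSSL major version: $(openssl_major "$(openssl version)")"
fi
```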
To Reproduce
This affects only some SOF commits and not others and I have no idea why. Any OpenSSL expert around?
This was found in totally unrelated PR #9336. When I fetch 4ed65ed which is the latest pull/9336/merge commit for that PR and run git submodule update --recursive, then the firmware never boots on my TGL Xtreme. After I cherry-pick openssl3 fix thesofproject/rimage@73a9d7c, the firmware boots again every time.
(the git bisect wasn't straightforward for unrelated reasons but it worked; you have been warned)
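The reproduction steps can be sketched roughly as follows. Assumptions, none of which come from the report itself: the current directory is a sof.git clone whose `origin` remote is github.com/thesofproject/sof, the rimage submodule is checked out at `./rimage`, and commit 73a9d7c is already present in that submodule clone. The `run` wrapper is a hypothetical dry-run helper so that nothing is changed unless `RUN=1` is set. The build and boot steps are left out because the report does not describe them.

```shell
# Hypothetical dry-run wrapper: prints each command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Fetch the PR merge commit and sync submodules.
# The pull/9336/merge ref may have moved since the report, hence the explicit SHA.
run git fetch origin pull/9336/merge
run git checkout 4ed65ed
run git submodule update --recursive
# Build and boot here: per the report, the firmware never boots at this point.

# Apply the openssl3 fix inside the rimage submodule (assumed path: ./rimage).
run git -C rimage cherry-pick 73a9d7c
# Rebuild and boot again: per the report, the firmware now boots every time.
```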
Reproduction Rate
100% for some SOF commits but 0% for other SOF commits like the current tag v2.2.10. No idea why some commits boot while others do not.
The toolchain can also affect reproduction, see comments in unrelated #9336. I guess anything that changes the binary.
Expected behavior
The firmware boots.
Impact
Show stopper.
Screenshots or console output