Verify builds are reproducible in the CI #50205

keith-zephyr · 2022-09-13T23:03:52Z

Introduction

Zephyr builds should be reproducible. A checkout of Zephyr from the same commit, built with the same toolchain, should generate an identical image binary.

Problem description

This has been proposed before (#11523 and #14593), but the Zephyr tree currently has no tests that verify builds are reproducible.

Furthermore, reproducible builds were broken for an unknown length of time before being fixed by #48195.

Proposed change

Add a new GitHub workflow that verifies builds are reproducible. This workflow will be run on every PR.

The workflow can follow the blueprint of the Footprint Delta workflow. The new workflow would build a set of platforms (TBD) twice, back to back, and verify that the resulting binaries are identical.

Note that the build command `west build -b native_posix tests/drivers/build_all/sensor` has been known to catch problems with devicetree generation that result in non-reproducible builds.
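
As a rough illustration of the back-to-back check, the comparison step could look like the Python sketch below. It assumes the two builds already exist in separate build directories. The helper names `sha256_of` and `images_identical`, the directory names, and the choice of which output file to compare are all hypothetical; none of them are part of Zephyr or of an existing workflow.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_identical(first: Path, second: Path) -> bool:
    """True when both build outputs are byte-for-byte identical."""
    return sha256_of(first) == sha256_of(second)


# Hypothetical usage, after building the same commit twice into
# two separate directories (here called build_a and build_b):
#   images_identical(Path("build_a/zephyr/<image>"),
#                    Path("build_b/zephyr/<image>"))
```

A CI job built on this idea would fail the check whenever `images_identical` returns `False` and would print both digests to the log so that the mismatch can be investigated later.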

Dependencies

The new GitHub workflow will block new PRs if the reproducible build test fails.

Concerns and Unresolved Questions

Running this check against every PR will incur additional computing time and resources.

Alternatives

Run the reproducible build check less frequently, such as nightly. However, this will require a significant bisect effort to identify the culprit PR when any failures are detected. The incremental cost of some additional builds on each PR seems worth the trouble.

stephanosio · 2022-09-14T07:32:37Z

cc @marc-hb

stephanosio · 2022-09-14T07:40:13Z

> This workflow will be run on every PR.

I do not expect this to be something that breaks often. Bi-weekly build should be fine.

gmarull · 2022-09-14T13:44:20Z

We have many non-locked Python dependencies that are used at some point during the build process; they should be taken into account.

marc-hb · 2022-09-14T16:10:26Z

Here's a list of 20+ old reproducibility fixes:

meta issue for reproducible builds (was: tests/determinism.sh) #14593

This should show what the most common problems are.

In the same place there's an (obsolete) test script. The approach was crude but very effective:

build once
change as many things as possible
build again
diffoscope
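
The last two steps can be narrowed down cheaply before reaching for diffoscope. The sketch below only illustrates the idea and is not the obsolete script mentioned above: it hashes every file in two build directories and lists the relative paths that differ or exist on only one side, so that diffoscope only has to be run on those. The function name `differing_files` and the directory layout are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def _digests(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def differing_files(build_a: Path, build_b: Path) -> List[str]:
    """Relative paths whose contents differ or that exist in only one tree."""
    a, b = _digests(build_a), _digests(build_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result means the two trees are byte-for-byte identical. Otherwise each listed path is a candidate to inspect with diffoscope.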

marc-hb · 2022-09-14T17:22:05Z

> I do not expect this to be something that breaks often.

Agreed. Reproducibility testing and fixing is rare, but reproducibility regressions are very rare too.

> Bi-weekly build should be fine.

On the other hand, IF it's cheap and quick to run then why not run it on every PR?

keith-zephyr · 2022-09-16T15:03:29Z

Because of the amount of generated code, I'm in favor of checking on every PR. Maybe the GitHub workflow can be set up to run on any changes to the ./scripts directory, and also set up as a weekly run to catch problems with the actual source code.

Temporary bugs, corner cases and obsolete toolchains aside, the Zephyr build is most of the time reproducible: zephyrproject-rtos#50205 and zephyrproject-rtos#14593. This means two different build machines using the same toolchain will always produce the same binary output.

The one-line addition in this commit makes it trivial to verify that binary outputs are indeed the same by adding a single checksum line in the build logs:

```
[16/16] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
             RAM:       53280 B         3 MB      1.69%
        IDT_LIST:          0 GB         2 KB      0.00%
fdd2ddf2ad7d5da5bbd79b41cef...7b16ef549a8281111d8e205  zephyr.strip
```

This commit makes a non-measurable build time difference.

Build reproducibility matters for (at least) two important reasons:

- Security / supply chain attacks, see https://www.cisa.gov/sbom, zephyrproject-rtos#50205, https://reproducible-builds.org/ and many others.
- Making sure build configurations are strictly identical when trying to reproduce elusive issues or when issuing releases.

Displaying a reproducible checksum accelerates the investigation of temporary reproducibility issues like zephyrproject-rtos#48195.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>

Temporary bugs, corner cases and obsolete toolchains aside, the Zephyr build is reproducible most of the time: zephyrproject-rtos#50205 and zephyrproject-rtos#14593. This means two different build machines using the same toolchain will always produce the same binary output.

The previous, one-line commit made it trivial to verify that binary outputs are indeed the same by adding this single line in the build logs:

```
[16/16] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
             RAM:       53280 B         3 MB      1.69%
        IDT_LIST:          0 GB         2 KB      0.00%
fdd2ddf2ad7d5da5bbd79b41cef8d7...1a896b989a8281111d8e205  zephyr.strip
```

This commit enables that feature by default because build reproducibility matters for (at least) two important reasons:

- Security / supply chain attacks, see https://www.cisa.gov/sbom, zephyrproject-rtos#50205, https://reproducible-builds.org/ and many others.
- Making sure build configurations are strictly identical when trying to reproduce elusive issues or when issuing releases.

It was of course already possible to _manually_ make this Kconfig change and manually compute this checksum. However, this can be impossible when dealing with an automated build system that does not archive all _intermediate_ (zephyrproject-rtos#5009) files like `zephyr.elf`. Tweaking the build configuration can also be difficult and error-prone for people who are not Zephyr developers. Most automated CI systems preserve build logs by default. Displaying the reproducible checksum by default accelerates the discovery of reproducibility bugs like zephyrproject-rtos#48195.

When measured with `west build -p -b qemu_x86 samples/hello_world/`, the additional `build/zephyr/zephyr.strip` disk space required is 43 kilobytes compared to a total of 11 Megabytes. Measuring a more realistic SOF example, `zephyr.strip` weighed 690 kB, which was about 0.1% of a total `build/` directory weighing 65M.

To measure the build time cost I ran `west build -p -b qemu_x86 samples/hello_world/` many times in a loop with and without this PR on my Linux workstation. Stripping and checksumming made literally no time difference compared to the "noise" observed when building the same configuration. This is not surprising considering how small `zephyr.strip` is: the extra cost is most likely dominated by process creation, and the total number of processes created during a Zephyr build dwarfs the few extra processes required by this feature. More surprisingly, I measured incremental builds by running `touch kernel/timer.c; west build ...` in a loop and I could not observe any visible time difference either.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>

marc-hb · 2022-11-04T03:19:39Z

These 2 additional lines are IMHO a big step forward, please help review:

cmake: compute and display the reproducible checksum by default #51954

marc-hb · 2023-04-11T04:39:52Z

GitHub Actions for the Zephyr+SOF project have been routinely and successfully comparing binaries built on Linux versus Windows in every PR for a few months now.

To achieve this I overrode the default config change in #51954 in an SOF-specific way: thesofproject/sof@945adb8d1660ed4

Building across two different operating systems provides a lot of differences "for free" that can be very difficult to achieve on the same operating system (see old #14593 attempt). Kudos to @aborisovich for implementing the Windows build in GitHub Actions.

This does not catch everything (e.g.: __DATE__) but it indirectly provides reproducibility coverage for a lot of the Zephyr project.

Note a build is no more "reproducible" than a project is "bug-free"; fixing reproducibility bugs is a continuous activity exactly like fixing other bugs. Typically, building some code is reproducible in some Kconfiguration but fails when that Kconfiguration is changed, exactly like other bugs. The most recent example involved CONFIG_ASSERT.

Switching to an old toolchain can also be very problematic:

XCC RG-2017.8-linux: generated object code is affected by -g2 level (the default level) + longer source paths thesofproject/sof#7114

keith-zephyr added the RFC (Request For Comments: want input from the community) label Sep 13, 2022

henrikbrixandersen added the area: Continuous Integration label Sep 14, 2022

paperclip4465 mentioned this issue Sep 16, 2022

Generated linker scripts break when ZEPHYR_BASE and ZEPHYR_MODULES share structure that contains symlinks #50284

Closed

keith-zephyr mentioned this issue Sep 21, 2022

Process: release criteria for v3.3 and later #46759

Closed

marc-hb mentioned this issue Nov 4, 2022

cmake: compute and display the reproducible checksum by default #51954

Closed

This was referenced Jan 1, 2023

Extension commands silently missing when git isn't installed zephyrproject-rtos/west#617

Closed

cmake: prototyping support for CMake presets json file. nrfconnect/sdk-nrf#3979

Closed

nashif added this to RFC Backlog Apr 13, 2023

MaureenHelm assigned keith-zephyr Apr 20, 2023

marc-hb mentioned this issue Jun 27, 2024

cmake: fix relative path calculate error #74710

Closed
