Verify builds are reproducible in the CI #50205

Open
keith-zephyr opened this issue Sep 13, 2022 · 8 comments
Labels: area: Continuous Integration, RFC (Request For Comments: want input from the community)

@keith-zephyr
Contributor

Introduction

Zephyr builds should be reproducible. A checkout of Zephyr from the same commit, built with the same toolchain, should generate an identical image binary.

Problem description

This has been proposed before (#11523 and #14593), but there are currently no tests in the Zephyr tree that verify reproducible builds.

Furthermore, reproducible builds were broken for an unknown length of time until #48195 fixed them.

Proposed change

Add a new GitHub workflow that verifies builds are reproducible. This workflow will be run on every PR.

The workflow can follow the blueprint of the Footprint Delta workflow. The new workflow would build TBD platforms, back to back, verifying the resulting binaries are identical.

Note that the build command `west build -b native_posix tests/drivers/build_all/sensor` has been known to catch problems with devicetree generation that result in non-reproducible builds.
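The back-to-back comparison could be sketched roughly as follows. This is only an illustration of the idea, not the proposed workflow itself: the helper names `sha256_of` and `builds_match` are hypothetical, and the choice of SHA-256 is an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True when the two build outputs are byte-for-byte identical."""
    return sha256_of(first) == sha256_of(second)
```

After building the same commit twice into two separate directories (say `build_a` and `build_b`, both hypothetical names), the check would be `builds_match(Path("build_a/zephyr/zephyr.elf"), Path("build_b/zephyr/zephyr.elf"))`, and the workflow would fail when it returns `False`.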

Dependencies

The new GitHub workflow will block new PRs if the reproducible build test fails.

Concerns and Unresolved Questions

Running this check against every PR will incur additional computing time and resources.

Alternatives

Run the reproducible build check less frequently, such as nightly. However, this will require a significant bisect effort to identify the culprit PR when any failures are detected. The incremental cost of some additional builds on each PR seems worth the trouble.

keith-zephyr added the RFC (Request For Comments: want input from the community) label on Sep 13, 2022
@stephanosio
Member

cc @marc-hb

@stephanosio
Member

> This workflow will be run on every PR.

I do not expect this to be something that breaks often. Bi-weekly build should be fine.

@gmarull
Member

gmarull commented Sep 14, 2022

We have many non-locked Python dependencies that are used in some way during the build process; they should be considered too.
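One possible way to take these into account, sketched here as an assumption rather than anything the project does today, is to record the installed Python package versions next to each build so that two builds can be checked for an identical environment. The function name `installed_packages` is hypothetical.

```python
from importlib import metadata
from typing import List


def installed_packages() -> List[str]:
    """Return sorted 'name==version' pins for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip distributions with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)
```

Comparing the two lists from two build machines would show whether a binary difference might come from a different Python dependency rather than from the Zephyr tree.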

@marc-hb
Collaborator

marc-hb commented Sep 14, 2022

Here's a list of 20+ old reproducibility fixes:

This should show what the most common problems are.

In the same place there's an (obsolete) test script. The approach was crude but very effective:

  • build once
  • change as many things as possible
  • build again
  • diffoscope
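As a very rough stand-in for the last step (diffoscope does far more, including explaining *why* files differ), the comparison of two build trees could be sketched like this. The function name `differing_files` is hypothetical, and comparing every file under the build directory is an assumption; a real check would probably look only at the final images.

```python
import filecmp
from pathlib import Path
from typing import List


def differing_files(build_a: Path, build_b: Path) -> List[str]:
    """List relative paths that differ in content or exist in only one tree."""
    files_a = {p.relative_to(build_a).as_posix()
               for p in build_a.rglob("*") if p.is_file()}
    files_b = {p.relative_to(build_b).as_posix()
               for p in build_b.rglob("*") if p.is_file()}
    # Files present in only one of the two trees always count as differences.
    result = set(files_a ^ files_b)
    for rel in files_a & files_b:
        # shallow=False forces a byte-by-byte comparison, ignoring timestamps.
        if not filecmp.cmp(build_a / rel, build_b / rel, shallow=False):
            result.add(rel)
    return sorted(result)
```

An empty list means the two trees are identical; anything else is a candidate for a closer look with diffoscope.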

@marc-hb
Collaborator

marc-hb commented Sep 14, 2022

> I do not expect this to be something that breaks often.

Agreed. Reproducibility testing and fixing is rare, but reproducibility regressions are very rare too.

> Bi-weekly build should be fine.

On the other hand, IF it's cheap and quick to run, then why not run it on every PR?

@keith-zephyr
Contributor Author

Because of the amount of generated code, I'm in favor of checking on every PR. Maybe the GitHub workflow can be set up to run on any changes to the ./scripts directory, and also set up as a weekly run to catch problems with the actual source code.

marc-hb added a commit to marc-hb/zephyr that referenced this issue Nov 4, 2022
Temporary bugs, corner cases and obsolete toolchains aside, the Zephyr
build is most of the time reproducible: zephyrproject-rtos#50205 and zephyrproject-rtos#14593.

This means two different build machines using the same toolchain will
always produce the same binary output. The one-line addition in this
commit makes it trivial to verify that binary outputs are indeed the
same by adding a single checksum line in the build logs:

```
[16/16] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
             RAM:       53280 B         3 MB      1.69%
        IDT_LIST:          0 GB         2 KB      0.00%

fdd2ddf2ad7d5da5bbd79b41cef...7b16ef549a8281111d8e205  zephyr.strip
```

This commit makes a non-measurable build time difference.

Build reproducibility matters for (at least) two important reasons:

- Security / supply chain attacks, see https://www.cisa.gov/sbom, zephyrproject-rtos#50205,
  https://reproducible-builds.org/ and many others.
- Making sure build configurations are strictly identical when trying to
  reproduce elusive issues or when issuing releases.

Displaying a reproducible checksum accelerates the investigation of
temporary reproducibility issues like zephyrproject-rtos#48195.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
marc-hb added a commit to marc-hb/zephyr that referenced this issue Nov 4, 2022
Temporary bugs, corner cases and obsolete toolchains aside, the Zephyr
build is reproducible most of the time: zephyrproject-rtos#50205 and zephyrproject-rtos#14593

This means two different build machines using the same toolchain will
always produce the same binary output. The previous, one-line commit made
it trivial to verify that binary outputs are indeed the same by adding this
single line in the build logs:

```
[16/16] Linking C executable zephyr/zephyr.elf
Memory region         Used Size  Region Size  %age Used
             RAM:       53280 B         3 MB      1.69%
        IDT_LIST:          0 GB         2 KB      0.00%
fdd2ddf2ad7d5da5bbd79b41cef8d7...1a896b989a8281111d8e205  zephyr.strip
```

This commit enables that feature by default because build
reproducibility matters for (at least) two important reasons:

- Security / supply chain attacks, see https://www.cisa.gov/sbom, zephyrproject-rtos#50205,
  https://reproducible-builds.org/ and many others.
- Making sure build configurations are strictly identical when trying to
  reproduce elusive issues or when issuing releases.

It was of course already possible to _manually_ make this Kconfig change
and manually compute this checksum. However this can be impossible when
dealing with an automated build system that does not archive all
_intermediate_ (zephyrproject-rtos#5009) files like `zephyr.elf`. Tweaking the build
configuration can also be difficult and error-prone for people who are
not Zephyr developers.

Most automated CI systems preserve build logs by default.

Displaying the reproducible checksum by default accelerates the
discovery of reproducibility bugs like zephyrproject-rtos#48195.

When measured with `west build -p -b qemu_x86 samples/hello_world/`, the
additional `build/zephyr/zephyr.strip` disk space required is 43
kilobytes compared to a total of 11 Megabytes. Measuring a more
realistic SOF example, `zephyr.strip` weighed 690 kB, which was about
1% of a total `build/` directory weighing 65 MB.

To measure the build time cost I ran `west build -p -b qemu_x86
samples/hello_world/` many times in a loop with and without this PR on
my Linux workstation. Stripping and checksumming made literally no time
difference compared to the "noise" observed when building the same
configuration. This is not surprising considering how small
`zephyr.strip` is: the extra cost is most likely dominated by process
creation and the total number of processes created during a Zephyr build
dwarfs the few extra processes required by this feature.

More surprisingly, I measured incremental builds by running `touch
kernel/timer.c; west build ...` in a loop and I could not observe any
visible time difference either.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@marc-hb
Collaborator

marc-hb commented Nov 4, 2022

These 2 additional lines are IMHO a big step forward, please help review:

@marc-hb
Collaborator

marc-hb commented Apr 11, 2023

GitHub Actions for the Zephyr+SOF project have been routinely and successfully comparing binaries built on Linux versus Windows in every PR for a few months now:

To achieve this I overrode the default config change in #51954 in an SOF-specific way: thesofproject/sof@945adb8d1660ed4

Building across two different operating systems provides a lot of differences "for free" that can be very difficult to achieve on the same operating system (see the old #14593 attempt). Kudos to @aborisovich for implementing the Windows build in GitHub Actions.

This does not catch everything (e.g. `__DATE__`) but it indirectly provides reproducibility coverage for a lot of the Zephyr project.
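For illustration only (this is not how the SOF workflow is actually implemented, which is not shown here), a comparison based on the checksum line that the build prints could look roughly like this. It assumes each job keeps its build log and that the line has the `<hex digest>  zephyr.strip` form shown in the commit message above; the names `checksum_from_log` and `logs_agree` are hypothetical.

```python
import re
from typing import Optional

# Matches a checksum line such as "<hex digest>  zephyr.strip".
# The digest length is deliberately not fixed, since the algorithm is an assumption.
CHECKSUM_LINE = re.compile(r"^([0-9a-f]+)\s+\*?zephyr\.strip\s*$", re.MULTILINE)


def checksum_from_log(log_text: str) -> Optional[str]:
    """Return the first zephyr.strip checksum found in a build log, or None."""
    match = CHECKSUM_LINE.search(log_text)
    return match.group(1) if match else None


def logs_agree(log_a: str, log_b: str) -> bool:
    """True only when both logs contain a checksum and the two are equal."""
    a = checksum_from_log(log_a)
    b = checksum_from_log(log_b)
    return a is not None and a == b
```

Treating a missing checksum as a failure avoids reporting two builds as identical when neither log actually contains the line.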

Note a build is no more "reproducible" than a project is "bug-free"; fixing reproducibility bugs is a continuous activity exactly like fixing other bugs. Typically, building some code is reproducible in some Kconfiguration but fails when that Kconfiguration is changed - exactly like other bugs. Most recent example with CONFIG_ASSERT:

Switching to an old toolchain can also be very problematic:
