-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for generating buildinfo file as subpackage #1532
Conversation
TODO: - Add 'Binary' field with all built RPMs package names - Add Checksums for at least source RPM - Add additional rpm macros (--define, --with/--without, etc.)
- Few refactor - Handle detached RPM sig from Koji Related upstream change: rpm-software-management/rpm#1532
- Few refactor - Handle detached RPM sig from Koji - Related RPM change: rpm-software-management/rpm#1532
Just a little ping here to know if anyone has already take a look on the global form for this. @Conan-Kudo any feedback if you have time? Thank you very much. |
If you want feedback here, you should explain what it is and why we should care. Outside references are okay for additional background, but the what it does and why (as in rationale for the whole) needs to be explained right here so that it can be discussed. So start by explaining what problem you're trying to solve, and not how other people might've chosen to solve it. |
The purpose of having an automatically generated subpackage Having this, we can rebuild in the same condition any For example, https://github.com/fepitre/rpmreproduce/blob/master/tests/data/terminator-buildinfo-2.1.0-1.fc32.noarch.rpm is a generated buildinfo package. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid using “whitelist” and properly quote variables containing special characters
scripts/rpmbuildinfo
Outdated
rpm -qa --queryformat '%{epoch}:%{name}-%{version}-%{release}.%{arch}\n' \ | ||
| LC_ALL=C sort -t: -k2 \ | ||
| sed -e 's/^(none)://; /\.(none)$/d; s/^/ /' >> "$BUILDINFO" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly this requires -o pipefail
if we want to handle errors robustly.
scripts/rpmbuildinfo
Outdated
Format: 1.0-rpm | ||
Build-Architecture: $(uname -m) | ||
Source: $RPM_PACKAGE_NAME | ||
Epoch: $RPM_PACKAGE_EPOCH | ||
Version: ${RPM_PACKAGE_VERSION} | ||
Release: ${RPM_PACKAGE_RELEASE} | ||
Architecture: $RPM_ARCH | ||
Build-Origin: $(getos) | ||
Build-Path: $RPM_BUILD_DIR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.0-rpm
makes no sense in this context. You're defining an rpm-based format.
RPM convention is to use CamelCase rather than hyphenation, so please be consistent with that (e.g. BuildArchitecture
vs Build-Architecture
). Please use RPM names (BuildOS
vs Build-Origin
, BuildDir
vs Build-Path
, etc.).
Also, please consistently call shell variables with ${}
format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're defining an rpm-based format.
This is actually a debian-defined format1 and at the time we was mocking this format2 we thought it would be good to maintain compatibility, because we expect much of the tooling will be shared. Obviously things like package version specification will be RPM-specific (hence -rpm
in Format:
), but the rest, like exchange, signature verification, comparison etc. does not need to be different. So the original intention is to maintain as close format to the original as possible.
Footnotes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is valuable for us, given that we have properties in RPM that don't exist in Debian, and I'd rather see the format using terminology native to RPM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also of note, the name Source:
instead of Name:
is not clear, so that clearly would also have to change, since Source
means something else entirely in RPM land.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is valuable for us, given that we have properties in RPM that don't exist in Debian
Can you please suggest which properties those are exactly, and specifically how differences in their content might affect reproducibility of the packages? The buildinfo file was purposefully designed to not include all available information, only the relevant to reproducible builds, because recording too much would not actually be useful when analysing (un)reproducibility. The underlying assumption is that the build process of a package will be made as robust as possible, that is, allow as much variability in environment as reasonably possible, which in theory should allow to record less information in buildinfo.
scripts/rpmbuildinfo
Outdated
Build-Path: $RPM_BUILD_DIR | ||
EOF | ||
|
||
printf 'Installed-Build-Depends:\n' >> "$BUILDINFO" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not an appropriate name for this, since you are describing the environment. It's not just various BuildRequires
. I would suggest EnvironmentRequires
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, that's debianism (https://manpages.debian.org/bullseye/dpkg-dev/deb-buildinfo.5.en.html#Installed-Build-Depends:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I'd rather not have Debian-isms in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Conan-Kudo thank you for your feedback and comments. Generally I would have loved to have a generic format not being one distro specific to ease manipulating this file among several rebuild tools but I guess it would not be straightforward. I'm waiting some feedback from @bmwiedemann too then I would propose to adapt the work according to your comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you really insist on a different format for RPM, I'd suggest the file suffix be changed to something else than .buildinfo
(maybe .rpmbuildinfo
?). This will at least make distinguishing the files easier. Archlinux has .BUILDINFO
(all caps) I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with another filename.
scripts/rpmbuildinfo
Outdated
# Toolchain. | ||
ENV_WHITELIST="$ENV_WHITELIST CC CPP CXX OBJC OBJCXX PC FC M2C AS LD AR RANLIB MAKE AWK LEX YACC" | ||
# Toolchain flags. | ||
ENV_WHITELIST="$ENV_WHITELIST CFLAGS CPPFLAGS CXXFLAGS OBJCFLAGS OBJCXXFLAGS GCJFLAGS FFLAGS LDFLAGS ARFLAGS MAKEFLAGS" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These will not provide any relevant information, unless you set them, which is typically done via call to %configure
macro in %build
section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand your point. This is not to set with %configure. It's the job of rebuilder like https://github.com/fepitre/rpmreproduce to ensure that same flags, env values would be the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What @voxik is saying is that you will not be able to capture the variables because they only exist in the %build
step. We'd need a hook to export it from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Thank you for the enlightenment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@voxik @Conan-Kudo any advice on the best way to create this hook?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it enough to catch them once, for example in %prep
? Should be the same, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not. Different variables are initialized at different stages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to get env variables rpmbuild
was started in (before it modified them)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/proc/self/environ
from Lua?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The environment in various stages of the build depends on various packages in the build environment, so I don't think we should try to capture it. It shoud be reproduced when the same packages are installed. But it seems that rpm
passes the environment passed in from the outside into the build environment, so capturing that seems useful. /proc/self/environ
sounds like a reasonable option.
OTOH, if the builds are done in mock
, the environment would be sanitized anyway. mock
now uses systemd-nspawn
, and it standarizes the environment to a large degree. Nevertheless, it does inject some information, like container_host_*
. This was added in systemd/systemd@e1bb4b0. I wonder if we should add a switch to systemd-nspawn to opt out of setting those variables and make mock use it. The environment inside of mock should not care at all about the host.
@bmwiedemann do you have some feedback to give from several comments here? As you work on reproducible builds for openSUSE you are certainly interested by this new feature. |
Interested in something like this for openSUSE. We already have something comparable called ArchLinux seems to have their own format with = as delimiter. OTOH, this is not a .spec file, but just another output from the build process, similar to the build log, so it does not need to follow rpm conventions either. So from my view, using either the Debian or ArchLinux (marshalling) format has some advantage over making up a third format. Though, I would not keep confusing Debianisms in keys that will not be meaningful outside of the respective distribution. |
Koji has a similar build environment record, though it's stored in the Koji database rather than as a file. We do archive environment artifacts from Mock with builds too, though. |
Thank you all for your feedback, I'll prepare another iteration soon with all the comments here. |
|
||
printf '%s' "\ | ||
Format: 1.0 | ||
BuildArchitecture: ${RPM_BUILD_ARCH} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want to capture the host architecture for the noarch
case, so we know if an architecture influenced the build?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it's a good info to add indeed.
# Whitelist from Debian's Dpkg: | ||
# https://salsa.debian.org/dpkg-team/dpkg/-/blob/main/scripts/Dpkg/Build/Info.pm#L51 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this comment anymore now that it doesn't match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this does not make much sense now as we are certainly going to adapt this list in the context of rpmbuild.
@fepitre I suspect you'll want to capture the shell variables used in each stage ( |
@Conan-Kudo I strongly recommends using the Debianisic names, as this simplifies tooling. Having different names, in addition to already requiring different packaging tools, makes tooling for reproducibility even harder. Yes, this makes names not match the RPM style. Anyhow, there is no use in creating yet another format, just for this reason. If the format is the same, tooling could share most of the code. I the format and names are diverging, this would require a lot of work adapting existing tooling. This will waste a lot of valuable and rare developer time. |
I am not willing to allow us to be boxed into Debian-isms. It will cause problems for us.
And you'd be wasting ours to map RPM semantics to Debian ones. Our tooling works in RPM conventions and semantics, which are different than Debian. |
Bump here, is there anything here that could use help? I see the conversation died down a bit, but there were a couple of outstanding points |
To be honest, I'm not sure how to converge for this in the case of RPM. As there is many build system like Koji, OBS, etc., they kind of hosting already what's needed as "buildinfo". I'm just wondering how much simpler it is to get these information from build platforms directly instead of trying to reach a consensus over distributions. |
Retrieving data from a build system may work as a temporary solution, but IMO we still need some file format (something redistributable) with build info. This will be necessary for example to compare build environments between build systems (like, between rebuilders), or just to have some way to have this information in authenticated (signed) form. |
RPM_ARCH="$7" | ||
|
||
RPM_BUILD_NAME="${RPM_PACKAGE_NAME}-${RPM_PACKAGE_VERSION}-${RPM_PACKAGE_RELEASE}.${RPM_ARCH}" | ||
RPM_BUILD_ARCH="$(uname -m)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will get you the host arch, not the build arch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on definition. Looking at https://manpages.debian.org/testing/dpkg-dev/deb-buildinfo.5.en.html, this perhaps tries to be Build-Architecture counterpart which means distro architecture, but that could differ from uname -m
. It seems to me this was added from your request to track the build arch in case of noarch packages, and from that POV seems kinda reasonable to me, just that there may be a more rpm compatble source of information for that. But it's not really possible to discuss the correctness of a value without a definition of what that value is supposed to mean in the first place.
|
||
RPM_BUILD_NAME="${RPM_PACKAGE_NAME}-${RPM_PACKAGE_VERSION}-${RPM_PACKAGE_RELEASE}.${RPM_ARCH}" | ||
RPM_BUILD_ARCH="$(uname -m)" | ||
RPM_BUILD_OS="$(getos)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will get you the host OS rather than the build OS.
} | ||
|
||
RPM_BUILD_ROOT="$1" | ||
RPM_BUILD_DIR="$2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BUILDROOT and BUILD_DIR may leak relatively sensitive information about usernames etc and are a slap in the face of reproducability for no good reason at all. If anything, inclusion of these should be an opt-in feature (eg enable for troubleshooting purposes only)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this is necessary because it can affect the build significantly. Those paths regularly get embedded in source files, debuginfo, and documentation metadata. That's why I had to change redhat-rpm-config last year to stop Koji builds with doxygen doc builds from failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this is necessary because it can affect the build significantly. Those paths regularly get embedded in source files, debuginfo, and documentation metadata. That's why I had to change redhat-rpm-config last year to stop Koji builds with doxygen doc builds from failing.
That embedding is a bug in the package in question and should be reported as such. BUILDROOT and BUILD_DIR should be used only when bugs require it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rpm checks against embedded BUILDROOT in the packages already and fails the build when found, we should probably extend that to cover BUILD DIR because references to that are equally a bug, always. Had to look up the redhat-rpm-config change - Doxygen needs a whack in the head, really. I also believe it has options to disable that, if not then we should file a bug on it.
Yeah the "security" aspect is lame, and source + debuginfo packages having them is one thing, it's just that for a moment I thought we're talking about the main binary packages ... but we're not, this is supposed to go into a subpackage. Which I'm not sure makes a whole lot of sense in the first place: you have to fish the plaintext file from inside an rpm. It'd be different if the relevant data could be queried in an rpm header itself, but we don't want to add two dozen new tags specific for just this either. Is there a rationale for this particular design somewhere? (it should be in the commit messages of this thing!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this is necessary because it can affect the build significantly. Those paths regularly get embedded in source files, debuginfo, and documentation metadata. That's why I had to change redhat-rpm-config last year to stop Koji builds with doxygen doc builds from failing.
That embedding is a bug in the package in question and should be reported as such. BUILDROOT and BUILD_DIR should be used only when bugs require it.
Sorry, I don't care whether it's a bug or not. It's reality. Since that's a required component of build reproducibility, it needs to be recorded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I suggested putting it into "an src.rpm like package", which would be a separate (sub-)package, but not necessarily a regular binary package which is expected to be installed, nor the main src.rpm (which would be the right place from rpm POV in many ways but not so much for the practical world). The main package is indeed the wrong place for it for various reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to stick it somewhere, the debugsource package might be as good of a place as any?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to stick it somewhere, the debugsource package might be as good of a place as any?
Do all packages have debugsource? AFAIK noarch usually do not, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's true...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A separate buildinfo RPM seems like the best option to me.
This doesn't seem to go anywhere. So I am closing this PR here. Feel free to re-open or may be open a new one. The discussion is becoming a bit unwieldy. |
|
||
getos() { | ||
# shellcheck disable=SC1091 | ||
test -r /etc/os-release && . /etc/os-release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os-release is also /usr/lib/os-release. Please use this instead:
test -e /etc/os-release && os_release='/etc/os-release' || os_release='/usr/lib/os-release'
. "${os_release}"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your conditional assumes the file exists at all. It needs to handle when it's not present at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. So something like this:
if test -e /etc/os-release; then
. /etc/os-release
elif test -e /usr/lib/os-release; then
. /usr/lib/os-release
fi
Differences from the original code:
- do not suppress permission errors by using
-r
instead of-e
- use /usr/lib/os-release if appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me.
|
||
{ \ | ||
printf 'EnvironmentRequires:\n'; \ | ||
rpm -qa --queryformat '%{epoch}:%{name}-%{version}-%{release}.%{arch}\n' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, why this ordering. I would expect:
'%{name}-%{epoch}:%{version}-%{release}.%{arch}\n'
(It seems --queryformat
doesn't support more complex expressions, so the sed for (none):
still seems needed ;( )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That old ordering is probably influenced by the old Debian tooling using YUM instead of DNF, and YUM was weird about this. We should absolutely use the normal format that RPM and DNF use instead.
TODO:
Reference: https://wiki.debian.org/ReproducibleBuilds/BuildinfoFiles, Thread "Reproducible builds" on Fedora devel list.