F43 Change Proposal: Package builds are expected to be reproducible (system-wide)

zbyszek · March 19, 2025, 6:43pm

Package builds are expected to be reproducible

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Wiki
Announced

Summary

Over the last few releases, we changed our build infrastructure to make package builds reproducible. Those changes are enough to make about 90% of package builds reproducible. The remaining issues need to be fixed in individual packages. After this Change, package builds are expected to be reproducible. Bugs will be filed against packages when an irreproducibility is detected. The goal is to have no fewer than 99% of package builds reproducible.

A public service (an instance of rebuilderd) with package rebuild statistics and reports for individual packages is made available. A script to make local rebuilds of historic koji builds (fedora-repro-build) is made available as well.

Owner

- Name: Zbigniew Jędrzejewski-Szmek
- Email: zbyszek@in.waw.pl
- Name: Davide Cavalca
- Email: dcavalca@fedoraproject.org
- Name: Jelle van der Waa
- Email: jvanderw@redhat.com

Detailed Description

We define a build as “reproducible” when a completely independent rebuild of a koji package produces rpms that are identical except for the build timestamps, signatures, and some associated metadata. The payload (i.e. packaged files) and important metadata are bit-for-bit identical.
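To illustrate what "bit-for-bit identical payload" means in practice, here is a minimal sketch that compares two directory trees, assuming the original and rebuilt rpms have already been unpacked into them. The function names are made up for this example, and this is not how rebuilderd or fedora-repro-build perform the comparison; it ignores rpm metadata entirely.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each regular file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        # Only regular files are hashed; symlinks and directories are skipped.
        if path.is_file() and not path.is_symlink():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def payload_differences(original_root, rebuilt_root):
    """Return sorted relative paths that are missing on one side or differ in content."""
    original = tree_digests(original_root)
    rebuilt = tree_digests(rebuilt_root)
    return sorted(
        p for p in original.keys() | rebuilt.keys()
        if original.get(p) != rebuilt.get(p)
    )
```

An empty list from `payload_differences` means the two payloads contain the same files with the same bytes. Anything else is a candidate for a closer look with diffoscope.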

In 2023, Changes/ReproducibleBuildsClampMtimes added clamping of mtimes to $SOURCE_DATE_EPOCH and changed the process to produce .pyc files to take $SOURCE_DATE_EPOCH into account. In 2024, Changes/ReproduciblePackageBuilds introduced add-determinism into package builds. Along the way, various details in rpm itself and other tools were adjusted to increase reproducibility. With those changes, about 90% of package builds are reproducible. The initial “big” issues that affect all packages are now solved, and what remains are problems that require changes in individual packages.
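The idea behind mtime clamping can be sketched in a few lines of Python. This is only an illustration of the concept, assuming a directory tree on disk and a hypothetical helper named `clamp_mtimes`; it is not the code rpm uses. Files with a modification time later than $SOURCE_DATE_EPOCH get that value instead, and older files are left alone.

```python
import os
from pathlib import Path


def clamp_mtimes(root, epoch=None):
    """Clamp mtimes under `root` to `epoch`; return the paths that were changed.

    If `epoch` is not given, it is read from the SOURCE_DATE_EPOCH
    environment variable (seconds since the Unix epoch).
    """
    if epoch is None:
        epoch = int(os.environ["SOURCE_DATE_EPOCH"])
    clamped = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_symlink():
            continue  # keep the sketch simple: symlinks are not touched
        if path.stat().st_mtime > epoch:
            # os.utime sets both atime and mtime; both are set to the epoch here.
            os.utime(path, (epoch, epoch))
            clamped.append(path)
    return clamped
```

Clamping rather than overwriting every timestamp keeps the real mtimes of files that come unchanged from the source tarball, while files generated during the build no longer carry the build time.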

The next step is to ask maintainers to resolve reproducibility issues in their packages. The goal is to have 99% of packages reproducible in Fedora. To achieve this, bugzillas will be opened against packages when a rebuild reports differences.
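As a rough sketch of the bookkeeping this implies, the snippet below takes a mapping of package names to rebuild outcomes, computes the reproducible share, and lists the packages that would need a bug. The input format and the `summarize` helper are assumptions made for this example, not the actual rebuilderd report format.

```python
def summarize(results, target=0.99):
    """Summarize rebuild outcomes.

    `results` maps a package name to True (rebuild matched) or False
    (rebuild reported differences). `target` is the desired share.
    """
    total = len(results)
    failing = sorted(name for name, ok in results.items() if not ok)
    share = (total - len(failing)) / total if total else 0.0
    return {
        "share": share,                    # fraction of reproducible builds
        "meets_target": share >= target,   # compared against the 99% goal
        "needs_bug": failing,              # packages with reported differences
    }
```

For example, `summarize({"a": True, "b": False, "c": True, "d": True})` reports a share of 0.75 and lists `b` as needing a bug.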

We are aware of some issues that cannot be fixed easily:

- Haskell packages are not reproducible when compiled with more than one thread. Upstream is working on the issue; the next release of ghc may resolve it completely.
- mingw packages have irreproducible debug data.
- golang packages have irreproducible debug data (irreproducibility#15).
- The kernel uses an ephemeral key for module signatures. See [1],[2] for a possible solution.
- Packages that are signed for SecureBoot use a private key (shim, grub2).
- Some BuildRequires on srpms are architecture-dependent. This is mostly an artifact of how we prepare the environment for builds. It does not directly affect binary rpms.

We will create tracker bugs for the issues that affect multiple packages (haskell, mingw, golang). We hope that those issues will be resolved either upstream or downstream.

Feedback

Benefit to Fedora

(The first three paragraphs are copied unchanged from Changes/ReproduciblePackageBuilds.)

Adding determinism (i.e., removing non-determinism) enables the Fedora community to have confidence that, if given the same source code, build environment, build instructions, and metadata from the build artifacts, any party can recreate copies of the artifacts that are identical except for the signatures and some parts of metadata.

Reproducibility of builds leads to packages of higher quality. It turns out that quite often those irreproducible bits are caused by an error or sloppiness in the code. In particular, any dependence on architecture in noarch packages is almost always unwanted and/or a bug. Test builds that check reproducibility will expose such instances.

Reproducibility of builds makes it easier to develop packages: when a small change is made and a package is rebuilt (in the same environment), then with a reproducible package, the only difference is directly caused by the change. If the package is different every time it is rebuilt, making a comparison is much harder.

Build reproducibility is a topic that is gaining in popularity. Major distributions like Debian, Arch, OpenSUSE, and NixOS are trying to achieve full reproducibility. By making build reproducibility an expectation in Fedora, we avoid driving away people who consider it a requirement. We may even attract additional contributors who are interested in this topic, if we achieve better results than other distros. With 90% reproducibility we’re on par; with 99% we can be the leader.

Scope

Proposal owners:
- Package fedora-repro-build to allow local rebuilds of historical koji builds
- Make rebuilderd work with Fedora packages and repos
- Stand up a public rebuilderd instance for Fedora rawhide
- Adjust add-determinism to handle new cases, if widespread issues are found and it’s possible to handle them in the cleanup phase
- Open bugs against packages when irreproducibilities are detected, initially as a manual process
- Open a pull request for Packaging Guidelines
Other developers:
- Fix irreproducibility issues in individual packages
- Fix tooling issues that affect multiple packages, if possible
- Review the pull request for Packaging Guidelines
- Adjust various packager workflow descriptions in the wiki
Release engineering: #Releng issue number
Policies and guidelines:
- Packaging Guidelines will be changed to say that packages should build reproducibly, and link to our docs (reproducible builds) and upstream docs at reproducible-builds.org.
Trademark approval: N/A (not needed for this Change)
Alignment with the Fedora Strategy:

Upgrade/compatibility impact

N/A

Early Testing (Optional)

Do you require ‘QA Blueprint’ support? Y/N

We have the tooling in place to do a local rebuild of a koji build and compare the results. Packagers can do such rebuilds and verify the result or fix the issues if any are found.

How To Test

When the public rebuilderd instance is up, maintainers will be able to see reports and diffoscope output for their packages. They can then use this to learn about any issues in their packages.

User Experience

No user-visible changes. Users may do local rebuilds and expect that the results are reproducible.

Dependencies

Contingency Plan

Contingency mechanism: All the items in the Scope are additions to the current tooling and infrastructure, so there is no need for a contingency plan for them. If we back out of the whole idea, rather than just postponing it for example, Packaging Guidelines changes would probably need to be reverted and any bugzillas closed.
Contingency deadline: any time
Blocks release? no

Documentation

Release Notes

Reproducibility of package builds has improved. TBD% of package builds are now reproducible.

Last edited by @zbyszek 2025-03-19T22:27:26Z

system · March 19, 2025, 6:43pm

How do you feel about the proposal as written?

Strongly in favor
In favor, with reservations
Neutral
Opposed, but could be convinced
Strongly opposed

If you are in favor but have reservations, or are opposed but something could change your mind, please explain in a reply.

We want everyone to be heard, but many posts repeating the same thing actually make that harder. If you have something new to say, please say it. If, instead, you find someone has already covered what you’d like to express, please simply give that post a ❤️ instead of reiterating. You can even do this by email, by replying with the heart emoji or just “+1”. This will make long topics easier to follow.

Please note that this is an advisory “straw poll” meant to gauge sentiment. It isn’t a vote or a scientific survey. See About the Change Proposals category for more about the Change Process and moderation policy.

tstellar · March 19, 2025, 6:57pm

Will there be some kind of automated testing to ensure that the reproducibility of specific RPMS does not regress?

zbyszek · March 19, 2025, 10:24pm

Yes. Part of the Scope is to make a public rebuilderd instance available. It essentially does an ongoing test rebuild of packages and gathers results.

boredsquirrel · March 19, 2025, 10:26pm

Spello in title

tstellar · March 19, 2025, 10:49pm

Would it be possible to integrate this into the gating tests for bodhi updates?

zbyszek · March 20, 2025, 8:11am

Would it be possible to integrate this into the gating tests for bodhi updates?

Theoretically, yes. In practice, there are no such plans at this point. Right now, we expect some percentage of the rebuilds to fail, so it’d be too early to gate on this. After some time, once those bugs have been fixed, we could theoretically try. But this always means a second build, which can be quite slow. (That said, the rebuilds can be done without tests, which are often the slowest part.)

The problem is that if we just rebuild the package a second time in the same build environment, we are not testing much. We really want to build it on a different variant of the same architecture, with a different file system, on a different date, etc. But I don’t think we want to set up two variants of our build environment internally.

Dunno, this could be something to consider in the future, depending on how this initial stage goes.

kevin · March 20, 2025, 6:46pm

Where will this instance live and who will maintain it?

It might be nice to have a doc explaining how to set up one of these, so anyone interested could do so.

What is the input to rebuilderd? Is it src.rpms? Or git repos/scm+lookaside?
Would it operate on rawhide composes or on rawhide package builds?

Otherwise I like the idea!

dcavalca · March 20, 2025, 10:38pm

Currently we’re using a Meta-sponsored AWS account for a lot of this work. I’m happy to continue using that, or we can look at moving into Fedora infra if there’s a preference for that. The good thing about rebuilderd is that there can be multiple independently managed deployments (and having multiple is actually useful, as you can confirm they all produce the same result).

kevin · March 22, 2025, 4:39pm

Well, I think it might be good to run this outside fedora infra, just to make it more independent/etc. Although of course we could run one and then others could run others and compare as you note.

amoloney · April 1, 2025, 1:16pm

This change proposal has now been submitted to FESCo with ticket #3386 for voting.

To find out more, please visit our Changes Policy documentation.

amoloney · April 22, 2025, 5:49pm

This change has been accepted by FESCo for Fedora Linux 43. A full list of approved changes to date can be found on the Change Set Page.

To find out more about how our changes policy works, please visit our docs site.
