F43 Change Proposal: Package builds are expected to be reproducible (system-wide)

Package builds are expected to be reproducible

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Summary

Over the last few releases, we changed our build infrastructure to make package builds reproducible. These changes are enough to make about 90% of package builds reproducible. The remaining issues need to be fixed in individual packages. After this Change, package builds are expected to be reproducible. Bugs will be filed against packages when an irreproducibility is detected. The goal is to have no fewer than 99% of package builds reproducible.

A public service with package rebuild statistics and reports for individual packages is made available. (An instance of rebuilderd.) A script to make local rebuilds of historic koji builds is made available (fedora-repro-build).

Owner

Detailed Description

We define “reproducible” as when a completely independent rebuild of a koji package produces rpms that are identical except for the build timestamps, signatures, and some associated metadata. The payload (i.e. packaged files) and important metadata are bit-for-bit identical.
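To make the definition concrete, here is a minimal sketch of the comparison it implies. It is not how rebuilderd or fedora-repro-build compare packages. The dictionary layout and the field names in `VOLATILE_FIELDS` are hypothetical stand-ins for "build timestamps, signatures, and some associated metadata"; real RPM header tags differ.

```python
import hashlib

# Hypothetical names for metadata that is allowed to differ between builds.
VOLATILE_FIELDS = {"build_time", "build_host", "signature"}


def payload_digest(files):
    """Hash packaged file paths and contents in a stable (sorted) order."""
    h = hashlib.sha256()
    for path in sorted(files):
        data = files[path]
        h.update(path.encode())
        h.update(b"\0")
        # Length prefix keeps adjacent entries from running together.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def stable_metadata(pkg):
    """Return the metadata with the volatile fields dropped."""
    return {k: v for k, v in pkg["metadata"].items() if k not in VOLATILE_FIELDS}


def is_reproducible(original, rebuild):
    """True if the payload is bit-for-bit identical and stable metadata matches."""
    return (
        payload_digest(original["files"]) == payload_digest(rebuild["files"])
        and stable_metadata(original) == stable_metadata(rebuild)
    )
```

Under these assumptions, two builds that differ only in `build_time` and `signature` compare as reproducible, while a one-byte change in any packaged file does not.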

In 2023, Changes/ReproducibleBuildsClampMtimes added clamping of mtimes to $SOURCE_DATE_EPOCH and changed the process to produce .pyc files to take $SOURCE_DATE_EPOCH into account. In 2024, Changes/ReproduciblePackageBuilds introduced add-determinism into package builds. Along the way, various details in rpm itself and other tools were adjusted to increase reproducibility. With those changes, about 90% of package builds are reproducible. The initial “big” issues that affect all packages are now solved, and what remains are problems that require changes in individual packages.
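The idea behind mtime clamping can be sketched in a few lines. This is an illustration of the technique only, not the code rpm uses; the function name `clamp_mtimes` is made up for this example, and it assumes the epoch is passed in as seconds since 1970, as $SOURCE_DATE_EPOCH is.

```python
import os


def clamp_mtimes(root, source_date_epoch):
    """Set the mtime of every regular file newer than the epoch back to the epoch.

    Files that are already older are left alone, which is what makes this
    "clamping" rather than overwriting all timestamps.
    """
    clamped = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # keep the sketch simple: skip symlinks
            st = os.stat(path)
            if st.st_mtime > source_date_epoch:
                # Preserve atime, lower only mtime.
                os.utime(path, (st.st_atime, source_date_epoch))
                clamped.append(path)
    return sorted(clamped)
```

A caller could read the value with `int(os.environ["SOURCE_DATE_EPOCH"])` and pass it together with the build root. After the call, no file under the root has an mtime later than the epoch, so the timestamps no longer depend on when the build ran.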

The next step is to ask maintainers to resolve reproducibility issues in their packages. The goal is to have 99% of packages reproducible in Fedora. To achieve this, bugzillas will be opened against packages when a rebuild reports differences.
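As a rough illustration of how rebuild results relate to the 99% goal, the sketch below summarizes a set of results. The input format (package name mapped to whether the rebuild matched) and the function name `summarize` are assumptions made for this example; they are not the rebuilderd data model.

```python
def summarize(results, target_percent=99):
    """Summarize rebuild results.

    `results` maps a package name to True when the rebuild matched the
    original build and False when differences were reported.
    """
    total = len(results)
    failing = sorted(name for name, ok in results.items() if not ok)
    reproducible = total - len(failing)
    return {
        "total": total,
        "reproducible": reproducible,
        # Integer arithmetic avoids floating-point rounding at the threshold.
        "meets_target": total > 0 and reproducible * 100 >= target_percent * total,
        # Packages that would get a bug filed against them.
        "needs_bug": failing,
    }
```

With 100 packages and one mismatch, `meets_target` is true and `needs_bug` lists the single failing package; a second mismatch drops the set below the target.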

We are aware of some issues that cannot be fixed easily:

  • Haskell packages are not reproducible when compiled with more than one thread. Upstream is working on the issue, and the next release of ghc may resolve it completely.
  • mingw packages have irreproducible debug data.
  • golang packages have irreproducible debug data (irreproducibility#15).
  • the kernel uses an ephemeral key for module signatures. See [1],[2] for a possible solution.
  • packages that are signed for SecureBoot use a private key (shim, grub2).
  • Some BuildRequires on srpms are architecture-dependent. This is mostly an artifact of how we prepare the environment for builds. It does not directly affect binary rpms.

We will create tracker bugs for the issues that affect multiple packages (haskell, mingw, golang). We hope that those issues will be resolved either upstream or downstream.

Feedback

Benefit to Fedora

(The first three paragraphs are copied unchanged from Changes/ReproduciblePackageBuilds.)

Adding determinism (i.e., removing non-determinism) enables the Fedora community to have confidence that, if given the same source code, build environment, build instructions, and metadata from the build artifacts, any party can recreate copies of the artifacts that are identical except for the signatures and some parts of metadata.

Reproducibility of builds leads to packages of higher quality. It turns out that quite often those irreproducible bits are caused by an error or sloppiness in the code. In particular, any dependence on architecture in noarch packages is almost always unwanted and/or a bug. Test builds that check reproducibility will expose such instances.

Reproducibility of builds makes it easier to develop packages: when a small change is made and a package is rebuilt (in the same environment), then with a reproducible package, the only difference is directly caused by the change. If the package is different every time it is rebuilt, making a comparison is much harder.
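The development benefit can be pictured with a small file-level comparison of two builds. This is only a sketch: `diff_payloads` is a made-up helper, and each build is assumed to be available as a mapping from file path to file contents.

```python
def diff_payloads(before, after):
    """Report which packaged files were added, removed, or changed."""
    before_paths = set(before)
    after_paths = set(after)
    return {
        "added": sorted(after_paths - before_paths),
        "removed": sorted(before_paths - after_paths),
        "changed": sorted(
            path for path in before_paths & after_paths
            if before[path] != after[path]
        ),
    }
```

For a reproducible package, rebuilding without any change yields three empty lists, so anything reported after a small change is attributable to that change. For an irreproducible package, the "changed" list contains noise on every rebuild.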

Build reproducibility is a topic that is gaining in popularity. Major distributions like Debian, Arch, openSUSE, and NixOS are trying to achieve full reproducibility. By making build reproducibility an expectation in Fedora, we avoid driving away people who consider it a requirement. We may even attract additional contributors who are interested in this topic, if we achieve better results than other distros. With 90% reproducibility we’re on par; with 99% we can be the leader :grinning_face:.

Scope

  • Proposal owners:

    • Package fedora-repro-build to allow local rebuilds of historical koji builds
    • Make rebuilderd work with Fedora packages and repos
    • Stand up a public rebuilderd instance for Fedora rawhide
    • Adjust add-determinism to handle new cases, if widespread issues are found and it’s possible to handle them in the cleanup phase
    • Open bugs against packages when irreproducibilities are detected, initially as a manual process
    • Open a pull request for Packaging Guidelines
  • Other developers:

    • Fix irreproducibility issues in individual packages
    • Fix tooling issues that affect multiple packages, if possible
    • Review the pull request for Packaging Guidelines
    • Adjust various packager workflow descriptions in the wiki
  • Release engineering: #Releng issue number

  • Policies and guidelines:

  • Trademark approval: N/A (not needed for this Change)

  • Alignment with the Fedora Strategy:

Upgrade/compatibility impact

N/A

Early Testing (Optional)

Do you require ‘QA Blueprint’ support? Y/N

We have the tooling in place to do a local rebuild of a koji build and compare the results. Packagers can do such rebuilds and verify the result or fix the issues if any are found.

How To Test

When the public rebuilderd instance is up, maintainers will be able to see reports and diffoscope output for their packages. They can then use this to learn about any issues in their packages.

User Experience

No user-visible changes. Users may do local rebuilds and expect that the results are reproducible.

Dependencies

Contingency Plan

  • Contingency mechanism: All the items in the Scope are additions to the current tooling and infrastructure, so there is no need for a contingency plan for them. If we back out of the whole idea, rather than just postponing it for example, Packaging Guidelines changes would probably need to be reverted and any bugzillas closed.
  • Contingency deadline: any time
  • Blocks release? no

Documentation

Release Notes

Reproducibility of package builds has improved. TBD% of package builds are now reproducible.

Last edited by @zbyszek 2025-03-19T22:27:26Z

Will there be some kind of automated testing to ensure that the reproducibility of specific RPMS does not regress?

Yes. Part of the Scope is to make a public rebuilderd instance available. It essentially does an ongoing test rebuild of packages and gathers results.

Spello in title

Would it be possible to integrate this into the gating tests for bodhi updates?

Theoretically, yes. In practice, there are no such plans at this point. Right now, we expect some percentage of the rebuilds to fail, so it’d be too early to gate on this. After some time, once those bugs have been fixed, we could theoretically try. But this always means a second build, which can be quite slow. (That said, the rebuilds can be done without tests, which are often the slowest part.)

The problem is that if we just rebuild the package a second time in the same build environment, we are not testing much. We really want to build it on a different variant of the same architecture, with a different file system, on a different date, etc. But I don’t think we want to set up two variants of our build environment internally.

Dunno, this could be something to consider in the future, depending on how this initial stage goes.

Where will this instance live and who will maintain it? :slight_smile:

It might be nice to have a doc explaining how to set up one of these, so anyone interested could do so.

What is the input to rebuilderd? Is it src.rpms? Or git repos/scm+lookaside?
Would it operate on rawhide composes or on rawhide package builds?

Otherwise I like the idea!

Currently we’re using a Meta-sponsored AWS account for a lot of this work. I’m happy to continue using that, or we can look at moving into Fedora infra if there’s a preference for that. The good thing about rebuilderd is that there can be multiple independently managed deployments (and having multiple is actually useful, as you can confirm they all produce the same result).

Well, I think it might be good to run this outside Fedora infra, just to make it more independent/etc. Although of course we could run one and then others could run others and compare as you note.