Gating Fedora updates on Fedora CoreOS CI

Hi all,

Last year, the Fedora CoreOS working group implemented CI testing [1] for Bodhi updates on a set of critical packages [2]. Automatic updates are a key feature of Fedora CoreOS, and this testing helps us detect update-related issues early, improving Fedora’s update stability and reducing troubleshooting time.

While our long-term goal is to implement this CI testing in fedora-bootc with Bodhi gating integration, there’s still significant work ahead before we can trigger fedora-bootc tests on Bodhi updates. It’s worth noting that many of the tests currently running in Fedora CoreOS CI are essentially “image mode” tests rather than CoreOS-specific tests. Eventually, we expect to migrate these tests to fedora-bootc. However, until that infrastructure is ready, enabling gating on the FCOS suite provides immediate image mode coverage for critical packages.

Given our experience running these tests, we would like to propose making the coreos.cosa.build-and-test test a required gate for package updates in rawhide. We’ve already been successfully gating packages owned by the Fedora CoreOS working group [3], and we’d like to extend this requirement to the broader package set defined here [4].

The following is a breakdown of passed vs. failed builds by package across more than 400 builds, which should give package maintainers an idea of how often an update might be gated. Note that not all test failures here are related to the software in the proposed Bodhi update: some are flakes, caused either by the test infrastructure environment or by a transient test pipeline misconfiguration. Where a failure is not related to the update, it would be easy to waive the test or coordinate with the Fedora CoreOS working group to disable it.

| Package | Green Builds | Red Builds | Total | Green % | Red % |
|---|---|---|---|---|---|
| kernel | 74 | 15 | 89 | 83.15% | 16.85% |
| selinux-policy | 26 | 10 | 36 | 72.22% | 27.78% |
| systemd | 11 | 8 | 19 | 57.89% | 42.11% |
| podman | 17 | 7 | 24 | 70.83% | 29.17% |
| glibc | 37 | 6 | 43 | 86.05% | 13.95% |
| ostree | 9 | 4 | 13 | 69.23% | 30.77% |
| rpm-ostree | 14 | 3 | 17 | 82.35% | 17.65% |
| rust-zincati | 2 | 3 | 5 | 40.00% | 60.00% |
| rust-bootupd | 6 | 3 | 9 | 66.67% | 33.33% |
| NetworkManager | 9 | 3 | 12 | 75.00% | 25.00% |
| makedumpfile | 1 | 3 | 4 | 25.00% | 75.00% |
| dracut | 3 | 2 | 5 | 60.00% | 40.00% |
| openssh | 7 | 2 | 9 | 77.78% | 22.22% |
| coreutils | 5 | 2 | 7 | 71.43% | 28.57% |
| buildah | 13 | 1 | 14 | 92.86% | 7.14% |
| util-linux | 4 | 1 | 5 | 80.00% | 20.00% |
| nmstate | 3 | 1 | 4 | 75.00% | 25.00% |
| glib2 | 10 | 1 | 11 | 90.91% | 9.09% |
| nbdkit | 0 | 1 | 1 | 0.00% | 100.00% |
| container-selinux | 7 | 1 | 8 | 87.50% | 12.50% |
| toolbox | 19 | 1 | 20 | 95.00% | 5.00% |
| emacs | 1 | 1 | 2 | 50.00% | 50.00% |
| grub2 | 10 | 0 | 10 | 100.00% | 0.00% |
| ignition | 8 | 0 | 8 | 100.00% | 0.00% |
| moby-engine | 9 | 0 | 9 | 100.00% | 0.00% |
| rust-coreos-installer | 3 | 0 | 3 | 100.00% | 0.00% |
| kdump-utils | 8 | 0 | 8 | 100.00% | 0.00% |
| checkpolicy | 1 | 0 | 1 | 100.00% | 0.00% |
| kexec-tools | 2 | 0 | 2 | 100.00% | 0.00% |
| NetworkManager-sstp | 1 | 0 | 1 | 100.00% | 0.00% |
| skopeo | 1 | 0 | 1 | 100.00% | 0.00% |
| containers-common | 10 | 0 | 10 | 100.00% | 0.00% |
| rust-afterburn | 3 | 0 | 3 | 100.00% | 0.00% |
| TOTAL | 334 | 79 | 413 | 80.87% | 19.13% |
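
For anyone who wants to re-derive the percentage columns from the raw counts, here is a minimal Python sketch. The function name `pass_rate` is illustrative only and is not part of any CoreOS CI tooling; the counts are copied from three rows of the table above.

```python
def pass_rate(green: int, red: int) -> float:
    """Return the share of green builds as a percentage, rounded to 2 places."""
    total = green + red
    if total == 0:
        raise ValueError("no builds recorded")
    return round(100 * green / total, 2)


# (green, red) counts copied from the table; only a few rows shown.
rows = {
    "kernel": (74, 15),
    "systemd": (11, 8),
    "TOTAL": (334, 79),
}

for name, (green, red) in rows.items():
    print(f"{name}: {pass_rate(green, red):.2f}% green")
```

With a small sample (for example nbdkit or emacs, with one or two builds), the percentage says very little, so the absolute counts are the more useful column for those packages.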

References:

[1] Jenkins

[2] coreos-ci/bodhi-testing.yaml at main · coreos/coreos-ci · GitHub

[3] coreos-ci/bodhi-testing.yaml at main · coreos/coreos-ci · GitHub

[4] coreos-ci/bodhi-testing.yaml at main · coreos/coreos-ci · GitHub


Can you clarify, is this proposal just for the packages in the list you gave, or is it for all packages?

If it’s all, then can you explain what we need to be doing to ensure that 20% or more of our future builds don’t wind up getting rejected? Assume that we have never used CoreOS and know nothing about it…

This would be only for the packages listed there.

A few things about the 20% number:

  • The tests for kernel + selinux-policy + systemd + podman (50% of the failures) have caught regressions in the past. Those regressions landed in Fedora because the tests were not gating, and they affected all Fedora users, not just Fedora CoreOS users.
  • The tests for ostree, rpm-ostree, rust-zincati, rust-bootupd (16% of the failures) are for packages generally managed by the CoreOS team.
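
The two shares quoted above can be checked against the red-build counts in the original table. This is a rough sketch only: the dictionary and the helper name `failure_share` are made up for illustration, and the counts are copied from the table (79 red builds in total).

```python
# Red-build counts copied from the table in the original post.
red_builds = {
    "kernel": 15, "selinux-policy": 10, "systemd": 8, "podman": 7,
    "ostree": 4, "rpm-ostree": 3, "rust-zincati": 3, "rust-bootupd": 3,
}
TOTAL_RED = 79  # TOTAL row of the table


def failure_share(packages: list[str]) -> float:
    """Percentage of all red builds attributable to the given packages."""
    return 100 * sum(red_builds[p] for p in packages) / TOTAL_RED


core = failure_share(["kernel", "selinux-policy", "systemd", "podman"])
coreos_owned = failure_share(["ostree", "rpm-ostree", "rust-zincati", "rust-bootupd"])

print(f"kernel/selinux-policy/systemd/podman: {core:.1f}% of failures")
print(f"CoreOS-managed packages: {coreos_owned:.1f}% of failures")
```

That works out to 40 and 13 of the 79 red builds respectively, which matches the rounded 50% and 16% figures.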