We need to come up with a consistent approach for generating and publishing containers: both 'traditional' and atomic desktop containers, both stable and unstable releases

  • Describe the issue
    We have grown a bit of a mess around how we build and publish container images. We need to straighten it out.

Here’s the 10,000 foot view, as I understand it:

  • For unstable releases, we build both ‘traditional’ (generic, generic minimal, toolbox) and atomic desktop OCI containers in the nightly compose. We also build atomic desktop ostrees. When the compose completes, we run sync-latest-container-base-image.sh - which publishes the ‘traditional’ containers to registries - and sync-ostree-base-containers.sh - which converts the silverblue, kinoite and sericea ostrees to containers and publishes those to registries. We don’t actually publish the native atomic desktop OCI containers anywhere.

  • For stable releases, we have a Fedora-Container compose that builds ‘traditional’ containers and should publish them (only it doesn’t, because of the thing @kevin is fixing in PR#1267: f39: fix container-nightly.sh script to sync the right thing - pungi-fedora - Pagure.io). That compose does not build atomic desktop containers. Instead, Bodhi creates atomic desktop ostrees daily, which is how people get updates. But it does not produce native OCI containers, or run sync-ostree-base-containers.sh to convert the ostrees it creates into containers and publish those.

There are several problems here:

  1. We shouldn’t have two janky bash scripts for publishing containers to registries. We should have one tool in a sensible language (Python!) which can be properly tested. Also, it should use compose metadata to find the images (not weirdly hardcoded Koji searches, like the current sync-latest-container-base-image.sh does) - although this is a bit complicated if we’re building things in Bodhi, which doesn’t produce productmd metadata (AFAIK).
  2. Stable release ostree builds being off in Bodhi while everything else is in composes is a bit awkward, especially since we are trying to move away from ostrees towards native OCI containers for atomic desktops. Do we want to move more container builds into Bodhi, or move the stable release nightly ones out of Bodhi? Do we need to teach Bodhi to build OCI containers? Publish to registries?
  3. It would be good to have the ability to gate registry pushes. We can test all these images to some extent; it would be good to set things up such that we can gate publishing to the registry tags used to update user systems on test results.
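To make the "one tool using compose metadata" idea concrete, here is a minimal Python sketch that finds container images in productmd-style images.json metadata instead of running hardcoded Koji searches. The metadata shape here is simplified, and the type/format values are assumptions; a real tool would load the compose via the productmd library rather than a hand-written dict.

```python
# Sketch: locate container images from productmd-style images.json
# metadata instead of Koji searches.  The structure and the
# type/format values below are simplified assumptions.

CONTAINER_FORMATS = {"ociarchive", "tar.gz", "tar.xz"}  # assumed formats

def find_container_images(images_metadata):
    """Return (variant, arch, path) for every container-type image."""
    found = []
    for variant, arches in images_metadata["payload"]["images"].items():
        for arch, images in arches.items():
            for image in images:
                if (image.get("type") == "container"
                        and image.get("format") in CONTAINER_FORMATS):
                    found.append((variant, arch, image["path"]))
    return found

# Example with a hand-written metadata snippet (paths are made up):
sample = {
    "payload": {
        "images": {
            "Container": {
                "x86_64": [
                    {"type": "container", "format": "tar.xz",
                     "path": "Container/x86_64/images/Fedora-Container-Base-40.tar.xz"},
                    {"type": "boot", "format": "iso",
                     "path": "Container/x86_64/iso/boot.iso"},
                ]
            }
        }
    }
}

print(find_container_images(sample))
# → [('Container', 'x86_64', 'Container/x86_64/images/Fedora-Container-Base-40.tar.xz')]
```

The point is that the compose already tells us exactly which artifacts are containers; no guessing against Koji is needed.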

CC @kevin @siosm @walters @ngompa @davdunc

We shouldn’t be doing any builds in Bodhi. These should be in the compose process rather than there. Gating registry pushes should probably be there though…

@kevin what was the thinking behind putting the atomic desktop nightly ostree builds in bodhi? just curious if there’s an advantage to it we hadn’t thought of.

Thinking about doing gating via Bodhi…mmm. I dunno. I mean, we always had the idea that greenwave was meant to be a neutral service consumed by Things That Want To Do Gating, not just a feeder for Bodhi. In a way, if I’m just thinking about “how to sync container images to registries”, Bodhi doesn’t feel like an obvious part of the process. My natural thought I guess would be just to write a message consumer that can sync container images from composes, and set it up so it can just fire when the compose is complete, or fire in response to CI messages, and have it do the thing where it checks the gating status after every test then syncs if it’s ‘passed’.
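The decision logic such a consumer would need is small. A sketch, using only stdlib (the message topics and the shape of the Greenwave decision payload are illustrative assumptions; real message consumption would go through fedora-messaging, and the decision would come from a request to Greenwave's decision endpoint):

```python
# Sketch of a message-consumer gating check: on each compose-complete
# or CI-result message, consult the Greenwave decision and sync only
# when gating has passed.  Topic names and the decision dict shape are
# assumptions for illustration.

TRIGGER_TOPICS = {
    "org.fedoraproject.prod.pungi.compose.status.change",  # compose done
    "org.fedoraproject.prod.ci.compose.test.complete",     # hypothetical CI topic
}

def should_sync(topic, greenwave_decision):
    """Sync only for relevant messages whose gating decision passed."""
    if topic not in TRIGGER_TOPICS:
        return False
    return bool(greenwave_decision.get("policies_satisfied", False))

# In a real consumer, greenwave_decision would be fetched per compose;
# here we just exercise the logic:
print(should_sync(
    "org.fedoraproject.prod.pungi.compose.status.change",
    {"policies_satisfied": True},
))
# → True
```

The consumer would call this on every message and run the registry sync only when it returns True.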

Doing it via Bodhi I guess gives us some of that logic already ‘baked in’, though we’d have to translate and extend a few things, I think. But I think we’d be back at having to find the Koji builds for each container build (since Bodhi wants you to submit Koji builds as the update components), which is not something that’s in any metadata, so we’re back to the kind of dumb logic in the current shell script (only worse because we have to somehow make sure we find the task that matches the compose we’re trying to publish, I guess). meh.

  • Describe the issue
    We have grown a bit of a mess around how we build and publish container images. We need to straighten it out.

Here’s the 10,000 foot view, as I understand it:

  • For unstable releases, we build both ‘traditional’ (generic, generic minimal, toolbox) and atomic desktop OCI containers in the nightly compose. We also build atomic desktop ostrees. When the compose completes, we run sync-latest-container-base-image.sh - which publishes the ‘traditional’ containers to registries - and sync-ostree-base-containers.sh - which converts the silverblue, kinoite and sericea ostrees to containers and publishes those to registries. We don’t actually publish the native atomic desktop OCI containers anywhere.

Yeah, that seems correct.

  • For stable releases, we have a Fedora-Container compose that builds ‘traditional’ containers and should publish them (only it doesn’t, because of the thing @kevin is fixing in PR#1267: f39: fix container-nightly.sh script to sync the right thing - pungi-fedora - Pagure.io). That compose does not build atomic desktop containers. Instead, Bodhi creates atomic desktop ostrees daily, which is how people get updates. But it does not produce native OCI containers, or run sync-ostree-base-containers.sh to convert the ostrees it creates into containers and publish those.

Correct.

There are several problems here:

  1. We shouldn’t have two janky bash scripts for publishing containers to registries. We should have one tool in a sensible language (Python!) which can be properly tested. Also, it should use compose metadata to find the images (not weirdly hardcoded Koji searches, like the current sync-latest-container-base-image.sh does) - although this is a bit complicated if we’re building things in Bodhi, which doesn’t produce productmd metadata (AFAIK).

Agreed.

  1. Stable release ostree builds being off in Bodhi while everything else is in composes is a bit awkward, especially since we are trying to move away from ostrees towards native OCI containers for atomic desktops. Do we want to move more container builds into Bodhi, or move the stable release nightly ones out of Bodhi? Do we need to teach Bodhi to build OCI containers? Publish to registries?

Well, the reason it’s there is that bodhi is already calling pungi to
compose the rpm updates/updates-testing repos. Those are the very things
we need to make to then update the ostrees. So, moving this out of bodhi
seems like it adds complexity… it means we have to have something wait
for updates/updates-testing composes in bodhi to fully sync out, then
after whatever delay, call pungi and generate the thing that bodhi could
have in the same flow.

Additionally, if bodhi is doing both rpms and ostrees and there’s a
problem in one or the other, the compose fails and we can fix it and
resume or whatever. If they are disconnected processes, you might get,
say, rpms updating and ostree not, or vice versa.

So you would need something to coordinate… it just seems cleaner to
just do it in bodhi composes.

  1. It would be good to have the ability to gate registry pushes. We can test all these images to some extent; it would be good to set things up such that we can gate publishing to the registry tags used to update user systems on test results.

Agreed. Bodhi does have a ‘container’ flow from back when we were making
a number of containers. That flow is basically to push from koji →
candidate-registry for updates-testing, then copy from
candidate-registry to registry for stable updates.

So, perhaps we could just extend bodhi to detect when containers are
built, and for unstable updates auto-create an update, which then can
get testing, etc. For stable releases we would have to either get it to
also do that or have some kind of process to manually submit them.
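That candidate-registry → registry flow is easy to sketch. A minimal example (registry hostnames match the flow described above, but the image name is a placeholder; this only builds the skopeo command and does not run anything):

```python
# Sketch of the two-step registry flow: images land in a candidate
# registry for updates-testing, then get copied candidate -> stable on
# promotion.  Only the skopeo command is constructed here.

def promote_command(image, tag,
                    candidate="candidate-registry.fedoraproject.org",
                    stable="registry.fedoraproject.org"):
    """Build the skopeo copy command that promotes one image tag."""
    return [
        "skopeo", "copy",
        f"docker://{candidate}/{image}:{tag}",
        f"docker://{stable}/{image}:{tag}",
    ]

print(" ".join(promote_command("fedora-toolbox", "40")))
```

A sync tool would run one such command per image once the update goes stable.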

Additional items around this:

  1. We want to ‘move’ to quay.io. My plan was to make sure everything was
    at quay.io and then set registry.fedoraproject.org to just be a
    redirect. This allows us to control things in case we ever need to
    repoint it or move it back. I guess we also need a
    quay.io/fedora-candidate to replace our candidate registry.
    But when we do this we should really add some error checking in…if we
    can’t push to quay.io, it should error out.

  2. What’s our desired end state here, format-wise? I guess we want to
    move to OCI containers everywhere?
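On the error-checking point: the essential fix is to run the push in a way that a failure raises instead of being silently swallowed, which is where shell scripts tend to go wrong. A sketch (the skopeo arguments are placeholders, and the failure demonstration uses a stand-in command rather than a real registry push):

```python
import subprocess
import sys

# Sketch: run the registry push with check=True so a non-zero exit
# from skopeo aborts the sync instead of being ignored.

def push(source_ref, dest_ref):
    """Push one image; raises CalledProcessError on any failure."""
    subprocess.run(
        ["skopeo", "copy", source_ref, dest_ref],
        check=True,  # error out if we can't push to quay.io
    )

# Demonstrate the failure path with a stand-in failing command
# (a real test would use a real push against a test registry):
try:
    subprocess.run([sys.executable, "-c", "raise SystemExit(1)"], check=True)
except subprocess.CalledProcessError:
    print("push failed, aborting sync")
```

With check=True the caller has to handle the failure explicitly, so a broken quay.io push can never look like success.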


@kevin what was the thinking behind putting the atomic desktop nightly ostree builds in bodhi? just curious if there’s an advantage to it we hadn’t thought of.

Well, it’s a compose, no?
We need to compose new ostree commits from updates/updates-testing rpms
which we are right there calling pungi to create. Also we need to
coordinate them so if one fails the other does too.

So, I wouldn’t think of it as bodhi building anything, it’s just
composing.

Thinking about doing gating via Bodhi…mmm. I dunno. I mean, we always had the idea that greenwave was meant to be a neutral service consumed by Things That Want To Do Gating, not just a feeder for Bodhi. In a way, if I’m just thinking about “how to sync container images to registries”, Bodhi doesn’t feel like an obvious part of the process. My natural thought I guess would be just to write a message consumer that can sync container images from composes, and set it up so it can just fire when the compose is complete, or fire in response to CI messages, and have it do the thing where it checks the gating status after every test then syncs if it’s ‘passed’.

Yeah, we could, but without bodhi there’s not really a lot of visibility
there. Also, with bodhi we could even get users +1/-1ing…

Doing it via Bodhi I guess gives us some of that logic already ‘baked in’, though we’d have to translate and extend a few things, I think. But I think we’d be back at having to find the Koji builds for each container build (since Bodhi wants you to submit Koji builds as the update components), which is not something that’s in any metadata, so we’re back to the kind of dumb logic in the current shell script (only worse because we have to somehow make sure we find the task that matches the compose we’re trying to publish, I guess). meh.

yeah, there must be a better way. :wink:

I mean, bodhi can handle containers now, perhaps we should look at what’s
there already…?

This is likely out of scope (so feel free to ignore it), but it would be nice if there were a couple of additional tags for some of the existing containers such as “latest-1”? and “branched”?, that would (today) represent F38 and F40 for CI uses without having to explicitly name the versions. Right now I have to go in and manually update my workflows with the new numbers every six months (or so) for testing on supported (or soon to be beta/production) fedoras. Thanks for any consideration.
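To illustrate, here is a tiny sketch of how such alias tags could be derived from release numbers. The tag names are just the ones suggested in this post, and the inputs are illustrative; nothing here is an official scheme:

```python
# Sketch: derive moving alias tags ("latest-1", "branched") from the
# current release numbers so CI workflows never hardcode a Fedora
# version.  Tag names follow the suggestion above; nothing official.

def alias_tags(latest, branched=None):
    """Map alias tag names to concrete release tags."""
    tags = {"latest": str(latest), "latest-1": str(latest - 1)}
    if branched is not None:
        tags["branched"] = str(branched)
    return tags

# With F39 as latest and F40 branched, this matches the F38/F40 example:
print(alias_tags(latest=39, branched=40))
# → {'latest': '39', 'latest-1': '38', 'branched': '40'}
```

The sync tool would recompute this mapping each cycle and repoint the alias tags, so downstream workflows never need editing.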


Those are the very things
we need to make to then update the ostrees. So, moving this out of bodhi
seems like it adds complexity… it means we have to have something wait
for updates/updates-testing composes in bodhi to fully sync out, then
after whatever delay, call pungi and generate the thing that bodhi could
have in the same flow.

Okay…but then, why don’t we do the same for the other stable composes that run nightly? We have nightly Cloud and Container composes that are just run out of scripts like the branched and rawhide nightlies. I just can’t quite figure out the overall organizing principle here :smiley: If Bodhi can run composes, should we have Bodhi run…all the composes?

In the past there was a ‘container sig’ that was going to produce
containers of a bunch of applications. Since not many people wanted to
do that and our container build system was a horror, all those people
wandered off, although there’s a faction that produces container
images on quay.io. (They even build them there.)

So, when that was supposed to be a thing, you couldn’t just build
containers at updates/updates-testing time normally, you wanted to let
maintainers build them whenever and make updates in bodhi, etc.

We built a daily update to the base container, but didn’t push it
anywhere without a human saying ‘ok, this one is good, I tested it,
let’s push it out’. (Which is a lot of waste, to make them every day
and only use one every month or so, but anyhow…)

At least that’s what I can recall. :slight_smile:

I think it might make sense to just have bodhi compose them all.
How would that work though for updates/updates-testing and gating?
And what would happen if, say, the toolbox container with updates
composed fine, but failed some gating tests? And right now we just have
the one ‘stream’, no updates-testing versions anyhow. If we added those,
that’s more artifacts. :wink:

Also, if we do the tests in the bodhi updates compose path, that makes
them take somewhat longer, probably?

Also, another wrinkle… the fedora base/minimal/toolbox containers
probably don’t change a lot of the time… but yet we recompose them.
I’m not sure on the atomic desktops side: if we recompose them and the
rpms didn’t change, do the ostree commits we push change?

Ideally all of these would just happen if something changes, but that’s
a pretty hard problem.

We want to only have OCI containers in the end, but it’s likely going to take a few releases until we get there. So this is how I think we should proceed:

  • Remove the current ostree-to-container conversion script and replace it with another (unified) script that pushes the new ostree native container images to quay.
  • Add Atomic composes to Bodhi as well and sync those to quay.
  • Then we need to start the transition from ostree remotes to containers on end-user systems.
  • Let at least 2 releases go by, then remove the ostree composes.
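The first step (a unified script pushing the ostree native container images to quay) could be sketched roughly like this. The quay namespace, variant names, and archive paths are placeholder assumptions, and only the skopeo commands are constructed, not executed:

```python
# Sketch of a unified sync: one skopeo copy per atomic desktop
# variant, from the compose's native container artifact to quay.io.
# Namespace, variant names, and paths are placeholders.

QUAY_NS = "quay.io/fedora"  # assumed destination namespace

def sync_commands(variant_archives, release):
    """Build one skopeo copy command per variant archive."""
    cmds = []
    for variant, archive in variant_archives.items():
        cmds.append([
            "skopeo", "copy",
            f"oci-archive:{archive}",
            f"docker://{QUAY_NS}/fedora-{variant}:{release}",
        ])
    return cmds

for cmd in sync_commands({"silverblue": "Fedora-Silverblue-40.ociarchive"}, 40):
    print(" ".join(cmd))
```

One script with one code path replaces the two shell scripts, since the native container images need no ostree-to-container conversion step at all.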

Yes, I agree this would be nice. I’ll look at it after we’ve setup the initial sync.


Thanks for the future consideration (it was all I was hoping for).

btw, thinking about it a bit more, rather than “branched”, perhaps the name should be “next”, which would point to rawhide until branch point, then to the branched variant until final release, and then back to rawhide when latest gets updated (I don’t know how hard that would be to accomplish). However, I don’t really care about the name(s) (although I am sure someone does); I would just like the name to be mostly stable and usable across my workflows.

Thanks.

We want to only have OCI containers in the end, but it’s likely going to take a few releases until we get there. So this is how I think we should proceed:

ok

  • Remove the current ostree-to-container conversion script and replace it with another (unified) script that pushes the new ostree native container images to quay.
  • Add Atomic composes to Bodhi as well and sync those to quay.

Sounds ok. So in that plan, though, we don’t do any CI/gating/testing on
them? Just assume that the packages should be ok since they are going
out…

I suppose we could sync all of them to a candidate area and then have
testing/CI on them and it promotes them to release? But then we get into
the updates not being in sync with the ostrees, so that’s not a great
thing.

  • Then we need to start the transition from ostree remotes to containers on end-user systems.
  • Let at least 2 releases go by, then remove the ostree composes.

Sure, to quote my fav superhero: Don’t be hasty. :slight_smile:

I think we should do CI gating after the initial setup is in place.

We can start by pushing all composes under their own tag (i.e. 40.20240326.0) and automatically update the latest/fXY tags.

Then we can add CI gating to the latest/fXY tag updates to only push them if the compose passes CI/openQA.
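That tagging scheme can be sketched as a small helper. The tag formats follow the 40.20240326.0 example above, and the gating flag is a stand-in for a real CI/openQA result:

```python
# Sketch: every compose gets its own immutable date tag; the moving
# latest/fXY tags only follow it when gating passed.  Tag formats
# follow the example in the discussion; gating is a boolean stand-in.

def tags_to_push(release, compose_date, respin, gating_passed):
    """Return the tags to publish for one compose."""
    versioned = f"{release}.{compose_date}.{respin}"
    tags = [versioned]            # always keep the per-compose tag
    if gating_passed:             # only move user-facing tags on pass
        tags += [f"f{release}", "latest"]
    return tags

print(tags_to_push(40, "20240326", 0, gating_passed=True))
# → ['40.20240326.0', 'f40', 'latest']
```

A failed gating run still leaves the date tag in the repo for debugging; only the tags user systems follow stay pinned to the last good compose.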

Ideally, the repo on Quay would look more like fedora-ostree-desktops/silverblue (where we can find all composes by date) rather than what fedora/fedora-silverblue is right now (where you only have the latest tag, which makes it hard to diagnose issues with builds or regressions).

Yeah. I like that idea… definitely more clear.


I’m only realizing this now, but doing the same with “application” container images would also be nice. The images are much smaller so the storage cost would be low. Ideally we would rebuild those more regularly than we build them now.

I’m not sure how / when we rebuild those container images now.

Actions speak louder than words but I am going to try to do what I can to push for Red Hat to apply more resources to container-based infrastructure in Fedora. We will see what happens from that.

A notable sub-thread in this is that Fedora CoreOS maintains a custom Jenkins pipeline to build containers too.