Make Fedora Respins the main method of getting Fedora ISO images

I know. But I want to link to the Respins download link, at least in “alternative downloads”, marked as “unofficial”. It would make Respins more popular than before.

From the QA perspective, running it through openQA is free - it happens automatically (assuming the composes are generated by Pungi as usual). I’d imagine it’d be a bit more immediate work for releng and websites folks. For QA the trickier bit would be that we’d kinda have to care about the results. I’m going to see them, and if there are failures, then what? Do we follow the normal release process with blockers and reviews and all of that jazz? Do we just wing it? Do we just put the images up with a link to the results and say ‘this is the image, these are the test results, do as you see fit’? Do we only publish images that pass all the tests, or a chosen subset?

edit: note, if we only generate live images, then a lot of openQA tests wouldn’t be run (unless I do rather a lot of work and reinvent something I just uninvented to make the scheduling code simpler) - most of the tests of various installer capabilities run on the Server DVD, not on live images. The tests for live images only do a basic install test, they are more focused on desktop functionality post-install.


Well, as is, those problems hit users and end up in Ask Fedora. That’s not really better, more like the kind of better where I joke that we should send @kparal on vacation on release week so he stops finding bugs.

Our process (thank you, Adam and Kamil and all other heroes of QA) demonstrably increases quality over time, but still, after release day it’s a gamble as to whether applying updates will introduce exciting new problems along with fixes. If it’s relatively easy to expand the automation already in place to at least give us more information, that seems like a step in the right direction.

But I also wouldn’t want to put more load on the QA team and take away that between-releases (somewhat) calm. I definitely don’t think we should do anything like the release blocker process.

If we can do so reasonably, I think we should publish any images that pass the same set of tests the release-day images do. As a first pass, this would be more tested than the current respins, and also could be somewhere to look when people report issues.
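The “publish only what passes the release-day test set” idea could be sketched roughly like this. All test names and the results structure here are invented for illustration; real results would come from the openQA API:

```python
# Hypothetical sketch: gate respin publication on the release-day test set.
# The gating test names and result values below are made up; they stand in
# for whatever subset of openQA tests the release images are gated on.

RELEASE_GATING_TESTS = {
    "install_default",
    "desktop_browser",
    "base_services_start",
}

def should_publish(results: dict) -> bool:
    """Publish only if every release-gating test has a passing result."""
    return all(results.get(test) == "passed" for test in RELEASE_GATING_TESTS)

results = {
    "install_default": "passed",
    "desktop_browser": "passed",
    "base_services_start": "passed",
    "extra_optional_test": "failed",  # non-gating failures don't block
}
print(should_publish(results))  # True
```

Non-gating failures would still show up in the published results; they just wouldn’t block the image.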

As a next step, when there is a test failure, we’d automatically back out the updates that caused it. Not counting flakes in the test system itself, that’s either:

  • a regression — which should be fixed by the packager — or
  • a big enough change that the test needs updating — in which case the packager should help update the test, or given the updates policy, maybe the update should wait for the next release (and the test updated in Rawhide).

Ideally we would block just the offending update, but I’m kind of leaning towards: it’s better than nothing to block the whole batch if that’s where we’re at. Then packagers pushing those updates should (he said, full of naïveté and hope) work together to identify the specific problem.
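A minimal sketch of that fallback logic, assuming hypothetical update IDs and a made-up attribution record (in practice, attributing a failure to a specific update is the hard part):

```python
# Hypothetical sketch of "block the offending update, else the whole batch".
# Update IDs, test names, and the `blamed` mapping are invented for illustration.

def updates_to_back_out(batch, failed_tests, blamed):
    """Return the set of updates to pull from the respin.

    If every failing test can be blamed on a specific update, back out only
    those updates; otherwise fall back to backing out the entire batch.
    """
    to_back_out = set()
    for test in failed_tests:
        culprit = blamed.get(test)
        if culprit is None:
            # No attribution: better than nothing to block the whole batch.
            return set(batch)
        to_back_out.add(culprit)
    return to_back_out

batch = ["FEDORA-2024-aaaa", "FEDORA-2024-bbbb", "FEDORA-2024-cccc"]
print(updates_to_back_out(batch, ["install_default"],
                          {"install_default": "FEDORA-2024-bbbb"}))
# {'FEDORA-2024-bbbb'}
```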

Uh. Are you aware we do a lot of that already? We already automatically test all critical path updates, and they are gated on failures. If the tests fail the update can’t go stable.

We soft gate Rawhide these days, even - if a Rawhide update breaks stuff I either fix it before the next compose or get nirik to untag it. We just don’t push stuff that breaks tests any more, practically speaking.

The difference with respins is that updates can break the compose and install process. We do actually test this, but only minimally (openQA builds and does a basic install test with an Everything netinst, a KDE live and a GNOME live, but it doesn’t build anything else and for updates it doesn’t run the more extensive installer test suite we run on composes).

I don’t think it’s really right to say that updates are “a gamble” these days, at least for core operations.

Also of note, we do actually test the current respins in openQA. Here are the test results for the last respin, for instance. The agreement between me and the respin maintainer is that the tests get run and it’s up to him what to do with the results; I do look at them and briefly mention what the cause of any failure is to him, if I get time to figure it out.

additional: as you can see there, the compose test suite for a Workstation live image is 38 tests. For an update we run 14 Workstation tests (the test that builds a Workstation live, plus 13 tests from the 38 that we run on composes).

Oh, yes, definitely! That’s why I’m hoping it’s not such a big stretch.

That is probably too harsh. I’ve been spending a lot of time reading what’s going wrong for people, which probably gives me a negative bias.

Oh, cool — that I didn’t know.

I knew this for rawhide, because we talked about it. Although we really need to scale up the process — it really needs to be on the person who pushed the update, and failing that a wider group. On general principle, let alone the siren song of the llama farm.

We talked about this too — we probably need a better definition of “critical path”, and some broader layers. And, if some package is supposed to be outside that path and yet breaks a test…

Using the netinst will install the latest packages.

I just wish the options offered by the netinst mirrored those of the different editions/spins 100%.

Anyway. Fundamentally, if you want to do this, we can probably do it. It’s just a question of defining the parameters and seeing if websites and releng are OK with it.


Maybe this is wishful thinking, but my guess is that most of the things that might be caught in this way will hit us later one way or another, and if we can spread out the load beyond QA (and beyond you and Kevin!) that’ll be net positive.

And we can finally make Linus Torvalds happy! (I’d link to his comments, but I think they were on Google+…)

The category of things I’m worried about is ‘updates that break something in the installer which we don’t test in the update tests’. It’s not unmanageable, it’s just a thing that is likely to happen and which we would have to deal with.


(let’s try an email reply to two posts on Discourse; will it work? Let’s see!)

Maybe this is wishful thinking, but my guess is that most of the things that might be caught in this way will hit us later one way or another, and if we can spread out the load beyond QA (and beyond you and Kevin!) that’ll be net positive.

I fear that wouldn’t happen. I mean, it would be nice to spread the load of rawhide and updates troubleshooting, but that’s not happened yet?

And we can finally make Linus Torvalds happy! (I’d link to his comments, but I think they were on Google+…)

Ha. Ah… memories!

The category of things I’m worried about is ‘updates that break something in the installer which we don’t test in the update tests’. It’s not unmanageable, it’s just a thing that is likely to happen and which we would have to deal with.

Yeah, so let’s step back here (and sorry, this might be long):

What does this gain us? I see two things:

  1. Users don’t have to download an image and then download and apply a bunch of updates (i.e., we are saving some download bandwidth here).

  2. We may produce an image that has a newer kernel/bootloader/installer that enables some new hardware or fixes some common bug.

(did I miss any?)

The first one is nice and all, but once you install, you are still going to be downloading a bunch of updates moving forward, right?

The second one also could be nice, but since we release every 6 months and the older release is still available, in practice I am not sure how often that sort of thing happens.

Now, what would it take to do this:

  1. The respins SIG is awesome and does great work, but they make images on an AWS instance we provide, on whatever schedule they like. If we were going to advertise these as our official images, I would really want them to be made in Koji and have logs and tracking and such. So, that would take more CPU, more disk, etc.

  2. Process. I don’t like process for its own sake, but there’s a reason we have process for releases. We would probably need something for this, to answer questions like: Bug/security update X just appeared, should we redo this upcoming release? Or skip it? Bug XYZ isn’t going to be fixed quickly, do we cancel this release or delay it? Someone has to look for issues and propose them (I guess we could reuse the blocker bug setup, but would need to adjust for that). We would need to communicate and coordinate all the various groups.

  3. Maintainers of boot chain stuff would be ‘on the hook’ more. Right now anaconda/lorax/grub2 folks know they need to be available for release-blocking bugs in the run-up to releases. This could mean that they have to make sure they have cycles to handle those at any time.

Anyhow, I agree we could do it, but I think it would take a fair bit of work and I am not sure the gain is enough. :slight_smile:

I’m of course only speaking for myself here… if FESCo/Council decided this was something that was important to do, we could try and do it.

Yeah, that’s a good summary. To me there’s a problem that’s kinda… if we want to make these images more important than the original release images, we need to throw a bunch of process around it to make sure we don’t screw anything up. If we want to avoid all the process, it would be kinda risky to promote these images over the original release images. So, how do we want to square that circle?

Kevin Fenzi wrote:

What does this gain us? I see two things:

(did I miss any?)

I’m primarily thinking of a third thing: increased automated integration testing of updates will improve the experience of end-users, by helping us catch bugs before they become issues.[1]

I hope it will also help spread the problem-fixing load more to packagers. This is related to a conversation with @dustymabe about what they do in CoreOS, where they actively manage and bug-fix their update streams — and where they’d also like help from packagers pushing updates which fail their tests.

Adam Williamson wrote:

If we want to avoid all the process, it would be kinda risky to promote these images over the original release images. So, how do we want to square that circle?

Tepid change for the somewhat better!

We don’t need to have all the process there at once. I think there is value in measuring (that is, running the tests, seeing the problems) even if we don’t promote the images.

How can we make it so you don’t feel like you need to fix problems that such testing might reveal?


  1. and issues before they become problem reports ↩︎

Kevin Fenzi wrote:

What does this gain us? I see two things:

(did I miss any?)

I’m primarily thinking of a third thing: increased automated integration testing of updates will improve the experience of end-users, by helping us catch bugs before they become issues.[1]

I guess so. I think we do a lot already though…

I hope it will also help spread the problem-fixing load more to packagers. This is related to a conversation with @dustymabe about what they do in CoreOS, where they actively manage and bug-fix their update streams — and where they’d also like help from packagers pushing updates which fail their tests.

I don’t know if we block on the openQA tests yet, but we could?

Adam Williamson wrote:

If we want to avoid all the process, it would be kinda risky to promote these images over the original release images. So, how do we want to square that circle?

Tepid change for the somewhat better!

We don’t need to have all the process there at once. I think there is value in measuring (that is, running the tests, seeing the problems) even if we don’t promote the images.

How can we make it so you don’t feel like you need to fix problems that such testing might reveal?

Some group of people stepping up saying they would investigate problems with this? :slight_smile: If we just say we are going to do it, then I suspect it would just end up being Adam and me doing it, as we do now…

But again, I’m not sure this is all worth it. :slight_smile:


  1. and issues before they become problem reports ↩︎

I guess I’m missing a logical connection here: what is the relation between doing official respins for public consumption and increased automated integration testing of updates? OK, doing official respins more or less requires us to do a bit more of one specific type of integration testing: testing that updates don’t break the compose and deployment process. But we would only need to do that because we were providing updated images. If we don’t provide updated images, those problems are never going to get “exposed” to end users.

We can do more automated testing of updates in the contexts that they might cause problems for end users without doing official respins. It’s just a question of resources and time to investigate failures. (I’m actually planning to look at extending the set of tests we run on updates already, it’s just another thing on the list of things to get around to - the work on critical path groups was part of this too, to let us be more focused in exactly what tests run on what updates.)


Could you elaborate, please? I’m interested in what differences you have encountered using the netinstall ISO.

I’m not intending to hijack the discussion here. But one of the
solutions to avoiding a large download of the Live ISO and subsequent
large download of updates[1] (if I understood the OP’s issue correctly)
is using a netinst ISO. The netinst ISOs are, imho, a bit undervalued[2]:


  1. especially true when updating late in the release cycle ↩︎

  2. for lack of finding a better term at closing in on 02:00 AM local time ↩︎

I am trying to install via the netboot ISO and the Workstation ISO to be sure. I want to verify whether my past impression is still valid with F38 or not.

Just got this issue:

For comparison, with Everything-netinst, Anaconda gets past that task without issues.

I had a look at the issue. Will leave a comment there.