I’ve been hitting kernel-install script failures consistently for months, and have been trying to track the issue’s progress since I first ran into it. I’ll avoid duplicating information that already exists, so here are the relevant references:
The main bug report appears to be 2274356 opened on 2024-04-10
A solution has been proposed in 2276271 on 2024-04-21
I absolutely love that Fedora does so much work upstream and 100% understand that upstreaming changes is a lot of work that cannot and should not be rushed.
Furthermore, Fedora’s project management is complex, with many more processes than other distros I’ve used in the past, and I’m not yet fully familiar with all of them.
In this case, someone appears to have already put in the time to investigate, found what appears to be a logic error in the code (a variable defined but never referenced), written and tested a fix, and provided a patch to apply it.
So I’m just trying to understand what (if any) process exists in Fedora Project that hampers or discourages downstream “quick-fixes” that could alleviate pain-points for users.
In general we try to stay as close to upstream as possible:
However, that’s a recommended guideline and in many cases, it may be more appropriate to carry “downstream only” fixes.
I cannot comment on the particular case you’ve noted here. Have the maintainers explicitly noted that they’ll wait for the fix to land upstream, for example?
A solution has been proposed in 2276271 on 2024-04-21
I read that bug, and it has had activity in the last couple of days.
The solution proposed in April seems to have issues upstream, if I followed the discussion correctly.
What you are saying could very well be true.
I’m not questioning that.
What I’m mostly trying to understand is whether, and if so why, Fedora’s processes encourage upstream work to the point of hindering the healthy operation of active user systems. Is it really a reasonable stance, and healthy for the project, that upstreaming takes priority over the operability of active systems?
Wouldn’t it be more reasonable if steps in tackling issues looked a bit more like:
1. Investigate
2. Deploy a downstream quick fix (this only needs to consider Fedora system operation)
3. Work on resolving the underlying upstream issues (which are far more complex and wider in scope)
Edit:
By the way, my understanding is, the upstream issues are with dracut-102.
Today on F40 we are running:
>> dnf list installed dracut
Installed Packages
dracut.x86_64 101-1.fc40 @updates
Which again shows more interest in upstreaming (which is fantastic) but less care for downstream (which is not as fantastic).
I read the linked bugs, and as I understand it, the maintainer’s position is “I will concentrate on handling the tricky version update that will also fix this issue. If somebody wants to have a downstream quick fix patch, please submit a PR.” This is based on this comment, which even explains where and how to submit one. So I don’t think this is about a policy, but about people spending their precious time on what they see as most important, and letting others spend theirs as they see fit.
Probably the best way to get the issue fixed is to go and submit that PR.
If that’s the case and no specific QA processes exist in place for this, then that’s indeed completely understandable and of course I respect how everyone wants to manage their own precious free time.
My train of thought was something along the lines of:
An issue like this would have possibly been a freeze exception and assigned some priority had it been near a release (or I could be wrong?).
Since Fedora has such well-structured QA around its releases, there are probably QA processes that run during a release’s lifetime.
If there indeed exist running QA processes, then one/some of those processes must be blocking downstream fixes in favor of upstreaming.
With the information provided by your answer, I suppose I was just misguided by false assumptions.
What you are saying could very well be true.
I’m not questioning that.
What I’m mostly trying to understand is whether, and if so why, Fedora’s processes encourage upstream work to the point of hindering the healthy operation of active user systems. Is it really a reasonable stance, and healthy for the project, that upstreaming takes priority over the operability of active systems?
I think you just hit a combination of things here: this package is
right in the middle of a transition to a fork, so the maintainers are
focused on that; systemd-boot isn’t a default configuration, so the bug
likely affects only the small number of folks who have specifically
enabled it; and everyone is busy.
Wouldn’t it be more reasonable if steps in tackling issues looked a bit more like:
1. Investigate
2. Deploy a downstream quick fix (this only needs to consider Fedora system operation)
3. Work on resolving the underlying upstream issues (which are far more complex and wider in scope)
I think that’s the more usual workflow; it’s just that in this case
the maintainers have a lot of other things going on, so step 2 isn’t
something they have time for.
If that’s the case and no specific QA processes exist in place for this, then that’s indeed completely understandable and of course I respect how everyone wants to manage their own precious free time.
My train of thought was something along the lines of:
An issue like this would have possibly been a freeze exception and assigned some priority had it been near a release (or I could be wrong?).
Anything could be proposed as a freeze exception, so sure it could have
been.
Since Fedora has such well-structured QA around its releases, there are probably QA processes that run during a release’s lifetime.
If there indeed exist running QA processes, then one/some of those processes must be blocking downstream fixes in favor of upstreaming.
With the information provided by your answer, I suppose I was just misguided by false assumptions.
There are processes in place for updates, i.e. if an update causes a
problem, it will be flagged and may not be pushed out as an update.
But that’s of course subject to what the update fixes, what items are
broken, etc.
Just to be clear, apart from the kernel, which does not allow out-of-tree patches, there are no policies or processes in Fedora forbidding downstream patching. I would say the guidance, already linked above, boils down to:
1. Do not patch if you do not have to
2. If you actually have to patch, use upstream patches if possible (so merged but unreleased commits or pull requests, or such)
3. If you really write a new patch, do submit it upstream as well, if possible
4. But if there really is no reasonable solution that upstream would take, downstream patching is allowed.
It is true that the third item sometimes slows down fixes, because it asks maintainers to avoid Fedora-specific shortcuts that upstream would not accept. But that is not the case here: apparently, upstream already has a non-backportable fix in the latest version, and does not support older releases with bugfixes. So this case falls through to the fourth item, and is just waiting for somebody to do the work and go through the processes for getting a new package release out.
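For what it’s worth, the fall-through reading of that guidance can be sketched as a small decision function. This is only an illustration of the four items as summarized above, not an official Fedora policy or tool; the names `PatchDecision` and `choose_patch_strategy` and the three boolean inputs are hypothetical and were made up for this sketch.

```python
from enum import Enum


class PatchDecision(Enum):
    # Hypothetical labels mirroring the four guidance items summarized above.
    NO_PATCH = 1
    USE_UPSTREAM_PATCH = 2
    NEW_PATCH_ALSO_SENT_UPSTREAM = 3
    DOWNSTREAM_ONLY_PATCH = 4


def choose_patch_strategy(must_patch: bool,
                          upstream_patch_available: bool,
                          upstream_would_accept: bool) -> PatchDecision:
    """Walk the guidance items in order and return the first one that applies."""
    if not must_patch:
        # Item 1: do not patch if you do not have to.
        return PatchDecision.NO_PATCH
    if upstream_patch_available:
        # Item 2: prefer an existing upstream commit or pull request.
        return PatchDecision.USE_UPSTREAM_PATCH
    if upstream_would_accept:
        # Item 3: write a new patch, and submit it upstream as well.
        return PatchDecision.NEW_PATCH_ALSO_SENT_UPSTREAM
    # Item 4: no reasonable solution upstream would take, so patch downstream.
    return PatchDecision.DOWNSTREAM_ONLY_PATCH


# The case discussed in this thread, as described above: a fix is needed,
# the upstream fix is not backportable, and upstream does not maintain
# the older release.
decision = choose_patch_strategy(
    must_patch=True,
    upstream_patch_available=False,
    upstream_would_accept=False,
)
print(decision.name)
```

With those inputs the function lands on the last branch, which matches the point that nothing in the guidance blocks a downstream patch here.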
I would say it is quite common for Fedora packages to have some patches that refer to open upstream pull requests, submitted by the Fedora maintainer, and for those PRs to eventually get merged, so that Fedora patches can be removed when the next version is out.
Part of the problem is caused by the special Fedora/Red Hat way of handling grub updates. So should dracut upstream make special considerations for Fedora, or should Fedora just adjust to upstream? The latter approach is not particularly difficult, but it needs cooperation from the Fedora Grub team.