Idea: Rescue Kernel as current LTS kernel

This idea consists of 3 parts; combined, I think they could greatly improve the stability of Fedora, with a focus also on the atomic desktops.

Background

We see a lot of regressions with the very recent kernel versions that Fedora ships. This can likely never be covered by QA, as the issues may not only be hardware specific but also depend on 3rd party software that people use.

There are only 2 options

  • build, test and ship the latest kernel version to stay secure and up to date
  • use the LTS kernel which gets security fixes backported

Fedora is “leading edge” and takes the 1st approach, which often impacts usability.

I know that my GrapheneOS/Android phone will always work. I am not so sure about my Fedora laptop.

Fedora is not intended as a beta-tester distro, so there needs to be a way to have the LTS kernel.

LTS/longterm kernel

The official Linux LTS kernel is the one with the most upstream support, while also being stable and reliable.

It has a support period of 2 years (roughly 4 Fedora releases), and is released roughly once a year (2 Fedora releases).

With the drop to 2 years of support, Google (Android) and likely many more stakeholders will invest way more time in testing and fixing the current LTS kernel.

This means its quality and stability should improve even more.

Kernel versions

Fedora with dnf (mutable) has 3 kernel versions afaik.

  • rescue kernel
  • previous version
  • current version

While there are no system snapshots, this ensures that there will be a bootable kernel.
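
As far as I know, the number of regular kernels kept on a dnf-based install comes from the `installonly_limit` option in /etc/dnf/dnf.conf (dnf's default is 3), with the rescue entry kept separately. A minimal sketch for checking the value; the function name is made up for illustration and is not part of dnf:

```shell
# Hypothetical helper: print installonly_limit from a dnf config file.
# Prints "unset" when the key is absent (dnf then uses its built-in default).
installonly_limit() {
    conf=${1:-/etc/dnf/dnf.conf}
    value=$(sed -n 's/^[[:space:]]*installonly_limit[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1)
    echo "${value:-unset}"
}

# Usage:
#   installonly_limit      # reads /etc/dnf/dnf.conf
#   rpm -q kernel-core     # lists the kernel versions currently installed
```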

Atomic versions don’t have such versioning. There are snapshots of the system:

  • previous system
  • current system
  • (optional) x pinned deployments

This is problematic, as there can be cases where there is no older backup kernel version.

LTS Kernel as backup and rescue

I am not sure what version the rescue kernel is. But afaik it is some random and unpatched kernel version that happened to be the latest stable at the time of the Fedora release and shipped with the Anaconda installer.

This is pretty strange, as the rescue kernel should probably not be used? This also sounds like it makes old Anaconda ISOs insecure.

Why not have a kernel-longterm package and install it alongside, so it can be used as the rescue kernel on atomic and mutable Fedora? Maybe also on the Anaconda ISO.

There is this COPR by @kwizart that already does the packaging successfully.

https://copr.fedorainfracloud.org/coprs/kwizart/kernel-longterm-6.6
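
For anyone who wants to try this on mutable Fedora today, a rough sketch follows. The COPR name comes from the link above; the package name `kernel-longterm` is my assumption about how the COPR names its package, so check the COPR page first. The helper function is made up, and it only prints the commands so they can be reviewed before running:

```shell
# Hypothetical helper: print (not run) the commands for one LTS series.
# Assumes the COPR project is kwizart/kernel-longterm-<series> and that the
# package it provides is called kernel-longterm.
longterm_install_cmds() {
    series=$1
    echo "sudo dnf copr enable kwizart/kernel-longterm-$series"
    echo "sudo dnf install kernel-longterm"
}

# Usage:
#   longterm_install_cmds 6.6
```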

It would then also be great to be able to use the LTS kernel as the main one, as this would still guarantee security, but with more stability.

Personally, the kernel is the biggest troublemaker on my system, and I don’t really need to always have the latest and greatest.

Using an older non-longterm kernel is not an option as the maintenance period is so short, so the LTS kernel is the only alternative.

Potential issues

Current mesa and drivers could cause conflicts.

But I assume that, especially with slow-moving ones like the nvidia driver, an LTS kernel would guarantee longer support.

Having the packages separated between the 2 currently supported LTS versions (kernel-longterm-6.6 and kernel-longterm-6.12 afaik) would allow users a safe backup. But this could require more complex upgrade mechanisms, at least on mutable Fedora.


What do you think?


LTS is not as useful as a rescue based on the latest successfully booted kernel. The LTS may be missing features that a user needs like newer device support.


The rescue kernel is separate from the previously installed one afaik. Or is this just an alias that points to that?

6.12 is going to be the new LTS kernel.

It would be great to have an official LTS kernel package that people can install as a (relatively) stable option besides the rolling kernels.


I was not saying what the current situation is. I was saying that a recent working kernel is better than an LTS for its device support, especially on laptops that often see fixes in new kernels.

Yes that was not the point.

You can still have the previous stable kernel as backup, but use the official LTS kernel as rescue kernel.

And also offer to use it as the main one, in a second step.

First would be

  • integrate the COPR package into main Fedora repo
  • include it in testing
  • include it in workstation etc
  • replace the rescue kernel with it

Then the mechanisms for atomic desktops (which afaik don’t exist at all)

  • auto-pin and unpin a deployment depending on kernel updates
  • have a backup version using the LTS kernel?
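
Manual pinning already exists through `ostree admin pin`, so the automatic part could look roughly like the sketch below. The function name and the idea of simply comparing kernel version strings are my assumptions, not an existing rpm-ostree feature:

```shell
# Hypothetical helper: decide whether the booted deployment should be pinned
# before an update, based only on whether the kernel version would change.
should_pin() {
    booted_kernel=$1
    pending_kernel=$2
    if [ "$booted_kernel" != "$pending_kernel" ]; then
        echo pin
    else
        echo skip
    fi
}

# Usage on an atomic desktop ($new_kernel has to come from the pending update;
# check `rpm-ostree status` for the right deployment index, 0 is an assumption):
#   [ "$(should_pin "$(uname -r)" "$new_kernel")" = pin ] && sudo ostree admin pin 0
#   sudo ostree admin pin --unpin 0    # release the pin again later
```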

Then things to make using the kernel as main kernel easier

  • switch the kernels on mutable and atomic

The first step would already help a lot, but this is probably bigger than I had thought.

Sorry I am not explaining my point well.

I am suggesting that the snapshotted rescue would be better with a recent working kernel not the LTS.

That is independent of which kernels you keep.


Yes I guess that makes sense.

It doesn’t make sense to use a vulnerable kernel on the Anaconda ISO. And it would make sense to use an LTS kernel in general.

And a rescue kernel is not the backup one, one version behind, afaik.

But that one version behind should of course be kept.

The rescue kernel is built with the kernel that was initially installed unless it has been manually updated.

Related but not necessarily the same thing …
It would be nice if there was some better tooling to do some of this.
You can freeze/sticky a kernel on mutable using grubby and by scripting some things in /etc/kernel/install.d.
You can exclude things from updates in dnf, but GNOME Software and Discover (PackageKit) do not use those settings.

Granted a lot of folks here are able to set this up without issue but it’s not intuitive especially for beginners.
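
To make that a bit more concrete, here is a rough sketch of what such a script in /etc/kernel/install.d could do. The file name, the pin file path and the function name are all made up for illustration; the only real pieces are `grubby --set-default` and the fact that kernel-install passes `add` or `remove` as the first argument:

```shell
# Sketch of a kernel-install plugin body, e.g. saved as the hypothetical
# /etc/kernel/install.d/99-keep-default.install
PIN_FILE=${PIN_FILE:-/etc/kernel/pinned-version}   # hypothetical: holds one kernel version string
GRUBBY=${GRUBBY:-grubby}

keep_default() {
    action=$1    # kernel-install passes "add" or "remove" here
    # Only act after a new kernel was added and when a pin file exists.
    [ "$action" = add ] || return 0
    [ -r "$PIN_FILE" ] || return 0
    pinned=$(cat "$PIN_FILE")
    # Re-assert the pinned kernel as the boot default.
    $GRUBBY --set-default "/boot/vmlinuz-$pinned"
}

# In the plugin itself this would be called as:  keep_default "$1"
# The dnf side is a config line in /etc/dnf/dnf.conf under [main]:
#   excludepkgs=kernel*
```

As said above, PackageKit front ends do not honor the dnf exclude, so this only covers updates done through dnf.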


Yes and my proposal would be that the current LTS kernel would make way more sense as a rescue kernel.

Simply because it is stable (no new breakages) but gets security backports.

In a second step people could then do stuff like use that as their main kernel.

Wouldn’t it be better to just have an option to create a rescue kernel from the last known good kernel on the device? … Maybe something like a prompt when upgrading to a new kernel: “You are about to install a new kernel, would you like to create a backup/snapshot/rescue kernel from the currently installed kernel? (Y/n)”. From my experience, the rescue kernel usually has to be manually regenerated anyway, so why not make use of that?
I am in the habit of doing this already: when I see a new kernel update, I regenerate the rescue using the currently installed working kernel. My workflow for this is:

  1. dnf upgrade -y --exclude 'kernel*'
  2. IF a new kernel is available:
    rm -f /boot/*rescue*
    /usr/lib/kernel/install.d/51-dracut-rescue.install add "$(uname -r)" /boot "/boot/vmlinuz-$(uname -r)"
  3. dnf upgrade -y
  4. reboot. If all is OK, go on with life; otherwise figure out how to fix it, with a known good working kernel available just in case.
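
Step 2 of this workflow could be wrapped in a small function. This is only a sketch: the function names and the dry-run switch are made up, and it assumes dracut-config-rescue is installed so that the 51-dracut-rescue.install plugin exists. By default it only prints what it would do:

```shell
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands (safe default for a sketch)

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Rebuild the rescue entry from a kernel that is known to work.
# Defaults to the running kernel, as in the workflow above.
refresh_rescue() {
    kver=${1:-$(uname -r)}
    # Remove the old rescue kernel and initramfs so the plugin regenerates them.
    run sh -c 'rm -f /boot/*rescue*'
    run /usr/lib/kernel/install.d/51-dracut-rescue.install add "$kver" /boot "/boot/vmlinuz-$kver"
}

# Usage (as root, after checking the dry-run output):
#   DRY_RUN=0; refresh_rescue
```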

Rescue kernels should be managed automatically in the background. I disagree that using an outdated but kinda recent kernel is fine.

This might work, but if you miss a few updates, really only the most recent kernel is supported.

Hi Squirrel,

As far as I can tell, my rescue kernel only gets updated when I do it manually … been that way for as long as the rescue kernel has been around as I remember. IF I remember correctly, the only time the rescue kernel gets updated automatically is if you have dracut-config-rescue installed AND remove the rescue kernel before you update the kernel (which gets you into a situation of creating a rescue from a broken kernel). Otherwise rescue does not get automatically regenerated with kernel updates. So, would be better to regenerate rescue manually from a known good/working kernel just in case you get lazy (which I do occasionally) and go past the typical “keep 3”, at least you have some chance at a relatively quick/simple way to recover …
On this particular “Idea” though, I think adding an option to make basically a backup kernel that is known to work is a good idea instead of trying to maintain LTS in the repo, particularly considering that part of the idea about Fedora is to be as close to new/current tech as possible.

I disagree; LTS is not magically better, it is often missing key hardware support.

At the moment the LTS is 6.12, which is plagued by annoying bugs, for example, as shown by users’ posts here.


This is not really an issue. LTS kernels are not a decade old but on average about 1 year old, and they will be maintained upstream for about 2 years. There is little completely new ‘key hardware’ added in that timeframe. If you do have bleeding edge hardware that requires the newest kernel, then you have that choice.

This is misleading.

It’s not as if the current mainline kernel, halfway through the 6.12 series, is going to be the kernel that LTS users will be getting for the next years. It will be revised, bug-fixed and security-patched, until it is as stable as possible for LTS use.

The reason the current 6.12.x kernel is buggy – and let’s be frank the last 5-10 kernels have been really buggy – is because they are not LTS kernels. Development is going fast nowadays and there have been a lot of really unstable, sometimes unusable kernels for large groups of people using Fedora recently.

A properly vetted LTS kernel would never be that unstable in terms of basic operations such as suspend/resume, GPU stability, CPU scaling etc.


6.12 is an LTS kernel as I understand it.

Yes but LTS mostly means that it will be feature-frozen, so there will be no new potential regressions as regularly happens with the mainline kernels. It will only receive bugfixes and security backports for the supported period, which is currently set to 2 years (previously 6 years) by the stable kernel maintainers.

Yes it is. Someone who wants to use an LTS kernel would use the older version 6.6 to avoid hiccups. That means fewer users testing the newest kernels, which in turn means fewer proven fixes to include in the next LTS version, so the LTS would not be as good as it is today.

Another point is that LTS kernels are used more on systems that do not change as fast as mobile devices and workstations generally do. The enterprise market would also have an advantage compared to the stable cycle of kernels.

Decreasing LTS support from 6 to 2 years also shows that the responsibility of keeping a kernel running for so long is a big load, and it gets expensive.

As much as I would like to see the LTS kernel in the official Fedora repositories, I am also happy “to be different” from other major distributions that already use the LTS kernel. We would probably lose the agility we have now, being fast and going after new technologies. We would also slide into more competition with our Enterprise sponsors, while at the moment we aim to be more complementary, testing new technologies for them.

We should probably talk about how we could get better at releasing new kernel versions, including how we handle the patches that cause these regressions, to keep our stable version more stable and focus on that instead of reaching for the LTS versions.

A rescue kernel is there to rescue a system and should not be used for productive work. As already mentioned, a rescue kernel has to be good enough to boot the system and rescue it.


The problem is that kernel releases go so quickly, with a new subversion every 1-2 weeks, that it’s impossible to keep up with testing. Version 6.x.1 will have an AMD regression, then 6.x.2 an Intel regression, 6.x.3 will not resume from suspend for X% of users and so on.

As Fedora users we are guinea pigs for the stable distributions downstream, which I understand, but the kernel instability that has been visible over the last 6 months or so is a bit concerning. Having a reliable kernel should imo be more important than being bleeding edge, because with an unstable system all bets are off.

It wouldn’t be necessary to be so unstable if Fedora’s kernel maintainers decided not to push through every kernel subversion but only 1 or 2 in every major cycle, after more thorough testing for major defects (CPU/GPU/suspend etc). But I suspect that’s not going to happen, since the current approach seems to be a fundamental part of Fedora’s development cycles.

For my personal situation, and I have a very generic full-AMD laptop, the recent kernel updates feel a bit like Russian roulette. Every week I’m hoping that I can at least boot my computer. If I had to use my laptop professionally, which I currently don’t, I don’t know if I would take that risk.

I see your point here. But 6.12 would be the LTS candidate. Android for example will wait a few months before switching to it.

Meanwhile users would have 6.6, which is stable and LTS and could be used as a safe option if the current stable kernel is full of bugs.

So you kinda invalidate your point here. The fact that 6.12 is so buggy proves my point. And 6.11 is already long end-of-life, so using it is not an option!