F39: Not booting, blank screen after update, stuck with 'This OS version is past its end-of-support date'

A computer running Fedora 39 is unable to boot after installing updates using plasma-discover (KDE). It shows a warning about being EOL and that’s it.

Note that it has not been upgraded yet because of new bugs observed on other Fedora installations that were upgraded to F40 and beyond, including:

  • Not booting anymore because the initramfs image for the newly installed kernel version was not created during the update: error: file ‘/initramfs-…img’ not found.
  • Some keyboard shortcuts not working anymore (like Guake, KDE)
  • Keyboard layouts disappearing or layout switching sometimes not working
  • Keyboard input randomly ceasing to work altogether for some programs (like Leafpad) while still working in others like KWrite (similar issue)
  • KDE panels not on screen edge anymore (menu not “a mile high” anymore), context menu on windows now listing workspaces that the window should be mirrored on where “move to workspace” used to be…
  • Desktop session started with Wayland after upgrade, KDE package needs to be installed manually and selected on login screen to get X11 back (plasma-workspace-x11)
  • Tray icons disappearing (except default ones)

Those points might be off-topic but then again, I feel like they should be mentioned somewhere as it’s probably not worth it creating a topic for each one individually. Anyway, this is why this particular Fedora 39 installation has not been upgraded yet. Unfortunately, a simple update rendered it unusable for a while.

After a long search and rescue operation, it turns out that the Nvidia graphics driver somehow broke the system after this update. All similar questions I found suggested removing all nvidia packages, which I would not recommend doing before checking whether this driver is indeed the culprit (if it is not, you’d have one extra problem that might cause boot issues). So here’s a workaround:

In the Grub boot menu, hit E to change the boot options. Go to the boot line which starts with linux and ends with:

... rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau

Remove both blacklist instances or replace nouveau with nvidia. Also, rhgb quiet may be removed to see more information during the boot process. This actually helped: the system boots normally, the login screen appears, KDE works etc. The only difference at first sight seems to be that in some rare cases, the mouse cursor is slow to move. Also, after some minutes of inactivity, the screen is dimmed, which is a KDE setting that should not be on but seems to have been enabled after the update. Otherwise, everything seems to be working with the Nouveau graphics driver.
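
For anyone unsure what the edited boot line should end up looking like, here is a minimal sketch. The helper name strip_args and the root=/dev/sda2 and ro placeholders are made up for illustration. The grubby commands in the comments are the usual Fedora way to make such a change persistent, but they are an assumption here and were not tried on the system described in this thread.

```shell
# strip_args is a hypothetical helper: print a kernel command line with the
# listed arguments removed, leaving everything else in its original order.
strip_args() {
    cmdline=$1
    shift
    out=
    for word in $cmdline; do
        keep=yes
        for arg in "$@"; do
            if [ "$word" = "$arg" ]; then keep=no; fi
        done
        if [ "$keep" = yes ]; then out="${out:+$out }$word"; fi
    done
    printf '%s\n' "$out"
}

# Sample line: root=/dev/sda2 and ro are placeholders, the rest is from the post.
sample='root=/dev/sda2 ro rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau'
strip_args "$sample" rd.driver.blacklist=nouveau modprobe.blacklist=nouveau

# Persistent equivalent (run as root), assuming grubby manages the boot entries:
#   grubby --update-kernel=ALL --remove-args="rd.driver.blacklist=nouveau modprobe.blacklist=nouveau"
# Undo:
#   grubby --update-kernel=ALL --args="rd.driver.blacklist=nouveau modprobe.blacklist=nouveau"
```

Editing the line in the Grub menu only affects that one boot, which makes it the safer first test before changing anything permanently.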

To confirm that it’s actually being used:

$ inxi -G | grep NVIDIA
  Device-2: NVIDIA GP108 [GeForce GT 1030] driver: nouveau v: kernel
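
If grepping by vendor name feels fragile, the driver field can be pulled out of the same output. This is only a sketch: gpu_drivers is a hypothetical helper name, and the pattern assumes the inxi line format shown above.

```shell
# gpu_drivers is a hypothetical helper: read "inxi -G" output on stdin and
# print "<device> -> <driver>" for every Device line.
gpu_drivers() {
    sed -n 's/^ *Device-[0-9]*: \(.*\) driver: \([^ ]*\).*/\1 -> \2/p'
}

# Demo on the line quoted above.
printf '%s\n' '  Device-2: NVIDIA GP108 [GeForce GT 1030] driver: nouveau v: kernel' | gpu_drivers

# On a live system:
#   inxi -G | gpu_drivers
# An independent cross-check is the "Kernel driver in use:" line of:
#   lspci -k
```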

Now, this is a workaround but the question is: What’s the real fix? And is it a known issue that the Nvidia driver might prevent Fedora from booting after updating?

# rpm -qa | grep nvidia | grep -v 'kmod-'
nvidia-gpu-firmware-20241110-1.fc39.noarch
xorg-x11-drv-nvidia-kmodsrc-560.35.03-5.fc39.x86_64
nvidia-modprobe-560.35.03-1.fc39.x86_64
xorg-x11-drv-nvidia-cuda-libs-560.35.03-5.fc39.x86_64
xorg-x11-drv-nvidia-libs-560.35.03-5.fc39.x86_64
nvidia-persistenced-560.35.03-1.fc39.x86_64
xorg-x11-drv-nvidia-xorg-libs-560.35.03-5.fc39.x86_64
nvidia-settings-560.35.03-1.fc39.x86_64
xorg-x11-drv-nvidia-power-560.35.03-5.fc39.x86_64
xorg-x11-drv-nvidia-560.35.03-5.fc39.x86_64
xorg-x11-drv-nvidia-cuda-560.35.03-5.fc39.x86_64

# dnf info nvidia-modprobe-560.35.03-1.fc39.x86_64 | grep repo
From repo    : rpmfusion-nonfree-updates

Generally, the way to deal with a driver not working with the most recent kernel is to revert to using the previous kernel (select the previous kernel from the boot menu) and, if necessary, file a bug report to get the driver updated (or the kernel fixed). There is a reason that Fedora Linux keeps the previous two kernels installed and available as a fallback option.
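
As a rough illustration of "revert to the previous kernel", the sketch below picks the second newest version from a list. The helper name previous_kernel and the sample versions are made up, and the rpm and grubby lines in the comments are assumptions about a typical Fedora setup rather than commands taken from this thread.

```shell
# previous_kernel is a hypothetical helper: read kernel versions on stdin
# (one per line) and print the second newest according to version sort.
# With a single version on stdin it prints that version.
previous_kernel() {
    sort -V | tail -n 2 | head -n 1
}

# Illustrative list with made-up version strings.
printf '%s\n' 6.5.6-300.fc39.x86_64 6.6.11-200.fc39.x86_64 6.6.9-200.fc39.x86_64 | previous_kernel

# On a real system the list could come from rpm, and grubby can make the
# choice the default instead of picking it in the boot menu each time:
#   rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-core | previous_kernel
#   grubby --set-default=/boot/vmlinuz-<version>
```

Note that sort -V is a GNU coreutils option, so this assumes a GNU userland such as Fedora's.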

Any kernel driver can (potentially) prevent the system from booting. There probably isn’t much that can be done about that since they need direct access to low-level memory areas to function. Linux kernel drivers are modular though and it is (usually) possible to blacklist one that is not working and use an alternative driver (as you demonstrated).

That didn’t help, wouldn’t boot with the previous kernel version either.

Indeed, I agree. In fact, this is why the installonly_limit setting was set to a higher value than 3.
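
For readers who want to check how many kernels their own system keeps, a small sketch follows. installonly_limit is the real DNF option mentioned above; the helper function of the same name is hypothetical, and the fallback of 3 reflects DNF's documented default.

```shell
# installonly_limit (hypothetical helper) prints the installonly_limit value
# from a dnf.conf-style file given as $1. When the key is absent it prints 3,
# which is DNF's documented default.
installonly_limit() {
    value=$(sed -n 's/^[[:space:]]*installonly_limit[[:space:]]*=[[:space:]]*\([0-9]*\).*/\1/p' "$1" | tail -n 1)
    printf '%s\n' "${value:-3}"
}

# Report the setting of this machine if the usual config file is readable.
conf=/etc/dnf/dnf.conf
if [ -r "$conf" ]; then
    echo "installonly_limit: $(installonly_limit "$conf")"
fi
```

A higher value keeps more fallback kernels at the cost of space on /boot.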

I agree, but then again… We’re not talking about some obscure slide scanner from 2003 with self-compiled driver packages or something (slightly exaggerating here). It’s the graphics that stopped working, Nvidia, which probably half the world uses (again, might be slightly exaggerated but you get the point). With the official driver package. I’m arguing that in 2024, graphics should just work without expecting the user to 1) get the right idea that it might be the graphics driver, not the eol message on the screen; 2) modify the boot options; 3) depending on the situation or who you follow, remove nvidia packages, install nouveau or just make the boot change permanent.

In other words: It’s such a standard setup with a common graphics card, I would expect lots of other users being affected as well and so there should be a known bug report somewhere. So, is there something I missed?

Apart from that, I’ve noticed that there is a bunch of nvidia packages installed (plus some older kmod packages), from the rpmfusion repository. The version number seems to be 560 but on the NVidia site, 550 seems to be the latest production-ready version if I read it correctly.

That really shouldn’t happen. Unless the kernel and initrd files on the boot partition have been modified. I consider it a major bug if the driver updates are modifying your known-good fallback kernels. Can you look at the timestamps on your initramfs files and see if the older ones were changed? Edit: You might also check ls "/lib/modules/<kernel-version>/extra" to see if the drivers were updated there. Again, they shouldn’t be for older kernel installations. But if they have the same timestamps across several directory trees, then the driver updater is doing something it shouldn’t.
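
The timestamp comparison suggested here could be done along these lines. It is a sketch only: list_mtimes is a hypothetical helper, and the default paths assume the classic /boot/initramfs-<version>.img layout, which not every Fedora installation uses.

```shell
# list_mtimes is a hypothetical helper: print the modification time and path
# of every initramfs image in $1 and every per-kernel "extra" module tree in $2.
list_mtimes() {
    boot_dir=${1:-/boot}
    modules_dir=${2:-/lib/modules}
    for f in "$boot_dir"/initramfs-*.img "$modules_dir"/*/extra; do
        # Unmatched globs stay literal, so skip anything that does not exist.
        if [ -e "$f" ]; then stat -c '%y  %n' "$f"; fi
    done
}

list_mtimes /boot /lib/modules
```

If the older kernels' entries all carry the date of the latest update, something rewrote the fallback kernels; if only the newest entry is recent, the fallbacks were left alone.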

I understand. That said, the kernel devs and/or the driver devs can be a bit reckless (or over-enthusiastic about getting the latest and greatest power saving feature to work) at times.

Since you are running an EOL version, it is likely that the problem has already been reported and fixed and the answer you would eventually get if you reported the problem would be to update your system. But you’ve already indicated that you are not ready to do that, so the only other option is to go back to the previous kernel.

Linux is trying to support a lot of hardware that wasn’t designed to work with Linux. There are other operating system companies that pay the hardware manufacturers to make the hardware work with their OS. To some extent, it’s the trade-off with Linux that “you get what you pay for”. Linux is often an afterthought for hardware/firmware developers.

It is a known fact (or at least reported here several times) that the 550 or 560 drivers installed from nvidia do not function properly with the newer fedora kernels. Installing the nvidia driver from rpmfusion gives version 565 for the newer kernels.

Recently it was noted that rpmfusion has discontinued the repos for several versions of fedora that are EOL and they needed storage space.

Upgrading to at least f40 seems quite reasonable. (f41 would seem a better choice to me)

Agreed 100%.
However, the reality is that the OS manufacturers upon which the hardware vendors rely have an arrangement where the drivers are guaranteed to work with their OS. The big kahuna tends to get priority since they have the lion’s share of the market, and lesser entities such as Linux have a much smaller share, so their needs are secondary to almost everyone.

Some hardware manufacturers do not support linux at all. For those cases drivers may be reverse engineered but are seldom as good as those provided by the hardware source.

Where do I find this information, is there a “known bugs” section somewhere listing (in)compatible kernel versions? Seems like a major issue but which kernel update should be avoided if using the nvidia driver version 560? I guess 6.6.11, what about 6.5.6… I checked the rpmfusion page but couldn’t find such a warning there.

Thanks, good to know. Though if they discontinue Nvidia support that early (39 was released a year ago), it makes me wonder if there’s a long-term benefit using rpmfusion to install those drivers as opposed to installing the official drivers from Nvidia directly.

As I wrote above, I delayed this upgrade due to various new bugs, one being Fedora not booting anymore and with this in mind, I do not consider it reasonable to upgrade, unless there’s extra time to fix bugs. However, previous release upgrades did not cause such fatal bugs as far as I remember and generally, I still prefer the non-interactive nature of Fedora upgrades compared to Debian etc. I digress.

About that, quote:

Is 565 the version that should be used with kernel 6.6.11? I’m not sure if upgrading to a driver which has such a warning is a reasonable choice, though.

Again, this is about graphics, i.e., standard hardware. And Nvidia supports Linux: when selecting card model and operating system, it says that 550.142, released Dec 17th, is the recommended version and there’s a manual download option. I’m considering installing that one.

By the way, with the Nouveau driver, the slow mouse issue is getting annoying. At times, on dark pages, it’s actually very difficult to move the mouse across the screen.

I’m sorry, you’re right. Selecting a previous kernel version does indeed work now (btw. the timestamps have not changed). Don’t know why it didn’t work before, could be an oversight or being impatient, being under time pressure. Sorry about the confusion.

That’s what I wanted to find out when I opened this question.

No list I am aware of, but the issue is the driver source as far as I can determine.
The drivers from nvidia seem to be the ones having issues and the drivers installed from rpmfusion seldom seem to have issues.

The 550 and 560 drivers were replaced on rpmfusion quite some time ago. If using f41 the only drivers they provide are the 565 version.

As I understand it, if you keep your system updated (both drivers and kernels) there are seldom serious problems with graphics. In fact, the most recent problems with not being able to boot into a newer kernel are due to the use of drivers from nvidia (or other sources) that use dkms to compile the drivers. It has not happened with those whose drivers are compiled using akmods (rpmfusion).
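
One way to see whether akmods has actually produced a module for each installed kernel is to compare the two package lists. The sketch below assumes that the locally built packages are named kmod-nvidia-<kernel version>-<driver version>, which is how RPM Fusion's akmods builds are usually named but is not confirmed anywhere in this thread. kernels_without_kmod and the file names are hypothetical.

```shell
# kernels_without_kmod is a hypothetical helper. $1 is a file with one kernel
# version per line, $2 a file with installed kmod package names. It prints
# every kernel version for which no "kmod-nvidia-<version>" package is listed.
kernels_without_kmod() {
    while IFS= read -r kver; do
        if [ -n "$kver" ] && ! grep -qF -- "kmod-nvidia-$kver" "$2"; then
            printf '%s\n' "$kver"
        fi
    done < "$1"
}

# Possible use on a live system (file names are placeholders):
#   rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-core > kernels.txt
#   rpm -qa 'kmod-nvidia*' > kmods.txt
#   kernels_without_kmod kernels.txt kmods.txt
# A missing module can usually be rebuilt with:
#   akmods --force --kernels <version>
```

A kernel that shows up in the output would be expected to fall back to no Nvidia module at all, which matches the blank screen symptom when nouveau is blacklisted at the same time.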

These are not “fatal” bugs. There have been workarounds available for everyone as I recall. It does require the user to fix the issue.

Bugs will appear continuously, some one type and some another type. Those bugs get fixed and things move forward. Refusing to upgrade and remaining on an EOL version blocks you from all future upgrades, bug fixes, etc. Nothing ever gets changed on a release version after it is designated EOL.

Your choice!

I’d just like to point out that it was just that, a system update, which broke it so that it wouldn’t start anymore. By the way, I’m not someone who never updates anything, in fact I sometimes have to convince others so they would let me update their workstation. I also agree with most of the rest and that it’s rare, but that doesn’t help much, I was hoping for a specific “known issue”, something that would allow me to know which kernel version I should avoid for now and when it’s safe to upgrade without making it worse.

How is it not a fatal bug if the system won’t boot anymore after installing updates? I’m sorry, I agree with almost everything else but not this.

Also, since you’re mentioning F39 being EOL again. I’ve listed some of the bugs which made me stop and wait. You see, I’m generally not refusing to update and I would’ve upgraded long ago if there were clear statements about those bugs being fixed but without anything like that, I’d have to rely on the hope that it’s been taken care of. Again, I upgraded another Fedora system and it failed to boot for a different reason, so my typical reaction is to wait until we know that this is fixed instead of repeating it on another system breaking it in the same way. You’ve pointed me to another bug, the 565 drivers possibly failing altogether when using DVI. What if I don’t wait for a fix, upgrade and then I’m bit by this bug?

Btw. when it happened, I scrolled through the journal but did not find anything related to the graphics driver.
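
Scrolling through the journal by hand makes it easy to miss driver messages, so a filter may help. This is a sketch: gpu_log_lines is a hypothetical helper, the sample log lines are invented, and the journalctl call in the comments assumes the journal of the failed boot was persisted to disk.

```shell
# gpu_log_lines is a hypothetical helper: keep only lines that mention the
# nvidia or nouveau kernel drivers (NVRM is the prefix the Nvidia module uses).
# "|| true" keeps the exit status at 0 when nothing matches.
gpu_log_lines() {
    grep -iE 'nvidia|nouveau|NVRM' || true
}

# Demo on invented sample lines.
printf '%s\n' \
    'kernel: usb 1-1: new high-speed USB device' \
    'kernel: nouveau 0000:01:00.0: enabling device' |
    gpu_log_lines

# Kernel messages of the previous boot on a live system:
#   journalctl -k -b -1 --no-pager | gpu_log_lines
```

If the previous boot never got far enough to start logging, the filter will simply print nothing, so an empty result does not clear the driver.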

This has been fixed. I just updated my kernel and a new initramfs was generated:

# ls -al /boot/788e20240aaf43f46f617cde4b703bd1/6.12.8-200.fc41.x86_64
total 96456
drwx------. 2 root root     4096 Jan  9 18:23 .
drwx------. 6 root root     4096 Jan  9 18:21 ..
-rw-------. 1 root root 82094291 Jan  9 18:23 initrd
-rw-------. 1 root root 16664936 Jan  9 18:23 linux
# rpm -q dkms
dkms-3.1.4-3.fc41.noarch

As for the rest, I cannot say because I do not use KDE.

There should be bug trackers for each identified bug that will indicate its status. They mostly exist on github.com and gitlab.com, but there are others scattered about and it can be difficult to find the right tracker. If you know the RPM package that contains the software, you can often find the tracker listed on the URL field in the output from rpm -qi <package-name>. For example:

$ rpm -qi sway
Name        : sway
Version     : 1.10
Release     : 1.fc41
Architecture: x86_64
Install Date: Tue Jan  7 22:33:31 2025
Group       : Unspecified
Size        : 882059
License     : MIT
Signature   : RSA/SHA256, Sun Oct 27 21:35:12 2024, Key ID d0622462e99d6ad1
Source RPM  : sway-1.10-1.fc41.src.rpm
Build Date  : Sun Oct 27 21:30:18 2024
Build Host  : buildhw-x86-16.iad2.fedoraproject.org
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : https://github.com/swaywm/sway
Bug URL     : https://bugz.fedoraproject.org/sway
Summary     : i3-compatible window manager for Wayland
Description :
Sway is a tiling window manager supporting Wayland compositor protocol and
i3-compatible configuration.
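
To pull just the two tracker fields out of that output, something like the following could be used. tracker_urls is a hypothetical helper name; the query tags URL and BUGURL in the comment are standard rpm tags.

```shell
# tracker_urls is a hypothetical helper: read "rpm -qi" output on stdin and
# print the values of the "URL" and "Bug URL" fields, one per line.
tracker_urls() {
    sed -n -E 's/^(URL|Bug URL) *: *//p'
}

# Demo on two lines from the output above.
printf '%s\n' \
    'URL         : https://github.com/swaywm/sway' \
    'Bug URL     : https://bugz.fedoraproject.org/sway' |
    tracker_urls

# Live use, or the same thing without parsing:
#   rpm -qi sway | tracker_urls
#   rpm -q --qf '%{URL}\n%{BUGURL}\n' sway
```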

If you want to test a system update, but be sure that you can reverse the update in case a bug still exists, there is a way to do that with Btrfs snapshots: Make use of Btrfs snapshots to upgrade Fedora Linux with easy fallback - Fedora Magazine

That’s great, thanks for confirming!

Now if I knew for sure that version 565 of the Nvidia driver is compatible with whichever kernel version currently ships with F40, I would consider upgrading now. But then again, I’ve just learned that this version of the Nvidia driver might break DVI support and I’m not sure if I’m ready for that.

Some of the KDE issues are ui glitches rather than fatal bugs. Those alone would not stop me from upgrading.

I know there may be other trackers (I believe Nvidia doesn’t have a public bugtracker but there’s a user forum). I came here thinking other affected users of Fedora 39+ would be most likely to see this here.

Re: BTRFS. Interesting idea, but quite a manual process. Maybe time to test that Zypper equivalent mentioned there.

If anyone is still reading this, I took the time to make a full backup, set up automatic btrfs snapshots and went over some of the known bugs I might have to tackle when upgrading. Then I upgraded to Fedora 40.

Now, at least the top kernel installed during the update works with the Nvidia driver, it looks a bit different during boot but it works. But the update broke standby and hibernation somehow, both modes have been used for years and now the screen just stays black when waking up from sleep.
I know this is getting off-topic but it’s like the confirmation I did not need for my earlier concerns about updates. Am considering a rollback.

As far as I know, hibernation has never really worked reliably in Linux. I wouldn’t skip security patches just to have hibernation, but that’s just my opinion.

That’s not true. It wasn’t always properly configured by default but it worked. You just need to have the resume kernel argument pointing to your swap partition and that partition must be sized so that the compressed image fits in there, up to the size of the RAM.
I have been using it forever. A typical use case would be a laptop that’s put into hibernate over the weekend to continue working on Monday.
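
The two prerequisites named here (a resume argument and enough swap) can be checked roughly as follows. Both helper names are hypothetical, and comparing SwapTotal against MemTotal is a deliberately conservative assumption, since the compressed image is often smaller than RAM.

```shell
# resume_target (hypothetical helper) prints the value of resume= from the
# kernel command line passed as $1, or nothing if the argument is absent.
resume_target() {
    for word in $1; do
        case $word in
            resume=*) printf '%s\n' "${word#resume=}" ;;
        esac
    done
}

# swap_covers_ram (hypothetical helper) reads /proc/meminfo-style text on
# stdin and succeeds when SwapTotal is at least as large as MemTotal.
swap_covers_ram() {
    awk '/^MemTotal:/ {mem = $2} /^SwapTotal:/ {swap = $2} END {exit !(swap >= mem)}'
}

# Check the running system when the proc files are available.
if [ -r /proc/cmdline ] && [ -r /proc/meminfo ]; then
    echo "resume target: $(resume_target "$(cat /proc/cmdline)")"
    if swap_covers_ram < /proc/meminfo; then
        echo "swap is at least as large as RAM"
    else
        echo "swap is smaller than RAM"
    fi
fi
```

An empty resume target means the kernel has nowhere to look for the hibernation image, regardless of how large the swap partition is.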

In fact I’ve just tried it again and the first thing I almost forgot is that ever since this last upgrade, one of the monitors stays off during the boot process and unless I manually remove the “rhgb quiet”, all monitors are off, so the luks prompt for the encryption password is not shown. With the modified boot options, the system boots and works, hibernating even works to the point that there will be boot messages loading the hibernate image. And the screen does not stay all black, there’s the mouse cursor, just the login screen is not showing up. So hibernation actually still works as it always worked because when switching to another tty and logging in, I can see that all applications are still running. I tried switching between sddm and lightdm (found one error about it being crashed in the journal), still doesn’t work. Any idea how to find out why the login screen is not showing up anymore after waking up from hibernate mode ever since the last upgrade?

Glad to hear hibernation works on your system.

There have been several reports along those lines lately on this forum. Here are a few that came up with a quick search:

After some experimentation, I’ve noticed some extremely weird behavior. Someone somewhere mentioned that the boot option nvidia-drm.modeset=1 should be removed. That alone did not help, however without that option I was able to get the login screen by pressing CTRL + ALT + DEL, then CTRL + ALT + ESC. CTRL + ALT + ESC alone did not do it, sometimes I had to press CTRL + ALT + DEL twice before CTRL + ALT + ESC, I don’t see a clear pattern there and I’m not even sure if the boot option is really related although I could not get the login screen with that option still in place.

Thanks for the links but they seem unrelated. It’s not Wayland vs. X nor KDE vs. Gnome and, again, the screen isn’t all black but the mouse cursor is there. Also, there’s a noticeable delay of up to half a minute ever since the upgrade whenever the login screen should show up (even without sleep), during that time not even the caps lock LED can be turned on or off. This might be a good time to open a new thread here I guess, but I did want to mention this new fatal bug here, which came after the upgrade (which I installed to fix the error introduced by the previous update), for anyone out there who might be faced with the same problem.

It’s so strange that the login screen now appears sometimes after using those key combinations.

Adding a correction as I cannot edit the last post anymore.

The previously mentioned keyboard shortcuts are wrong and do not help to recover the broken login screen after waking up from standby. It’s unknown why it worked once but it definitely doesn’t work anymore.

Also, neither standby nor hibernate is the actual problem because both still work as always, however when waking up the login screen does not appear. All monitors are black, only the mouse cursor works. Switching to other ttys is possible but the running X session with all applications is locked, unreachable. A restart is required, losing all unsaved work.

This is still unresolved, although it seems to me this may not be the best forum for this issue. Adding some more details, just in case:

  • Booting the old OS version is not a permanent solution because then, other packages are not being updated anymore. There are some programs that need to be kept up to date.

  • The update installed trying to fix the original issue (itself introduced by an update) broke standby/hibernation which also renders the system unusable because it’s almost daily being put into standby mode to save energy plus hibernate mode is sometimes needed to turn it off during maintenance work without losing work.

  • Therefore the system setup is being evaluated and it may be reinstalled with a Debian-based LTS distro. In this process, I’ve also noticed that in F40, with KDE Plasma 6.3.3, Frameworks 6.13.0 and nvidia-570.133.07, Wayland vs. X does seem to play a role (I previously wrote that it didn’t, that doesn’t seem true anymore). Starting a KDE Plasma/Wayland session and then going into standby, the login screen immediately showed up when waking up. With a regular KDE Plasma/X11 session running, the login screen would not show up. Switching display managers (sddm, lightdm) does not make a difference. Don’t know if this updated version of Plasma/Wayland is compensating for something that’s still broken when using regular Plasma/X11.

  • A similar bug report explained that the Nvidia driver apparently isn’t saving the full graphics memory anymore unless the following option is set, however that doesn’t seem to be what’s happening because setting this option does not help either:

    options nvidia NVreg_PreserveVideoMemoryAllocations=1
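
Before ruling that option out, it may be worth confirming that the loaded driver actually picked it up. The sketch below is an assumption-laden check, not a fix: preserve_vram_setting is a hypothetical helper, the modprobe.d file name is a placeholder, and the notes in the comments follow Nvidia's power management documentation rather than anything verified in this thread.

```shell
# preserve_vram_setting (hypothetical helper) reads text in the format of
# /proc/driver/nvidia/params on stdin and prints the value of
# PreserveVideoMemoryAllocations (1 means the option is active).
preserve_vram_setting() {
    sed -n 's/^PreserveVideoMemoryAllocations: *//p'
}

# Check the running driver if the proprietary module is loaded.
if [ -r /proc/driver/nvidia/params ]; then
    echo "PreserveVideoMemoryAllocations: $(preserve_vram_setting < /proc/driver/nvidia/params)"
fi

# Notes (assumptions, see lead-in):
#   - The options line belongs in a file under /etc/modprobe.d/, for example
#     /etc/modprobe.d/nvidia-power.conf (placeholder name).
#   - If the module is loaded from the initramfs, the image may need to be
#     regenerated (dracut --force) before the option takes effect.
#   - Nvidia's documentation also expects these units to be enabled:
#       systemctl enable nvidia-suspend.service nvidia-hibernate.service nvidia-resume.service
```

If the value printed is still 0 after a reboot, the option never reached the driver and the test above would not yet say anything about whether it helps.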

I’m just adding those hints hoping they’ll be useful for some future visitor. However, nobody who is experiencing the same issue has commented in this thread, so this may well be one of the last updates and the thread could probably be marked as stale.
