Fedora hangs on boot after upgrading to kernel 6.3.4

Here is the link to my bug report - 2212012 – Fedora 38 workstation hangs on boot after upgrading to kernel 6.3.4

When filing the report, I wasn’t asked to provide any specific logs, but I included my system’s output from fpaste --sysinfo --printonly and cat /proc/sys/kernel/tainted as an attachment.

I would be happy to provide any other logs or other system info if needed. Thanks!

See link above/previous post

The kernel provided in 2211784 – fedora 38 kernel 6.3.4 boot fail upon upgrade seems to solve the issue for the users that tested it so far.

Please review the ticket first to check if your problem is the same. If so, and if the kernel works for you, feel free to provide feedback in the bug report. This helps get the related fix into the next official kernel, which will end up in your updates, as soon as possible.

Hi, can you please provide a reference or link on how to do that? I have never had to install a kernel outside the package system as a test before. Don’t want to stuff that up.

If the issue of 2211784 – fedora 38 kernel 6.3.4 boot fail upon upgrade applies to you and if you want to test the kernel that Justin provided, you can install that one by
sudo dnf update \
  https://kojipkgs.fedoraproject.org//work/tasks/3237/101713237/kernel-6.3.5-201.fc38.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/3237/101713237/kernel-core-6.3.5-201.fc38.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/3237/101713237/kernel-modules-6.3.5-201.fc38.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/3237/101713237/kernel-modules-core-6.3.5-201.fc38.x86_64.rpm \
  https://kojipkgs.fedoraproject.org//work/tasks/3237/101713237/kernel-modules-extra-6.3.5-201.fc38.x86_64.rpm
→ these are from https://koji.fedoraproject.org/koji/taskinfo?taskID=101713237 , which is the descendant of https://koji.fedoraproject.org/koji/taskinfo?taskID=101713227

Be aware: This kernel is only for x86_64. dnf will check if the packages fit your system. This should look like a normal kernel update: dnf downloads the packages, shows you what it will install and what it will remove, then you can choose Y/N, and then it installs. Do not ignore any errors or warnings! If any come up, let us know here.

Please check in advance that another working kernel remains installed! E.g., if the last kernel that works for you is 6.2.15, check whether dnf will remove 6.2.15 before choosing Y (this can happen if you have already installed earlier 6.3.X kernels). By default, Fedora keeps only the 3 most recent kernels. If that issue occurs, I suggest changing (before running the dnf command) the option installonly_limit=3 to installonly_limit=4 in /etc/dnf/dnf.conf, which increases the number of kernels that remain installed to 4.
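The check described above can be sketched in shell. This is only a sketch under assumptions: the helper name installonly_limit is made up for illustration, it assumes the option lives in /etc/dnf/dnf.conf as described, and it falls back to 3 when the option is not set.

```shell
# Print the installonly_limit value from a dnf config file.
# Hypothetical helper; falls back to 3 (the default mentioned above)
# when the option is not present in the file.
installonly_limit() {
    conf="${1:-/etc/dnf/dnf.conf}"
    value=$(sed -n 's/^[[:space:]]*installonly_limit[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-3}"
}

# Usage on a real system (not executed here):
#   installonly_limit          # how many kernels dnf keeps
#   rpm -q kernel-core         # which kernels are installed right now
```

If the number of installed kernels already equals the limit, installing one more kernel will make dnf remove the oldest one, so that is the case where raising the limit first matters.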

Also, it can happen on your system that dnf does not install all the packages that I have put in the dnf command. If dnf decides on its own to skip some, this is OK. You can proceed as long as no warnings/errors come up.

Also, be aware that this kernel is experimental. This is why it should be used for testing the bug fix only. Once some people have confirmed that it works, the fix will be pushed through Fedora’s whole testing process until it ends up in one of the next kernels to be deployed in “production”.

@rairai9 I know you have tainted = 0, which indicates that you have no nvidia. But given the correlations of the occurrences we have so far, it might still be worth trying whether this kernel helps you too → Fedora hangs on boot after upgrading to kernel 6.3.4 - #24 by py0xc3
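For readers wondering how the tainted value relates to nvidia: the number in /proc/sys/kernel/tainted is a bitmask. A minimal sketch that decodes three of the bits follows; the helper name taint_flags is hypothetical, and only the bits listed in the comments are checked.

```shell
# Decode a few bits of the kernel taint value.
# Bit 0  (value 1)    = proprietary module loaded     (flag "P")
# Bit 12 (value 4096) = out-of-tree module loaded     (flag "O")
# Bit 13 (value 8192) = unsigned module loaded        (flag "E")
# Hypothetical helper; all other taint bits are ignored here.
taint_flags() {
    t="$1"
    flags=""
    if [ $(( $t & 1 )) -ne 0 ]; then flags="${flags}P"; fi
    if [ $(( ($t >> 12) & 1 )) -ne 0 ]; then flags="${flags}O"; fi
    if [ $(( ($t >> 13) & 1 )) -ne 0 ]; then flags="${flags}E"; fi
    echo "${flags:-none}"
}

# Usage on a real system (not executed here):
#   taint_flags "$(cat /proc/sys/kernel/tainted)"
```

A value of 0 means none of these flags is set, which fits a system without a proprietary driver loaded.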

Alternatively, if it does not help, you can check whether this applies to you: Can't boot after update to kernel v6.3.4 - #5 by computersavvy

Got work to do now. End of today, Australian time, Monday, I’ll have a go at testing this. I hope after my timeshift upgrades that it will take me back if ‘shit hits the fan’.

installonly_limit is already at 10.

Bit anxious though - never done this before.

Thanks for the info.

@py0xc3 I tried both the experimental kernel and the kernel command line options suggested in Can't boot after update to kernel v6.3.4 - #5 by computersavvy and unfortunately neither worked for me. The experimental kernel still boots to the same black screen with a single underscore (_) that I see when attempting to boot 6.3.4. I also just installed 6.3.5 as an update via GNOME software but that kernel does not boot either and results in the same black screen with an underscore that I see when trying to boot 6.3.4.

I still have kernel 6.2.15 installed and will keep using it for the time being since none of the newer kernels boot on my system for now. I also added a comment to my bug report explaining that kernel 6.3.5 hangs on boot the same way that 6.3.4 does.

Thanks for all of the suggestions so far and please let me know if there is something else I can try.

@rairai9 sad to hear. But be aware that the experimental 6.3.5 kernel provided by Justin is not the same as the one in the “stable” update repository → in the official updates, you have 6.3.5-200, while the experimental one is 6.3.5-201 → just to be on the same page: you tested both?

In case both 6.3.5 kernels didn’t work for you, let’s assume you have maybe something completely different and start from the beginning.

First, even if your screen is black immediately after grub, I would like to check if the root file system is maybe already mounted when the system breaks. That would be very helpful, so let’s give it a try.

Therefore, please start your system with the newest kernel that does not work (I assume this is 6.3.5-200; but do not use the experimental 6.3.5-201 kernel for this!). You will get the black screen. Just wait a minute at this point, then force your machine off. Next, immediately boot the working 6.2.15 kernel. Once it is up, please get the output of sudo journalctl -r --boot=-1 → this gives us the journal of the previous boot. If you want, you can already compare the timestamps in the journal output with the time you tried the broken kernel. If the times match, the root file system was mounted and provides us with (hopefully valuable) logs about what happened. In either case, please provide the journal logs. Even if they only cover the last 6.2.15 boot (in case the system breaks with 6.3.5 before the root file system is mounted), they could contain errors that indicate why later kernels break completely. Otherwise, we have to play with grub, but I would be glad if we could avoid this.
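Besides comparing timestamps, one way to tell which kernel a journal belongs to is the "Linux version ..." line that the kernel logs at the start of every boot. A rough sketch, assuming the usual journalctl line layout with "kernel:" before the message; the helper name journal_kernel_release is hypothetical.

```shell
# Read journal text on stdin and print the kernel release taken from
# the first "Linux version ..." line found.
# Hypothetical helper; prints nothing if no such line is present.
journal_kernel_release() {
    sed -n 's/.*kernel: Linux version \([^ ]*\).*/\1/p' | head -n 1
}

# Usage on a real system (not executed here):
#   sudo journalctl -k --boot=-1 | journal_kernel_release
#   journalctl --list-boots     # boot indexes with first/last timestamps
```

If this prints a 6.2.15 release for --boot=-1, the failed 6.3.5 boot never reached the point where the journal is written to disk.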

Feel free to replace data you consider private in the logs (usernames, IP/MAC addresses, …).

I am about to try this experimental kernel too this morning. Need to do backups first. Will advise.

Please note that some people suggest removing kvm_ignore_msrs=1 from the grub command line to solve the boot problem. My /etc/default/grub

GRUB_TIMEOUT="5"
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT="saved"
GRUB_DISABLE_SUBMENU="true"
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rd.lvm.lv=VG01_nvme_pcie/rootfs rd.luks.uuid=luks-160cee22-ab53-47b2-a48a-382fca72928a rhgb rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG="true"

suggests that kvm_ignore_msrs=1 is not a default setting. About to reboot to check the grub menu.

I can confirm my grub2 menu on boot does NOT have kvm_ignore_msrs=1
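The same check can be done without rebooting by looking at the command line of the running kernel. A small sketch; the helper name has_karg is made up, and it only matches whole, space-separated arguments.

```shell
# Return success if the given kernel argument appears as a whole word
# in a kernel command line string. Hypothetical helper.
has_karg() {
    arg="$1"
    cmdline="$2"
    case " $cmdline " in
        *" $arg "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a real system (not executed here):
#   if has_karg "kvm_ignore_msrs=1" "$(cat /proc/cmdline)"; then
#       echo "present"
#   else
#       echo "absent"
#   fi
```

Note that /proc/cmdline only shows the kernel that is currently running, so this covers the working kernel's entry, not necessarily the entry of the kernel that fails to boot.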

As suggested above in this thread, just for the hell of it, with 6.3.4-201 in place I ran:
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
[sudo] password for robertk:
Generating grub configuration file …
Adding boot menu entry for UEFI Firmware Settings …
done

Tried booting again into 6.3.4-201 with the rebuilt grub.cfg.
This made no difference. It still does not boot and, on auto start, gets stuck at
“Boot Fedora Linux 6.3.4-201 … 38 Workstation Edition”

Did a Ctrl-Alt-Del: the system reboots, but POST generates an abnormal beep. Inspecting the UEFI/BIOS shows no apparent error codes.

Reboots fine selecting 6.2.15-300

Created the sudo journalctl -r --boot=-1 output just for this original bad kernel and uploading results to

As per the bug report, the experimental 6.3.5-201 kernel worked with regard to booting. See the bug report for details.

@py0xc3 yes, I tested both the official 6.3.5-200 kernel and the experimental 6.3.5-201 kernel and sadly neither one booted for me.

I followed your instructions by attempting to boot with 6.3.5-200, waiting a few minutes on the black screen, and force turning off my system. I then booted immediately into the working 6.2.15 kernel and pulled up sudo journalctl -r --boot=-1 but the logs only described the previous time kernel 6.2.15 booted earlier and no logs were present from the time I just tried to boot 6.3.5-200.

I am linking to the output from sudo journalctl -r --boot=0 to show what happened for my current boot on kernel 6.2.15. Please let me know if you notice anything that I should address, and thanks again for your help.

Password - Pv&#4FoQ20%1

@rairai9 , your issue is definitely different from that of Robert. I do not split the topic at this point because it is already linked on several other pages, while some comments also refer to both issues we have here (that would be confusing).

I have added a comment to your bug report with some minor points for the people there. I hope someone there can make something out of the existing data. After skimming your logs, I cannot see issues that could explain what happens in later kernels. So I assume the issue did not exist in 6.2.X. If the issue has arisen with 6.3.X, it should be in good hands in the bugzilla.

However, there is something you could try to maybe get some more data: Currently, the grub “options” you boot your kernel with include “rhgb quiet” → remove these two options from the file of the 6.3.5 kernel, and then try it again. Maybe this leads to some more output. If so, just make a screenshot (with a camera or so). Then let us know here.

As an alternative to changing the options in the grub file in advance, you can also press “E” on the “6.3.5-200” option within the grub menu at startup and then modify the options line at boot time (this is not permanent).
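For the permanent variant, the effect of dropping the two options can be sketched as below. The helper name strip_kargs is hypothetical and only shows what the resulting command line looks like; the grubby call in the comment is one possible way to apply the change, and the vmlinuz path is an assumption based on the usual Fedora naming.

```shell
# Print a kernel command line with the listed arguments removed.
# First parameter: the command line; remaining parameters: arguments to drop.
# Hypothetical helper for illustration; it does not touch any boot entry.
strip_kargs() {
    cmdline="$1"
    shift
    out=""
    for word in $cmdline; do
        keep=1
        for drop in "$@"; do
            if [ "$word" = "$drop" ]; then keep=0; fi
        done
        if [ "$keep" -eq 1 ]; then out="${out:+$out }$word"; fi
    done
    echo "$out"
}

# Applying it to a real boot entry (not executed here; path is an assumption):
#   sudo grubby --update-kernel=/boot/vmlinuz-6.3.5-200.fc38.x86_64 --remove-args="rhgb quiet"
# and to put the options back afterwards:
#   sudo grubby --update-kernel=/boot/vmlinuz-6.3.5-200.fc38.x86_64 --args="rhgb quiet"
```

Pressing “E” in the grub menu remains the safer route for a one-off test, since nothing is written to disk.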

Created attachment 1969412: Journalctl for 6.3.5-test-201. The previous attachment, F38-kernel-6.3.5-test-201-Success.txt, uploaded the log for journalctl -r --boot=-1, which effectively was 6.2.5.

This upload is for the booting $ uname -a Linux earth 6.3.5-201.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jun 1 15:13:15 UTC 2023 x86_64 GNU/Linux

Sorry!

@rkoppelh It seems that the fix for your issue has already been put into the new kernel 6.3.6. No further reports are necessary. It was only necessary to know whether the issue you experienced has been solved by the experimental kernel, which it has, if I understood you right. Also, with regard to the bug report, please stay focused on the issue itself.

If you want, you can contribute in bodhi to test the new kernel (6.3.6) to ensure that the issue is gone once you update next with dnf:
https://bodhi.fedoraproject.org/updates/FEDORA-2023-ed3bcae7e8
→ if you want to help verify that the issue is solved, keep watching this page. Once all automated tests have passed, the status will be set to testing. Then, below the “Reboot required” comment at the top, there will be another line with a dnf command, which will be sudo dnf upgrade --refresh --advisory=<hash value> → with this command, you can install the 6.3.6 kernel and test whether it boots properly and whether the issue is solved as expected. It will work like a usual update.

If so, you can login to bodhi with your normal Fedora credentials, and at the bottom of the page, you can add your results. You can then see there three options to add results (positive & negative each): Karma, BZ#2211784, regression. Unless you want to enter into regression testing, please leave the regression status neutral. If you “play” with the kernel just by working with it in order to see if everything works fine with it, you can add Karma if you want (Karma = generally working). Most important in your case: BZ#2211784 is the bug you reported. With this option, you confirm that the bug is solved or not. If the kernel works for you without causing the issue you reported, then add here a thumbs up. If it does not solve the issue, thumbs down. Obviously, in your case with BZ#2211784, a positive Karma would imply a positive answer to BZ#2211784 :wink:

Once enough people have tested the new kernel (ideally with some confirming that the issue BZ#2211784 is solved in that kernel), it will be pushed to stable and then end up in the usual daily updates.

Bodhi - Fedora Project Wiki

You do not need to paste your results here, only in bodhi.
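As an illustration of the dnf command mentioned above: the value after --advisory= appears to be the update ID at the end of the Bodhi URL. A small sketch that builds the command from the URL; both helper names are hypothetical, and the exact command shown on the Bodhi page (which may carry additional options) should be preferred over this one.

```shell
# Extract the advisory ID (last path component) from a Bodhi update URL.
# Hypothetical helper.
advisory_id() {
    url="${1%/}"
    echo "${url##*/}"
}

# Print (not run) the dnf command for testing that advisory.
# Hypothetical helper; mirrors the command form quoted in the post above.
advisory_command() {
    echo "sudo dnf upgrade --refresh --advisory=$(advisory_id "$1")"
}

# Usage:
#   advisory_command "https://bodhi.fedoraproject.org/updates/FEDORA-2023-ed3bcae7e8"
```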


@py0xc3 Yes, I can confirm that the issue did not exist in any 6.2.X kernels but has only appeared in 6.3.X kernels. I tried booting into 6.3.5-200 with the “rhgb quiet” options removed from the kernel file and here is the output I saw when attempting to boot the kernel:

 Booting a command list

EFI stub: UEFI Secure Boot is enabled.

When booting the kernel with secure boot disabled, the output is:

 Booting a command list

In both cases, the system displays one of the two messages listed above and hangs indefinitely.

@rkoppelh Glad you found a solution to your issue!

I didn’t do much. Seems that if nvidia-drm.modeset=1 the new kernel does not install drm something or other. 6.3.5 forgot to incorporate that known change.

Sadly, it looks like your system is very different but shows the same symptoms, though I do not run UEFI secure boot. That is because my motherboard was manufactured during the BIOS → UEFI transition period (2012); while it is Intel UEFI, secure boot and NVMe are its weak points. I am willing to bet that if I turned on secure boot (not willing to go there), I would get the same issue text with 6.3.5 as you. See the nvidia comment in Understanding nvidia-drm.modeset=1 (NVIDIA Linux driver modesetting) - #2 by generix - Linux - NVIDIA Developer Forums

Maybe just for the hell of it on command line try setting to 0 or removing nvidia-drm.modeset=1 if you have it as a wild arse guess? Or if you don’t have it put it in?

My experience with Centos 7 to F36, now 38, is that whenever the system does not boot or has black screen it was always related to nvidia issues. I’ve never seen boot problems due to something else since using linux from 2008. Usually the next kernel fixed it. Though this time was the worst as no dmesg or journal logs. Literally no boot.

I was hoping open source nvidia, nouveau, and Wayland would fix these nvidia issues. But I find KDE and Wayland are a bit iffy. Xorg with KDE & Nvidia is rock solid. Don’t know about Gnome. I hate that desktop. I wish FEDORA/REDHAT/IBM would just bite the bullet and work with Nvidia to fix these persistent annoying niggles permanently. Yes, I get the free software arguments. But you also have to be pragmatic: nvidia makes good GPUs, they have a monopoly, i.e. they’ve won, and unless someone comes up with competition (AMD tried), which I now doubt, the Linux world is going to have to kiss the ring. Heresy, I hear you say! Maybe, but it’s the reality.