F41: Drive not found after upgrade (nvme module not loaded)

After upgrading to Fedora 41 with dnf system-upgrade, my laptop failed to boot.

(As an aside, this is the first “failed” upgrade I’ve had in years!)

I could see that the systemd-modules-load service had failed, and after a few minutes the system timed out and dropped to the emergency shell.

A bit of digging revealed that my SSD wasn’t listed in /dev, and lsmod returned an empty list.

At the emergency shell prompt:

modprobe nvme
systemctl restart systemd-modules-load
[enter LUKS passphrase at prompt]
lvm_scan
mount /dev/mapper/xxxxx /sysroot
mount -a
exit

This let me enter the passphrase for the drive.
I’m not sure that the mount commands were necessary, but this sequence allowed me to finish booting into a normal graphical session and troubleshoot the issue.

After booting and troubleshooting:
The current workaround has been to run:

sudo grubby --update-kernel=ALL --args rd.driver.pre=nvme

This forces the module to load, and boot now continues as normal.
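
To confirm the argument actually reached the kernel after a reboot, something like this rough sketch should do. Note that has_karg is just a made-up helper name for illustration, not part of grubby or any Fedora tool:

```shell
# Hypothetical helper (not a real tool): succeeds if the kernel
# command line in $1 contains the exact argument in $2.
has_karg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the command line of the currently running kernel.
if has_karg "$(cat /proc/cmdline 2>/dev/null)" "rd.driver.pre=nvme"; then
    echo "rd.driver.pre=nvme is active for this boot"
else
    echo "rd.driver.pre=nvme is not on the running kernel's command line"
fi
```

Running sudo grubby --info=ALL should also list the args configured for each installed kernel entry, which is useful for checking before rebooting.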

The systemd-modules-load service still fails during boot, and both it and systemd-udevd continue to report the same libkmod error in the logs, so that error may or may not be relevant.

Troubleshooting / background information:
Probably relevant log entries:

systemd-modules-load[302]: Failed to initialize libkmod context: Operation not supported
systemd-udevd[467]: Failed to initialize libkmod context: Operation not supported

dracut-initqueue[496]: Warning: Could not boot.

The Failed to initialize libkmod messages also appear if I try to run /usr/lib/systemd/systemd-modules-load directly in the emergency shell.

I believe the warning from dracut-initqueue is expected in this scenario.

I have tried recreating the initrd (dracut --regenerate-all --force) before and after updating systemd to the version in updates-testing (systemd 256 (256.8-1.fc41)) with no change.
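
For anyone repeating this, a quick way to double-check that a regenerated initrd still contains the nvme module is to search the lsinitrd listing. This is only a sketch: listing_has_nvme is a hypothetical helper name, and I'm assuming the module file is called nvme.ko, optionally with an .xz, .zst or .gz suffix:

```shell
# Hypothetical helper (not a real tool): reads an initramfs file listing
# on stdin and succeeds if it contains the nvme kernel module.
# Assumes the file is named nvme.ko, optionally compressed.
listing_has_nvme() {
    grep -Eq '/nvme\.ko(\.(xz|zst|gz))?$'
}

# Usage on the affected machine (root is needed to read the initramfs):
#   sudo lsinitrd | listing_has_nvme && echo "nvme module is in the initramfs"
```

Since modprobe nvme worked from the emergency shell, the module is presumably already in the image, so this mostly rules out a packaging problem rather than explaining the libkmod error.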

I also ran restorecon in case this was an SELinux file labelling issue, although ls -laZ in the emergency shell just shows a question mark for contexts, so possibly this is too early in the boot process for SELinux?

Nothing in the logs from the upgrade suggests any errors or problems, and I haven’t found any relevant upstream bugs based on that log message.

I’m happy to take further troubleshooting suggestions to try when I have a chance. I’ve filed this as BZ#2326621.


Maybe running touch /.autorelabel from the emergency shell instead of restorecon, or running restorecon after the system is booted, would have the desired effect.


I should have been clearer about the order of things there. restorecon was run from the booted system.
The point about missing contexts when running ls -laZ in the emergency shell was more because I think this is happening too early in the boot process for SELinux to be loaded.

I just tried touch /.autorelabel and let the relabel complete on the next reboot, in case it did something different. Unfortunately the problem persists.
