Fedora not booting, emergency mode after update to kernel 6.5.12 - failed to start initrd-switch-root.service

After updating a Fedora 37 installation and initiating a release upgrade to Fedora 39 using gnome-software, Fedora won’t boot anymore and gets stuck in emergency mode:

Failed to start initrd-switch-root.service

In emergency mode, systemctl status initrd-switch-root shows this error:

Failed to switch root: Specified switch root path ‘/sysroot’ does not seem to be an OS tree. os-release file is missing.


This happens with kernel 6.5.12, which was just installed by the update. Selecting the previous kernel version 6.5.6 in the boot menu works, so the “os-release” file is indeed not missing. After selecting that kernel version, the previously prepared upgrade began and installed kernel version 6.6.11, which also fails to boot showing the same error.

Fedora is installed in an encrypted BTRFS partition and since it worked before the update, it seems like kernel versions > 6.5.6 fail to read or mount the OS partition…

Any ideas how to debug this from within the emergency shell?

Using a Fedora 39 live disk, opening the encrypted partition using “cryptsetup luksOpen” and then mounting it works fine. And in emergency mode, that partition is already decrypted, so it’s possible to mount the block device in /dev/mapper/…, which contains the root filesystem with the OS installation. But /sysroot is not mounted, so it seems like something changed with the update, so that the root filesystem won’t be mounted automatically anymore…

That rdsosreport.txt file might contain more information about what went wrong.

Also, you might want to double-check that the parameters being passed to the kernel are consistent between the old (working) and the new (broken) versions. You can use cat /proc/cmdline (or inspect the entry from GRUB/sd-boot/etc.) to inspect the kernel parameters.
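If eyeballing two long command lines is error-prone, a small helper can list the parameters present in one but absent from the other. This is only a sketch: `missing_params` is a made-up name, and the two sample strings are placeholders rather than values from the affected system.

```shell
# List parameters that appear in the first command line but not in the second.
# Assumes parameters are whitespace-separated and contain no glob characters.
missing_params() {
    # Normalize the second command line to single spaces, padded on both ends.
    haystack=" $(printf '%s' "$2" | tr -s '[:space:]' ' ') "
    for p in $1; do
        case "$haystack" in
            *" $p "*) ;;                 # present in both, nothing to report
            *) printf '%s\n' "$p" ;;     # only in the first command line
        esac
    done
}

# Placeholder values for illustration only.
old='ro root=UUID=EXAMPLE-UUID rhgb quiet modprobe.blacklist=nouveau'
new='ro rhgb quiet modprobe.blacklist=nouveau'

missing_params "$old" "$new"
```

On a real system, the first argument could be the contents of /proc/cmdline from the working boot, and the second the parameters shown for the broken entry.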

Good idea! Although I have not found anything in that file yet, comparing the boot command options in the GRUB menu shows a major difference: the entries for the new kernels are missing the root=UUID=… option, which should point the kernel to the root filesystem!

So this has nothing to do with the new kernel itself. Instead, it seems like something in the default GRUB configuration was broken, maybe during maintenance (there’s modprobe.blacklist=nouveau which was added to fix graphics issues but that’s not new, it’s in the kernel command line of the old kernel too).

Yeah, that would do it. :slightly_smiling_face:

I think /etc/kernel/cmdline is supposed to be the authoritative source for what parameters get passed to the kernel on more recent versions of Fedora Linux. Older versions used /etc/default/grub.

The contents of /etc/kernel/cmdline should be almost identical to what you see in /proc/cmdline for your working system. /etc/kernel/cmdline should not contain the BOOT_IMAGE=... or initrd=... items from /proc/cmdline (those will be added at runtime by the bootloader).
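As a rough sketch of that filtering, the helper below drops the bootloader-added items from a command line read on standard input. The name `strip_bootloader_items` and the sample string are assumptions for illustration; the filter itself is the same `grep -v` idea used in the 90-loaderentry.install excerpt quoted further down the thread.

```shell
# Remove BOOT_IMAGE=... and initrd=... items from a kernel command line.
# Reads the command line on stdin and prints the remaining parameters.
strip_bootloader_items() {
    tr -s '[:space:]' '\n' \
        | grep -v -e '^BOOT_IMAGE=' -e '^initrd=' -e '^$' \
        | tr '\n' ' ' \
        | sed 's/ *$//'
}

# Placeholder command line for illustration only.
sample='BOOT_IMAGE=(hd0,gpt2)/vmlinuz-EXAMPLE ro root=UUID=EXAMPLE-UUID rhgb quiet'
printf '%s\n' "$sample" | strip_bootloader_items
```

Booted into the working kernel, running `strip_bootloader_items </proc/cmdline` and comparing the result with /etc/kernel/cmdline should show whether the two agree.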

Once you have /etc/kernel/cmdline (or /etc/default/grub) fixed, you’ll need to run something like kernel-install add 6.6.9-100.fc38.x86_64 /lib/modules/6.6.9-100.fc38.x86_64/vmlinuz to regenerate the boot entry (or you could try dnf reinstall kernel-6.6.9-100.fc38.x86_64). (Substitute 6.6.9-100.fc38.x86_64 with whatever kernel version you are having trouble with.)

Thanks for the hints. I manually copied the missing boot options from the old loader file to the new one: /boot/loader/entries/*6.5.6-*.conf vs. /boot/loader/entries/*6.6.11-*.conf
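For anyone who would rather script that copy than edit by hand, here is a minimal sketch. It assumes each entry file carries its parameters on a single line starting with "options "; the helper name `copy_options` and the demo files are invented for illustration, and it runs against throwaway copies rather than the real files under /boot/loader/entries.

```shell
# Replace the "options" line of a target entry with the one from a source entry.
# $1 = working entry file, $2 = broken entry file (modified in place).
copy_options() {
    opts=$(grep -m1 '^options ' "$1") || return 1   # give up if the source has none
    grep -v '^options ' "$2" >"$2.new"              # keep every other line as-is
    printf '%s\n' "$opts" >>"$2.new"                # append the copied options line
    mv "$2.new" "$2"
}

# Demo on temporary files with placeholder content.
demo=$(mktemp -d)
printf 'title Old kernel\noptions ro root=UUID=EXAMPLE-UUID rhgb quiet\n' >"$demo/old.conf"
printf 'title New kernel\noptions ro rhgb quiet\n' >"$demo/new.conf"

copy_options "$demo/old.conf" "$demo/new.conf"
demo_result=$(cat "$demo/new.conf")
printf '%s\n' "$demo_result"
rm -rf "$demo"
```

Keeping a backup of the original entry before touching it is a sensible precaution, since a broken options line means another trip to the emergency shell.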

Then I used grubby (doc) to verify the current boot entry:

# grubby --info $(grubby --default-kernel) | grep args
args="ro resume=UUID=...

Side note for all those who vaguely remember some grub2-mkconfig command: the kernel command line is not stored in the global /boot/grub2/grub.cfg file. Instead, that file reads the conf files from the “entries” directory to create the boot menu entries. That’s why the global grub.cfg did not require any modifications, at least in this installation (Fedora 39).

Fixed.
But the question remains: Why was that file broken?

That’s even simpler than re-running the kernel-install script. However, the next time you update your system, you will likely be back at “square one”. One of the last things that happens when a new kernel is installed is a call to that kernel-install script. It, in turn, calls the scripts under /usr/lib/kernel/install.d, one of which is 90-loaderentry.install. That script then reads /etc/kernel/cmdline to get the parameters for the new kernel.

90-loaderentry.install:

...
if [ -n "$KERNEL_INSTALL_CONF_ROOT" ]; then
    if [ -f "$KERNEL_INSTALL_CONF_ROOT/cmdline" ]; then
        BOOT_OPTIONS="$(tr -s "$IFS" ' ' <"$KERNEL_INSTALL_CONF_ROOT/cmdline")"
    fi
elif [ -f /etc/kernel/cmdline ]; then
    BOOT_OPTIONS="$(tr -s "$IFS" ' ' </etc/kernel/cmdline)"
elif [ -f /usr/lib/kernel/cmdline ]; then
    BOOT_OPTIONS="$(tr -s "$IFS" ' ' </usr/lib/kernel/cmdline)"
else
    BOOT_OPTIONS="$(tr -s "$IFS" '\n' </proc/cmdline | grep -ve '^BOOT_IMAGE=' -e '^initrd=' | tr '\n' ' ')"
fi
...

It looks like it has a fallback path where it will try to use /proc/cmdline directly. So it might work even if you don’t fix /etc/kernel/cmdline. But then again, that fallback has obviously failed you before, so I wouldn’t count on it.
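To see which branch of that excerpt would apply on a given system, the same checks can be replayed without installing anything. This is only a diagnostic sketch that mirrors the quoted excerpt; `cmdline_source` and its optional root-prefix argument are invented here so the logic can be tried against a scratch directory.

```shell
# Report which file the quoted excerpt would read its boot options from.
# $1 is an optional root prefix (hypothetical, for trying this on a scratch tree).
cmdline_source() {
    root="${1:-}"
    if [ -n "${KERNEL_INSTALL_CONF_ROOT:-}" ]; then
        if [ -f "$KERNEL_INSTALL_CONF_ROOT/cmdline" ]; then
            echo "$KERNEL_INSTALL_CONF_ROOT/cmdline"
        else
            # In the excerpt this branch has no fallback: BOOT_OPTIONS is left unset.
            echo "none"
        fi
    elif [ -f "$root/etc/kernel/cmdline" ]; then
        echo "$root/etc/kernel/cmdline"
    elif [ -f "$root/usr/lib/kernel/cmdline" ]; then
        echo "$root/usr/lib/kernel/cmdline"
    else
        echo "$root/proc/cmdline"
    fi
}

cmdline_source
```

One detail the replay makes visible: in the excerpt as quoted, the /proc/cmdline fallback is only reached when KERNEL_INSTALL_CONF_ROOT is empty and neither cmdline file exists.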

Yeah, I know. Forgot to mention that I fixed /etc/default/grub too.

Still wondering what could’ve broken that file though.

Oh, I’m not sure, but I think GRUB switched from using /etc/default/grub to /etc/kernel/cmdline at some point (maybe that was this commit?). There is a comment along the lines of “… afterward, grubby will take care of syncing on updates …”. :person_shrugging:

It might be possible that you could have “missed” the update that made the transition if you, e.g., skipped a Fedora release when updating your system or something like that. I don’t really know.

Edit: And it looks like there is a follow-up commit here that tries to copy things from /etc/default/grub to /etc/kernel/cmdline.

Among other things in that last commit is a rather unusual test – [[ /etc/kernel/cmdline -ot /etc/default/grub ]] – that checks if /etc/kernel/cmdline is older than /etc/default/grub and only copies the contents of /etc/default/grub if /etc/kernel/cmdline is older. So if you did something equivalent to touch /etc/kernel/cmdline at some point when that file didn’t previously exist, then that could have caused the result you ended up with. Those GRUB scripts are really complex. I’m amazed they work as well as they do really. Personally, I gave up on GRUB a long time ago and switched to sd-boot. :wink:
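The effect of that timestamp test is easy to reproduce with two scratch files. The sketch below is not the GRUB script itself: `sync_decision` and the file names are stand-ins, and it uses the `[ ... -ot ... ]` form of the test on files that both exist.

```shell
# Decide what a timestamp-based sync would do.
# $1 stands in for /etc/kernel/cmdline, $2 for /etc/default/grub.
sync_decision() {
    if [ "$1" -ot "$2" ]; then
        echo "copy from grub defaults"   # cmdline is older, so it gets overwritten
    else
        echo "keep existing cmdline"     # cmdline is newer (or same age), left alone
    fi
}

work=$(mktemp -d)
touch -t 202301010000 "$work/default-grub"     # older file with the full options
touch -t 202401010000 "$work/kernel-cmdline"   # touched later, so it is newer

result=$(sync_decision "$work/kernel-cmdline" "$work/default-grub")
echo "$result"
rm -rf "$work"
```

With the timestamps in that order, the sketch reports that the existing cmdline is kept, which matches the scenario where a freshly touched /etc/kernel/cmdline never receives the contents of /etc/default/grub.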

Anytime you run grub2-mkconfig, /etc/kernel/cmdline will be updated with the information found in /etc/default/grub.

Is that true if the user manually updated the values in /etc/kernel/cmdline though? If so, then /etc/kernel/cmdline wouldn’t be the “authoritative” source of information, but rather /etc/default/grub.

Just going from the code in that commit, it looks like it is comparing the timestamps and using whichever file has a newer timestamp. In which case, they are both authoritative in a weird sort of way (depending on which one was last edited). But there may be a yet later commit that further alters the logic or there may be other components at play. I cannot keep up with it. :slightly_smiling_face:

That is correct: it takes the information from the newer file. There are no newer commits in this area, but it took several commits to make it work.

Another side effect of grub2-mkconfig is that the options lines in the BLS entry files are updated.