F40 kernel fails to boot due to device-mapper after LUKS2 password

Problem description

The boot process halts on Fedora 40 after the user enters the disk decryption password.

Note: Some details are redacted ([...], [partition device name], etc).

Boot logs

The following is a transcription of logs displayed on-screen during the boot sequence:

Please enter passphrase for disk [...] (luks-[...]):
device-mapper: table: 253:0 crypt: unknown target type
device-mapper: ioctl: error adding target to table
[FAILED] Failed to start systemd-cryptsetup@luks\x[...]
[DEPEND] Dependency failed for cryptsetup.target - Local Encrypted Volumes
[  OK  ] Reached target sysinit.target - System Initialization.
[  OK  ] Reached target basic.target - Basic System.

Affected kernels

The following kernels are affected and will not boot:

  • 6.8.10-300.fc40.x86_64 (fc40)
  • 6.8.11-300.fc40.x86_64 (latest)

The following kernel is not affected:

  • 6.8.10-200.fc39.x86_64 (fc39)

Encrypted partition details

After a boot from the unaffected kernel, the block device (luks-[...]/253:0) mentioned in the above logs was found in the following files and command outputs:

The first encrypted partition matches the block device by UUID. From /etc/crypttab (line #1):

luks-[...] UUID=[...] none discard

A match between the UUID and device numbers. From lsblk:

├─[partition device name]             259:7    0  [size]  0 part  
│ └─luks-[...]
│                                     253:0    0  [size]  0 crypt /home

Encryption parameters of the block device. From cryptsetup status /dev/mapper/luks-[...]:

/dev/mapper/luks-[...] is active and is in use.
  type:    LUKS2
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: keyring
  device:  /dev/[partition device name]
  sector size:  512
  mode:    read/write
  flags:   discards 

Investigating ramdisk contents

The following command output describes the initial ramdisks for the (fc39) and (fc40) kernels. The output was produced using lsinitrd.

This output establishes the presence of modules enabling dm-crypt device-mapper encryption in both ramdisks. Only the relevant parts of the command output are included.

The (fc39) kernel

Image: /boot/initramfs-6.8.10-200.fc39.x86_64.img: 56M
Version: dracut-059-16.fc39
Arguments:  -f

dracut modules:
crypt
dm

-rw-r--r--   1 root     root        28444 Nov 16  2023 usr/lib/modules/6.8.10-200.fc39.x86_64/kernel/drivers/md/dm-crypt.ko.xz

The (fc40) kernel

Image: /boot/initramfs-6.8.10-300.fc40.x86_64.img: 48M
Version: dracut-101-1.fc40
Arguments:  -f --kernel-image '/lib/modules/6.8.10-300.fc40.x86_64/vmlinuz' --kver '6.8.10-300.fc40.x86_64'

dracut modules:
crypt
dm

-rw-r--r--   1 root     root        28316 May 16 [time] usr/lib/modules/6.8.10-300.fc40.x86_64/kernel/drivers/md/dm-crypt.ko.xz

Note: The size discrepancy between the (fc39) and (fc40) ramdisks seems to be a result of certain modules being excluded from the (fc40) kernel ramdisk, such as bash, dbus, and network-related kernel modules.

Detailed ramdisk comparison

The following is a comparison of the file names present in the (fc39) and (fc40) kernel ramdisks. The comparison only targets file names containing crypt. To prevent false positives, file paths were sanitized so as not to contain the kernel version.
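
A comparison like the one described could be produced along these lines. This is a minimal sketch, not necessarily the commands that were actually used: crypt_paths is a hypothetical helper, the KVER placeholder is arbitrary, and the sketch assumes lsinitrd prints an ls -l style listing with the path in the ninth column (paths containing spaces would be cut short).

```shell
# crypt_paths KERNEL_VERSION
# Reads an `lsinitrd` long listing on stdin and prints the sorted, unique
# paths that contain "crypt", with the kernel version replaced by KVER so
# that listings from two different kernels can be diffed.
crypt_paths() {
    awk -v kver="$1" '
        # Only consider ls -l style entries (permission string in column 1).
        NF >= 9 && $1 ~ /^[-dlcbps]/ {
            path = $9
            # Literal (non-regex) replacement of every kernel version occurrence.
            while ((i = index(path, kver)) > 0)
                path = substr(path, 1, i - 1) "KVER" substr(path, i + length(kver))
            if (index(path, "crypt") > 0)
                print path
        }' | sort -u
}

# Possible usage (image paths taken from the lsinitrd output above):
#   lsinitrd /boot/initramfs-6.8.10-200.fc39.x86_64.img | crypt_paths 6.8.10-200.fc39.x86_64 > fc39.txt
#   lsinitrd /boot/initramfs-6.8.10-300.fc40.x86_64.img | crypt_paths 6.8.10-300.fc40.x86_64 > fc40.txt
#   diff fc39.txt fc40.txt
```

Any line that diff reports is then a crypt-related file present in only one of the two ramdisks.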

Experiments

The following experiments were run to further diagnose the issue:

Booting to initrd shell

To query loaded kernel modules, a boot into the initrd shell was attempted using the rd.shell flag on the (latest) kernel in GRUB. The boot sequence was not affected by this flag.

Booting to emergency shell

To query loaded kernel modules, a boot into systemd’s emergency shell was attempted twice, using the systemd.unit=emergency.target and systemd.unit=rescue.target kernel flags on the (latest) kernel in GRUB. The boot sequence was not affected by these flags.

Mitigations

The following mitigations were put in place:

Kernel rotation

To prevent the unaffected kernels from being rotated out, the following edit was made to /etc/dnf/dnf.conf:

installonly_limit=0

Can you always reproduce it that way? So, each time you boot the fc39 kernel it works, and each time you choose a fc40 kernel it breaks? So that you can turn that behavior on and off as often as you want?

I ask so insistently because I have an issue that could show the behavior you describe if it affected the root device (in my case, it affects a different device).

That issue is not kernel-related, though; it is caused by a recent update of selinux-policy. In my case, the policy update is the constant in the problem, not the kernel.

(my bug report: 2292191 – 40.22-1.fc40 breaks cryptsetup action on startup: related denial logged during boot since update to 40.22-1 -> related device no longer mounted on boot )

If the answer to all of my questions above is yes, then this is not related and thus a different issue. But I wanted to be sure :wink:

Can you always reproduce it that way? So, each time you boot the fc39 kernel it works, and each time you choose a fc40 kernel it breaks?

That is indeed the case.


Leaving this link here to explore using rd.break later

Ok, then forget the case I mentioned: it is then not related.

I added kernel to your topic to make relevant people aware of the case.

I suggest trying the new kernel, since we are currently transitioning to 6.9 (6.9.4-200).

Maybe updating to the next kernel resolves the issue (it also contains fixes):
sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-8c4744962d

Keep in mind that this kernel is currently in testing. All automated tests have passed (which usually leaves little room for problems), and a lot of people have already tested the kernel, but it is still advisable to back up your data before testing it, just to be on the safe side.

If you run the above command on your system, it will get all related and necessary packages to install the new kernel. Then reboot and see if it solves your issue.

Maybe your issue was already identified and fixed in the new kernel, or obsoleted by some changes (it contains a lot).

If you want, you can also add karma on Bodhi once you have tested the new one:
https://bodhi.fedoraproject.org/updates/FEDORA-2024-8c4744962d

I’ve entered the initrd shell on (latest) using rd.break=initqueue on the kernel command line, since any later breakpoint triggered the password prompt for the dm mount.

Results: dm_crypt does not appear in the 35 modules listed by /proc/modules. The module(s) are not blacklisted by any modprobe configuration files in /etc/modprobe.d/ or /usr/lib/modprobe.d. The insmod and modprobe executables are not in PATH, i.e. {/usr,/usr/local}{/bin,/sbin}, either.
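
The /proc/modules check can be sketched with shell builtins only, which matters in an initrd shell where external tools may be missing. module_loaded is a hypothetical helper, and the optional second argument exists purely so the function can be tried against a saved copy of the file.

```shell
# module_loaded NAME [FILE]
# Succeeds if NAME appears as the first field of a line in FILE
# (default: /proc/modules). Uses only shell builtins.
module_loaded() {
    while read -r name _rest; do
        [ "$name" = "$1" ] && return 0
    done < "${2:-/proc/modules}"
    return 1
}

# Possible usage in the initrd shell:
#   if module_loaded dm_crypt; then echo "dm_crypt loaded"; else echo "dm_crypt missing"; fi
```

Note that /proc/modules lists the module with an underscore (dm_crypt), while the file on disk uses a hyphen (dm-crypt.ko.xz).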

Given the same process on (fc39), the module list is also missing the dm_crypt module(s). However, the executables /usr/sbin/insmod and /usr/sbin/modprobe were found on the system.

The modprobe -D dm_crypt command identified the path of the module(s) at /usr/lib/modules/6.8.10-200.fc39.x86_64/kernel/drivers/md/dm-crypt.ko.xz. The modprobe dm_crypt command can then be used to successfully load the module(s), as evidenced by new entries in /proc/modules.

Investigating further, the lsinitrd output on the (fc39) and (fc40) kernels shows the following:

Both (fc39) and (fc40)'s ramdisks contain the /usr/bin/kmod executable.

However, only the (fc39) ramdisk contains the symlinks /usr/sbin/{depmod,insmod,lsmod,modinfo,modprobe,rmmod} targeting the aforementioned /usr/bin/kmod executable.
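
The same check can be repeated against any initramfs image from the running system. The sketch below assumes that lsinitrd prints symlinks as "name -> target" in its long listing; missing_kmod_links is a hypothetical helper that prints each tool name for which no such symlink to kmod is found.

```shell
# missing_kmod_links
# Reads an `lsinitrd` long listing on stdin and prints the name of every
# kmod front-end whose usr/sbin symlink to kmod is absent from the listing.
missing_kmod_links() {
    listing=$(cat)
    for tool in depmod insmod lsmod modinfo modprobe rmmod; do
        printf '%s\n' "$listing" |
            grep -q "usr/sbin/$tool -> .*kmod" || echo "$tool"
    done
}

# Possible usage (image path taken from the lsinitrd output above):
#   lsinitrd /boot/initramfs-6.8.10-300.fc40.x86_64.img | missing_kmod_links
```

An empty result means all six symlinks are present; on the affected (fc40) ramdisk described above, all six names would be expected in the output.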

I was able to mount and decrypt the root filesystem from (latest) using the following steps:

  • Boot into (latest) and drop into the dracut shell using rd.break=initqueue
  • Symlink /usr/sbin/modprobe to /usr/bin/kmod
  • Run modprobe dm_crypt
  • Run systemctl start cryptsetup.target and enter the decryption password
  • Mount the resulting /dev/mapper/luks-(...) device to a new /run/root directory, using mount with options to -o matching the rootflags=... kernel command-line argument
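
For reference, the steps above can be written out as commands. The sketch below only prints them, so they can be reviewed and typed by hand in the dracut shell. print_recovery_steps is a hypothetical helper, the device path and mount options must be supplied by the reader (they are redacted in this post), and the mkdir step assumes mkdir is available in the ramdisk.

```shell
# print_recovery_steps DEVICE ROOTFLAGS
# Prints (does not run) the manual workaround commands for the dracut shell.
#   DEVICE     the mapper device created by cryptsetup, e.g. /dev/mapper/luks-[...]
#   ROOTFLAGS  the value of rootflags=... from the kernel command line
print_recovery_steps() {
    dev=$1
    flags=$2
    cat <<EOF
ln -s /usr/bin/kmod /usr/sbin/modprobe
modprobe dm_crypt
systemctl start cryptsetup.target
mkdir -p /run/root
mount -o $flags $dev /run/root
EOF
}
```

The symlink works because kmod decides which tool to act as based on the name it is invoked under, which is why only the missing links, and not the binary itself, prevent the module from being loaded.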

I have verified that the /run/root/ directory contains my root file system, as expected. However, assuming the next step is to change the root directory to /run/root and run init, I have been unable to find the chroot or switch_root commands. Help here is appreciated, as well as a long-term fix.

The 6.9.4-200 kernel has since been released. The issue persists: my analysis confirms the same symlinks are missing.

This seems like an error in your dracut installation.
According to /usr/lib/dracut/modules.d/00systemd/module-setup.sh, the following commands (among many others) should be installed in the initramfs:

        journalctl systemctl \
        echo swapoff \
        kmod insmod rmmod modprobe modinfo depmod lsmod \
        mount umount reboot poweroff \
        systemd-run systemd-escape \
        systemd-cgls systemd-tmpfiles \
        systemd-ask-password systemd-tty-ask-password-agent \
        /etc/udev/udev.hwdb

I’m running into similar issues (posted here without a single answer). I checked that file and it looks exactly like what you posted. There is already kernel 6.9.8 out but the error persists for me.

The problem has been solved for me, apparently with the recent dracut 102 update.

Booting with the 6.9.9-200.fc40.x86_64 kernel, the boot process behaves as expected following LUKS password entry, so I’m reverting my installonly_limit config value and booting with the newest kernels from here on out.