Initramfs not being generated after kernel updates

The last two kernel updates on my f41 box failed to generate the initramfs files in the boot folder.

I had to force the creation manually using dracut; after that, the system boots fine.

Is there something I can do to enable/force the creation of the initramfs file after the next kernel update?

Kernel updates call kernel-install add … in the RPM posttrans stage. That, in turn, calls /usr/lib/kernel/install.d/50-dracut.install which then runs dracut -f ....

That 50-dracut.install has this interesting line near the top:

# If KERNEL_INSTALL_MACHINE_ID is defined but empty, BOOT_DIR_ABS is a fake directory.
# In this case, do not create the initrd.
if ! [[ ${KERNEL_INSTALL_MACHINE_ID-x} ]]; then
    exit 0
fi
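
The `${KERNEL_INSTALL_MACHINE_ID-x}` expansion substitutes `x` only when the variable is unset, so the check triggers only when the variable is set but empty. A small sketch of the three cases follows; the function name `would_skip_initrd` is made up for illustration, and it uses the POSIX `[ -n ... ]` test in place of the plugin's `[[ ... ]]`:

```shell
#!/bin/sh
# Hypothetical helper mirroring the check quoted above.
# Prints "skip" when the plugin would exit early, "build" otherwise.
would_skip_initrd() {
    # ${VAR-x} yields "x" if VAR is unset, and VAR's value (possibly empty) if set.
    if ! [ -n "${KERNEL_INSTALL_MACHINE_ID-x}" ]; then
        echo skip
    else
        echo build
    fi
}

unset KERNEL_INSTALL_MACHINE_ID
would_skip_initrd    # unset: expands to "x", so the initrd is built

KERNEL_INSTALL_MACHINE_ID=""
would_skip_initrd    # set but empty: expands to "", so the plugin exits early

KERNEL_INSTALL_MACHINE_ID="0123456789abcdef0123456789abcdef"
would_skip_initrd    # set and non-empty: the initrd is built
```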

But that is just one of many possible reasons why the script might be failing. I’d try making a temporary copy of the script somewhere and adding set -x near the top. Then try to run it manually with the right parameters and see if you can find where it is stopping/failing.
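
A minimal sketch of the "temporary copy with set -x" idea. The helper name `make_traced_copy` is hypothetical, and the arguments for the manual run are left as placeholders because they depend on how kernel-install calls the plugin on your system (see kernel-install(8)):

```shell
#!/bin/sh
# make_traced_copy SRC: copy a plugin script to a temporary file with
# "set -x" inserted right after its first line (the shebang), then print
# the path of the copy. The original script is left untouched.
make_traced_copy() {
    src=$1
    copy=$(mktemp) || return 1
    awk 'NR == 1 { print; print "set -x"; next } { print }' "$src" > "$copy" || return 1
    chmod +x "$copy"
    printf '%s\n' "$copy"
}

# Usage (not run here; needs root, placeholders must be filled in):
#   traced=$(make_traced_copy /usr/lib/kernel/install.d/50-dracut.install)
#   sudo "$traced" add <kernel-version> <entry-dir> <kernel-image>
```

The plugin also reads environment variables such as KERNEL_INSTALL_MACHINE_ID, so a manual run may behave differently from a run under kernel-install unless those are set the same way.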

Edit: Another (highly unlikely) reason why the initramfs generation could be skipped is if one of the earlier scripts from /usr/lib/kernel/install.d or /etc/kernel/install.d exits with code 77:

$ man kernel-install | grep -m 1 -C 1 77
       An executable placed in these directories should return 0 on success. It may also
       return 77 to cause the whole operation to terminate (executables later in lexical order
       will be skipped).

You can also add the -v option instead, for example

kernel-install -v add 6.12.7-200.fc41.x86_64 /lib/modules/6.12.7-200.fc41.x86_64/vmlinuz

It’s a known bug in dkms; a fix is currently going through testing.

Until you get the updated dkms you will need to manually build the initrd after each new kernel install.
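
To see which kernels need the manual rebuild, something like the following could help. It is a sketch that assumes the Fedora layout (kernel image at /lib/modules/<version>/vmlinuz, initramfs at /boot/initramfs-<version>.img); the function name `list_missing_initramfs` is made up. It only prints the dracut commands and does not run them:

```shell
#!/bin/sh
# list_missing_initramfs [MODULES_DIR] [BOOT_DIR]
# Print each kernel version that has a vmlinuz under MODULES_DIR but no
# initramfs-<version>.img in BOOT_DIR (paths are assumptions, see above).
list_missing_initramfs() {
    modules_dir=${1:-/lib/modules}
    boot_dir=${2:-/boot}
    for dir in "$modules_dir"/*/; do
        [ -e "${dir}vmlinuz" ] || continue      # skip leftover module dirs
        kver=$(basename "$dir")
        [ -e "$boot_dir/initramfs-$kver.img" ] || printf '%s\n' "$kver"
    done
}

# Print (do not execute) a rebuild command for every kernel found.
for kver in $(list_missing_initramfs); do
    echo "sudo dracut -f /boot/initramfs-$kver.img $kver"
done
```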

Interesting, it looks like kernel-install was changed a few years ago so that it fails immediately and does not run any remaining plugins when one plugin in the chain fails: kernel-install: if a plugin fails, return error immediately · systemd/systemd@5aa285b · GitHub

Prior to that commit, kernel-install would run all plugins unless one of them explicitly returned 77. Maybe the documentation should be updated? (Or maybe the change should be reverted? Are the plugins under install.d meant to be inter-dependent?)

A good question perhaps for @zbyszek.

If you don’t want 40-dkms.install to fail the rest of the procedures, you could add exit 0 to the end of the script.

That would work, but I would suggest copying /usr/lib/kernel/install.d/40-dkms.install to /etc/kernel/install.d/40-dkms.install first. Otherwise, the change would be overwritten on the next update.
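
Put together, the workaround might look like this. It is only a sketch: `make_override` is a made-up helper, and appending exit 0 only helps if the script actually reaches its last line (an explicit exit or a `set -e` abort earlier in the script would still report failure):

```shell
#!/bin/sh
# make_override SRC_DIR DST_DIR NAME
# Copy plugin NAME from SRC_DIR to DST_DIR (preserving mode) and append
# "exit 0", so the copy reports success when it runs to the end.
make_override() {
    src_dir=$1
    dst_dir=$2
    name=$3
    mkdir -p "$dst_dir" || return 1
    cp -p "$src_dir/$name" "$dst_dir/$name" || return 1
    printf '\nexit 0\n' >> "$dst_dir/$name"
}

# Usage (not run here; needs root):
#   make_override /usr/lib/kernel/install.d /etc/kernel/install.d 40-dkms.install
```

Remember to delete the copy in /etc/kernel/install.d once the fixed dkms package is installed, otherwise the override keeps shadowing the packaged script.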

It has been fixed in dkms itself. 3.1.4-3 is in testing repos and works for me.

Of course the dkms module still fails to build as the vendor has no support for 6.12 yet, but the initramfs is built.

I’m pretty sure the installation should fail if any of the plugins report failure. Creation of an initrd or installation of the kernel is not something where we’d want to continue after a partial failure and take our chances. Commit 5aa285b made the failure immediate; before that, the program would fail at the end. The plugins can build on one another, so if one fails, we cannot be sure that any of the work done later is at all useful. And we’re going to fail at the end anyway, so doing further work after an error is not particularly helpful.

The documentation seems correct. If you see place for changes, please show where exactly, or even better, just submit a PR, that’s probably going to lead to a quicker solution.

(The whole kernel-install machinery is … ugly, because of history and backwards compatibility. E.g. the “feature” of allowing a special return value to skip further plugins was added to incorporate what grubby was doing. Both dracut and the kernel-install plugins were in the past written with little thought given to modularity and configurability, and error handling was only sporadic. And also, all modifications were done “in place”, while nowadays we want installation to be atomic and leave the system unchanged if any error occurs, as much as possible.
We try to evolve this system while maintaining backwards compatibility, so it’s both overcomplicated and fragile. But I’m pretty sure that the changes towards stricter error handling and cleanup-on-failure are in the right direction.)

In this particular case, I think that the only solution is to fix the dkms plugin. Kernel-install cannot and shouldn’t try to paper over a failure in another component.

That’s fine. It just appears to be a different philosophy than what the original implementation was using. The old way had a “continue on error and let the user sort it out” methodology. So, e.g., if a video driver failed to build, the system was expected to fall back to a more basic driver (probably simpledrm these days) or otherwise, the user was expected to work around the problem. Under the new all-or-nothing philosophy, a kernel should not have been added to the boot loader since the initramfs failed to build.

The line about needing to exit with code 77 to prevent further plugins from executing needs to be changed to indicate that any non-zero exit code will cause all remaining plugins to be skipped.

That is how it’s supposed to work. Both kernel-install and any plugins distributed by systemd are supposed to leave no permanent changes if any plugin reported an error.

With 77, they are skipped. With a different non-zero return value, the whole operation is aborted. But how exactly that happens is an implementation detail that doesn’t need to be documented.
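
A toy model of the exit-code handling as described in this thread. This is not kernel-install's actual code: the function name `run_plugins` is made up, plugins are run with `sh` for simplicity, and treating 77 as an overall success is an assumption:

```shell
#!/bin/sh
# run_plugins DIR: run every *.install file in DIR in lexical order.
#   exit 0       -> continue with the next plugin
#   exit 77      -> skip the remaining plugins, overall status 0 (assumed)
#   other status -> abort immediately and return that status
run_plugins() {
    for plugin in "$1"/*.install; do
        [ -f "$plugin" ] || continue
        status=0
        sh "$plugin" || status=$?
        if [ "$status" -eq 77 ]; then
            return 0
        elif [ "$status" -ne 0 ]; then
            return "$status"
        fi
    done
    return 0
}
```

In this model, a failing 40-dkms.install stops the chain before 50-dracut.install runs, which matches the missing initramfs reported at the top of the thread.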

The bls for grub2 is created early on by 20-grub.install whereas the bls for sd-boot is created late by 90-loaderentry.install.

There are two types of dkms modules, optional ones and critical ones.
For example, GPU drivers would be critical, but VM copy-and-paste, as in my specific case, is optional.

I would want to be able to control whether a build failure is critical or not, with the default being critical.

Yeah. The installation is more robust if the entry is created atomically at the end. Kernel-install provides the mechanism to DTRT, but we don’t control the implementation of all plugins.

I’m not sure if this is such a great idea. I think that all failures in this area should be fixed. But if you want this behaviour, or the authors of a plugin want this behaviour, then all they need to do is to print a warning and return 0. The mechanism is available.
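
As a rough illustration of "print a warning and return 0", a plugin author could wrap the non-critical step like this. The helper name `run_optional` is hypothetical and not part of dkms or kernel-install:

```shell
#!/bin/sh
# run_optional CMD [ARGS...]: run a step that is not critical for booting.
# If it fails, print a warning to stderr and return 0 so that the caller
# (and therefore the plugin) still reports success.
run_optional() {
    "$@" && return 0
    status=$?
    echo "warning: optional step '$*' failed with status $status; continuing" >&2
    return 0
}

# Usage inside a plugin (build_optional_module is a placeholder):
#   run_optional build_optional_module "$KERNEL_VERSION"
```

A critical step would simply be called without the wrapper, so its failure still aborts the installation.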

If you arrange for 90-loaderentry.install to build the bls for grub2 it will work quite nicely – and that would simplify things. One of my systems has been running like that for several years with legacy BIOS boot.

Thankfully fixed with 6.12.8-200.fc41.x86_64

FYI: It’s fixed in the dkms package, not the kernel packages.

I’ve been experiencing this issue since 6.13.7.
When running dkms --version I get dkms-3.1.6, which is more recent than 3.1.4-3.
Did this issue not end up being fixed in that release, or could there be another cause?

I did have a recent update that broke, but I think it was the driver for my wireless controller dongle (xone) that broke it. I updated the code base and reinstalled the driver, and all is fine; the subsequent updates to 6.13.8 and 6.13.9 went through without problems.

This has been going on for a while; the last time it happened was today, and I have no idea what’s causing it. I’m fully updated and everything. How do I even go about troubleshooting this?