Need Help: F40 Everything "inst.sdboot" (or kickstart `bootloader --sdboot`) consistently fails

Hello,

I’ve been trying to install Fedora with systemd-boot using the latest (nightly) F40 Everything ISO.
Sadly, every attempt has failed without much useful information.
More specifically, the error produced is:

The following error occurred while installing the payload. This is a fatal error and installation will be aborted.

An error occurred during the transaction: Error in POSTTRANS scriptlet in rpm package kernel-core

With the error being fatal (exiting the installation) and me being pretty new to Fedora, I don’t know if there are more thorough logs I could somehow access for additional information.

I have tried:

  • both the inst.sdboot boot argument and a kickstart with bootloader --sdboot
  • Gnome Boxes (UEFI mode)
  • bare metal (full UEFI mode, no legacy BIOS settings, used systemd-boot before on same system under other distribution)

Note: I’ve only started using kickstart yesterday in order to have some reproducibility in my tests, so keep an eye out for possible mistakes due to lack of experience

Here is a pretty minimal kickstart sample that consistently fails for me for all above scenarios:

# F40 sdboot failure with Err:
# 
# The following error occurred while installing the payload.
# This is a fatal error and installation will be aborted.
# An error occurred during the transaction: Error in POSTTRANS scriptlet in rpm package kernel-core

graphical
keyboard us
lang en_US.UTF-8
timezone US/Eastern

# Use network installation
url --mirrorlist="https://mirrors.fedoraproject.org/mirrorlist?repo=fedora-$releasever&arch=$basearch"

selinux --enforcing
shutdown

bootloader --sdboot --timeout 0
# Partition clearing information
# !!! TESTER CAUTION !!! :
# next commands to be issued will erase all visible disks' data
# This setup assumes a single drive where the system should reside (eg for usage in a simple VM).
# Edit commands accordingly for testing on different setups.
zerombr
clearpart --all --initlabel --disklabel=gpt
autopart

# !!! TESTER CAUTION !!!:
# plaintext passwords ahead,
# suitable for testing purposes ONLY
rootpw --plaintext testsdb
user --groups=wheel --name=user --plaintext --password=testsdb

%packages
@^kde-desktop-environment
%end

After many tests, I tried among other things a manual partitioning as such:

# kickstart partitioning:

zerombr
clearpart --all --initlabel --disklabel=gpt

part /boot/efi --fstype="efi" --size=1024
part /boot --fstype="ext4" --size=512 --label="BOOT"
part / --fstype="xfs" --size=4096 --grow --label="ROOT"

This completes the installation in my test (note: the system in Gnome Boxes is still unbootable until Secure Boot is disabled for the VM, but after that it works as expected).

This makes me believe it might be an autopart issue, or a btrfs issue.
However with such minimal logging I wasn’t able to narrow it down for sure.
My experience with btrfs is pretty limited, so I couldn’t come up with a correct manual btrfs kickstart partitioning to test.

Any advice for troubleshooting would be highly appreciated.


Edit:
Additional tests info:

  • I’ve managed to successfully complete an F39 install with inst.sdboot and no kickstart.
  • On Rawhide I had the same failures however.

I ran the F40 everything media last week and it worked fine, so maybe something has broken in the meantime.

But to be clear, Secure Boot must be turned off for the time being because Fedora’s systemd-boot is not yet signed.

I’ve been trying since at least the nightly of 20240319 so I doubt that’s the issue.

My guess is that it is device/setup specific. My biggest frustration is that the error message is very vague (barely useful at all, really), and I’m not aware of any way to access more thorough logs.

I’m currently downloading isos to test Rawhide and F39, but I’m pretty sure I’ve had the same issue on F39 stable in the past (just didn’t have time to dig deeper at the time and just installed with grub).

(This is only relevant for my VM tests. Bare metal should be UEFI anyway.
Edit: what I meant to say was that on bare metal Secure Boot is explicitly disabled in advance; I just don’t know how to disable it before VM creation in Gnome Boxes.)

Correct me if I’m wrong, but that is only true for booting the system, not installing it, right?

Because as I already said, I was successful at installing a system with an explicit manual partition scheme with xfs.
It’s just that after the system has been set up, the firmware will refuse to “hand over” to systemd-boot because it’s not signed. As soon as Secure Boot is disabled, though, booting works perfectly fine.

It works for me using Fedora 40 Beta with the following kickstart instructions:

clearpart --all
part --fstype efi --size 1024 /boot/efi
part --fstype ext4 --grow /
bootloader --sdboot

I suspect that it will work for me too with ext4 (worked with xfs too).

Have you tried with either autopart or a manual btrfs partition scheme? In the latter case, would you be so kind as to share that kickstart scheme?
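
For anyone wanting to test the btrfs side of this, a manual btrfs scheme might look roughly like the sketch below. It is untested here and follows the part/btrfs pattern that Anaconda writes into a generated anaconda-ks.cfg; the volume label "fedora", the partition name btrfs.01 and the sizes are placeholders, and it deliberately has no separate /boot partition. Adding a part /boot line like the one in the xfs scheme above should give the variant with /boot.

```kickstart
# Untested sketch: ESP + btrfs root, no separate /boot partition.
# WARNING: like the original sample, this erases all visible disks.
zerombr
clearpart --all --initlabel --disklabel=gpt

part /boot/efi --fstype="efi" --size=1024
# btrfs.01 is the member partition that backs the btrfs volume
part btrfs.01 --fstype="btrfs" --size=4096 --grow
# "fedora" is a placeholder volume label
btrfs none --label=fedora btrfs.01
btrfs / --subvol --name=root LABEL=fedora
btrfs /home --subvol --name=home LABEL=fedora

bootloader --sdboot
```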

As I understand it, the fact that systemd-boot cannot be used with Secure Boot should be common knowledge.
That is clearly documented in this thread and the linked documents.

The Arch wiki gives a means of signing systemd-boot, but that is not yet available on Fedora:
https://wiki.archlinux.org/title/systemd-boot

During the install, switch to a virtual console with Ctrl+Alt+Fx and look in /tmp for the logs. There should be a storage.log and an anaconda.log with interesting information, whether that helps…

Although I see a failure during the kernel-core install with 51-dracut-rescue.install that looks suspicious, as it’s accessing a loader entries file that doesn’t exist because the bootloader isn’t grub…

After accessing the logs (thanks for pointing me in the right direction, by the way; I was having trouble with Ctrl+FX in Gnome Boxes but managed to get there through the virtual keyboard), I’m guessing you mean:

/tmp/packaging.log: /usr/lib/kernel/install.d/51-dracut-rescue.install failed with exit status 2.

This is the only thing I found so far too, except a couple of Errors when trying to access mirrors.

Context: This ^ is on the F40 Beta 1.10 iso


Btw (I’ll also edit the original post to add the info), I’ve managed to successfully complete an F39 install with inst.sdboot and no kickstart.

On a test on Rawhide I had the same failures however.

Although humorously, the one way it does continue to work is when the /boot partition isn’t created, only a /boot/efi ESP. The one way that people keep telling me isn’t ideal, despite it making the most sense.

Right, so the latest version of systemd is insisting that a /boot mount point is an XBOOTLDR partition despite it not being accessible from the firmware. This can be worked around by simply deleting the /boot partition for the time being, which isn’t a bad plan in general for systemd-boot systems, as it’s just redundant if the ESP is large enough.

https://bugzilla.redhat.com/show_bug.cgi?id=2271674
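
For kickstart installs that otherwise rely on autopart, one way to try this workaround might be autopart's --noboot option, which asks the installer not to create a separate /boot. The sketch below is untested, and it is an assumption that Anaconda accepts this combination together with --sdboot on a given nightly. The manual ESP-plus-root layout posted earlier in the thread is the variant that was reported to work.

```kickstart
# Untested sketch of the "no separate /boot" workaround with automatic partitioning.
# WARNING: like the original sample, this erases all visible disks.
zerombr
clearpart --all --initlabel --disklabel=gpt
# --noboot: do not create a separate /boot partition
autopart --noboot
bootloader --sdboot --timeout 0
```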

It’s probably fixable by updating sdubby, or maybe by having anaconda drop a config file that hard-codes the BOOT_ROOT.

I pushed a sdubby change to add an install.conf file to force kernel-install to do the right thing. It should work its way through to rawhide in the near future, along with F40.

Thank you Jeremy, I really appreciate you jumping on this so fast, providing all the useful insight and pushing a fix.

I’ll make sure to test again a nightly Everything after sdubby-1.0-8.fc40 lands, and use your suggested deletion of /boot in the meantime for my VMs.

It is actually anaconda (or more precisely blivet) that insists that /boot should be an XBOOTLDR partition.

Just a quick update on this on my end:

I tested a new VM install with a Rawhide nightly the other day and the installation succeeded with sdubby-1.0-8.fc41. So I’m glad that there were no further device-specific issues and the issue seems to indeed be resolved with the changes to sdubby.

I’m only waiting for the package to land in the F40 stable repo before marking this as resolved. My understanding is that new Everything installs initially pull from stable, which would explain why a test with the F40 Everything nightly (20240328) still failed while sdubby-1.0-8.fc40 was only in updates-testing.

Installation works as expected when using the latest RC1.13 iso.
This can be marked as resolved (I don’t see an option to do so myself anymore, not sure why).


P.S:

Keep in mind that some of the mirrors might be out of date. This caused a couple of install attempts to fail over the past few days, since sdubby was stuck on 1.0-7 even though 1.0-8 had been pushed to Fedora 40 stable according to packages.fedoraproject.

For my last few attempts I’ve been appending &protocol=https&country=IE,DE,NL to the mirror metalink which resulted in a mirrorlist that enabled the installation to succeed.
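
In kickstart terms, that could look like the line below. It is a sketch that reuses the url --mirrorlist line from the sample in the first post and appends the two parameters; whether the attempts above used the mirrorlist or the metalink endpoint is not spelled out, so the mirrorlist form is an assumption. The country codes are just the ones that happened to work from my location.

```kickstart
# Sketch: the url line from the original sample with protocol and country filters appended.
# IE,DE,NL are the countries used in this post; substitute ones close to you.
url --mirrorlist="https://mirrors.fedoraproject.org/mirrorlist?repo=fedora-$releasever&arch=$basearch&protocol=https&country=IE,DE,NL"
```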