Intermittent boot failure, new Fedora 34 install

I installed Fedora 34 on a Dell XPS 15 (9570) a few days ago. It suffers intermittent boot failures, but I’m not sure at what point in the sequence of app installations & configuration this started (I don’t reboot often).

Failure symptoms are that the POST screen displays only the Dell logo without the additional Fedora one (which displays on successful boots). It hangs there for perhaps 30s before filling the screen with garbage.

So far I have been able to get it to boot by retrying repeatedly: after 4-10 attempts it eventually works. I’m not a complete Linux newbie, but am rusty and unfamiliar with how UEFI boot works in detail. I’ve become scared to reboot - I use the machine daily for work, and have a lot of user space stuff installed. The intermittency is weird. Everything runs perfectly once booted. The previously installed OS was Windows 10, which ran consistently without any boot issues.

The Fedora 34 install is quite vanilla - I accepted most of the defaults & allowed the installer to partition the entire internal drive (main partition btrfs & unencrypted).

I don’t know how to troubleshoot this and would be grateful for any suggestions. All I’ve done so far is fsck the 3 partitions (from the Fedora 34 live USB), which showed no problems.

Or perhaps if troubleshooting is difficult, would there be a relatively straightforward (or at least well documented) way to reinstall the UEFI & boot partitions, leaving the main one alone?

Partition table is as follows:

Disk /dev/nvme0n1: 953.87 GiB, 1024209543168 bytes, 2000409264 sectors
Disk model: KXG50ZNV1T02 NVMe TOSHIBA 1024GB        
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 2D21A182-009F-483E-BAD9-37BD401CC857

Device           Start        End    Sectors   Size Type
/dev/nvme0n1p1    2048    1230847    1228800   600M EFI System
/dev/nvme0n1p2 1230848    3327999    2097152     1G Linux filesystem
/dev/nvme0n1p3 3328000 2000408575 1997080576 952.3G Linux filesystem

Disk /dev/zram0: 8 GiB, 8589934592 bytes, 2097152 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Are there multiple boot loaders configured? If so, perhaps the primary is failing and after x failures it tries a “fallback” boot loader that works?

I think the following command should give you an idea of which UEFI boot loaders are configured on your system:

$ sudo efibootmgr -v

Just one, which is what I would have guessed, as I just ran the installer without changing anything:

BootCurrent: 0001
Timeout: 2 seconds
BootOrder: 0001
Boot0001* Fedora	HD(1,GPT,24bfed9d-6614-469a-94f0-5c5dc208bd95,0x800,0x12c000)/File(\EFI\fedora\shimx64.efi)
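
As a sanity check, the last two fields of the HD(...) device path in that entry appear to be the partition's start sector and sector count in hex, so they can be compared with the fdisk listing above. A minimal sketch, assuming a shell whose arithmetic expansion accepts hex constants (bash and dash do); the variable names are made up for illustration and the numbers are copied by hand from the two outputs:

```shell
# Values copied from the Boot0001 entry: HD(1,GPT,<guid>,0x800,0x12c000)
entry_start=$((0x800))      # start sector according to the boot entry
entry_size=$((0x12c000))    # sector count according to the boot entry

# Values copied from the fdisk row for /dev/nvme0n1p1
fdisk_start=2048
fdisk_size=1228800

if [ "$entry_start" -eq "$fdisk_start" ] && [ "$entry_size" -eq "$fdisk_size" ]; then
    result="boot entry matches partition 1"
else
    result="boot entry does NOT match partition 1"
fi
printf '%s\n' "$result"
```

The GUID in the entry could be checked the same way against the output of `sudo blkid -s PARTUUID -o value /dev/nvme0n1p1`. If everything matches, the entry points at the right partition and the problem likely lies elsewhere.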

After a bit of experimenting though, I’ve noticed something I hadn’t before (when I was frantically trying to get the machine back up to do some work on!).

In the BIOS, there’s only one entry, “Fedora”, in the Boot Sequence configuration section. The entry is enabled (with a checkbox). So you’d think the machine would boot straight into that. But the actual behaviour (confirmed a couple of times now) is:

  • start the machine from off, or reboot. Leave alone: fails with eventual garbage on screen
  • likewise, but interrupt with F12 to pick boot options, select ‘Fedora’ (which is the only option): success.

So I can boot the machine reliably, but only with the F12 intervention. Odd and not that convenient (nor reassuring).

I remember that for some period of time, if I selected a particular boot entry in UEFI setup, the machine would not boot correctly for me.

What I did was disable all boot entries in UEFI setup; the system then always detected what was available to boot at power-up, and it booted to the correct (that is, the one I was expecting) disk/grub2 menu.

It was fixed once I completely formatted the whole disk (ESP, /boot, /, etc.) and installed Fedora 34 (still Rawhide at the time) fresh.

I would say it is worth trying to format and recreate the ESP from scratch, especially since you seem to have only Fedora installed.

I had a look at the (sole) BIOS boot entry, and it looked correct - right disk, partition, and shim file. Odd. Anyway, I tried deleting it. On next boot, the BIOS created 2 entries identical except for a different label. Again, both looked correct, and both failed to boot.

So I can still only boot via the F12 method, which must populate its list dynamically (because it picks up USB drives).

Perhaps recreating the ESP would be a good idea, I’m not sure. I’m a bit reluctant to risk it as I have an otherwise functional system at this point.

Before recreating the ESP, there might be a simpler option if you aren’t using secure boot.

It looks like there is a known bug in the shimx64.efi boot loader:

According to the above bug report, you should be able to work around the problem by configuring your system’s firmware to boot directly to the grubx64.efi boot loader at \EFI\fedora\grubx64.efi.
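
If you would rather add such an entry from the running system than through the firmware setup screens, efibootmgr can probably do it. This is only a sketch: it assumes the ESP is partition 1 of /dev/nvme0n1 (as in your fdisk output), the label is a made-up name, and it only prints the command for review rather than touching NVRAM:

```shell
# Assumptions: ESP is partition 1 on /dev/nvme0n1; the label is hypothetical;
# Secure Boot is disabled, as noted above.
disk=/dev/nvme0n1
part=1
label='Fedora (grub direct)'
loader='\EFI\fedora\grubx64.efi'

# Build the command as text so it can be checked before anything is run.
cmd=$(printf 'efibootmgr --create --disk %s --part %s --label "%s" --loader %s' \
    "$disk" "$part" "$label" "'$loader'")

# printf is used instead of echo so the backslashes in the path print literally.
printf 'sudo %s\n' "$cmd"
```

After running the printed command, `sudo efibootmgr -v` should list the new entry, and the boot order can be adjusted either there or in the firmware setup.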


Good find - that did the trick. I had to turn Secure Boot off, which is fine at least for now.

Thank you.

(What I don’t understand is why, in this case, I was able to boot reliably from the F12 menu. I guess I can live with that niggle.)

According to one of the developers working on the problem, “… This looks very much like the firmware call to HandleProtocol() (shim.c:1104) returned success but gave us back a handle that’s not completely populated. …”. So I guess using F12 causes HandleProtocol() to return a “populated” handle. ¯\_(ツ)_/¯

Could be. Anyway, I’m feeling a bit more secure now that nothing’s fundamentally wrong with my installation. Cheers.
