Boot Failure

Fedora 40

  • Executed sudo dnf upgrade from terminal
  • Update hung (think it was on appstream?). Mouse disappeared. Keyboard not responsive.
  • Saw no other choice but to reboot.
  • The first two GRUB entries fail to boot:

I get the fedora logo and spinning arrow. Then a bunch of text scrolls by on the left (too fast to read) and the screen goes black. Waited ~10 minutes; no change.

  • Grub rescue entry results in this:

Almost every time I upgrade Fedora, I run into some kind of boot problem…

I have ZERO clue how (or if) to fix this. Suggestions?

Have you tried the steps in “root account locked”?

This is not normal. You may have a hardware problem. You may also have new issues left from previous boot failures. The journalctl output should help identify the problem. Some vendors provide hardware tests, and the Fedora Live Workstation USB has memtest86+.
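
For the journalctl part, here is a minimal sketch of how the errors from an earlier boot could be pulled up once you have a working shell on the installed system. `boot_errors` is a hypothetical helper name, and it assumes the journal is persistent and that your user is allowed to read system logs (otherwise prefix the call with sudo).

```shell
# Hypothetical helper: show error-level messages from a given boot.
# The offset defaults to -1, i.e. the boot before the current one.
boot_errors() {
    local offset="${1:--1}"
    # -b picks the boot by offset, -p err keeps priority "err" and worse,
    # --no-pager prints straight to the terminal so the output can be copied.
    journalctl -b "$offset" -p err --no-pager
}
```

`journalctl --list-boots` shows which offsets exist; `boot_errors -2` would then look one boot further back.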

How robust is your network connection? If you are having problems downloading packages you may need to use dnf update --downloadonly before running dnf update.
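
A rough sketch of that two-step approach, in case it helps. `staged_upgrade` is a made-up function name, not something dnf provides; the only dnf features used are the `upgrade` command and the `--downloadonly` and `-y` options.

```shell
# Hypothetical helper: download first, install second.
staged_upgrade() {
    # Step 1: fetch the packages into the local cache without installing anything.
    sudo dnf upgrade --downloadonly -y || return 1
    # Step 2: run the upgrade itself; packages already in the cache
    # do not need to be downloaded again.
    sudo dnf upgrade -y
}
```

If step 1 stalls, the installed system has not been touched yet, so it can simply be retried.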


I will attempt your suggestions.

Doubt it’s a hardware problem. This is a dual boot machine with a fast, reliable Internet connection. Never had any problems on the Windows side. Never had any problems on the Fedora side until upgrading from F36 to F40. So far this year, I have done 4 clean installs of F40. About 50% of the time, updates are problematic: dnf crashes, apps stop working after update and have to be reinstalled, etc. I’ve been using Linux since BSD first came out, worked with numerous distros, and never experienced such an unstable release.

Could not complete the suggested steps. Unable to mount /mnt/boot/efi (step 9).

Saw slightly different error on rescue entry:

If this is not resolved by tomorrow, I’ll be forced to choose a different distro. I simply cannot waste this much time on Fedora’s instability…

Installs are the most intense activity in the lifetime of most storage devices, so you are most likely to encounter failures after an install. Multiple installs are very likely to push an older device into failure, and btrfs detects data corruption problems that other filesystems ignore. Use Gnome Disks to check the “Health” of the system drive and run the internal tests. Some vendors provide bootable images for drive tests.
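
If you prefer the terminal to Gnome Disks, roughly the same information is available from smartctl. This is only a sketch: it assumes the smartmontools package is installed and that the drive supports SMART self-tests, and `drive_health` and the device path are placeholders to adapt.

```shell
# Hypothetical helper: basic SMART checks for one drive, e.g. drive_health /dev/sda
drive_health() {
    local dev="$1"
    [ -n "$dev" ] || { echo "usage: drive_health /dev/DEVICE" >&2; return 2; }
    sudo smartctl -H "$dev"        # overall health self-assessment
    sudo smartctl -l error "$dev"  # the drive's own error log
    sudo smartctl -t short "$dev"  # start a short self-test on the drive
}
```

The self-test runs on the drive in the background; `sudo smartctl -l selftest /dev/DEVICE` shows the result once it has finished.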

The F40 filesystems need repairs before you can mount /boot/efi/, and possibly other partitions. This often happens (with any Linux distro) after an unsafe shutdown.

You can try mounting the partitions on the system disk from the command line in the Live Installer to see what errors are reported. You can then try to make the partitions mountable.
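
As a starting point, something like the following could be run from a terminal in the Live session. It is a sketch under assumptions: `try_mount` and `check_partition` are hypothetical names, the partition paths are whatever `lsblk -f` reports on your machine, and only read-only operations are used so nothing on the disk is changed.

```shell
# Hypothetical helper: mount a partition read-only to see what error (if any) is reported.
try_mount() {
    local part="$1" target="${2:-/mnt}"
    sudo mount -o ro "$part" "$target" && echo "mounted $part on $target"
}

# Hypothetical helper: run a read-only consistency check matching the filesystem type.
check_partition() {
    local part="$1" fstype
    fstype=$(lsblk -no FSTYPE "$part")
    case "$fstype" in
        vfat)  sudo fsck.vfat -n "$part" ;;               # EFI system partition, no changes written
        btrfs) sudo btrfs check --readonly "$part" ;;     # default Fedora Workstation root
        ext4)  sudo e2fsck -n "$part" ;;                  # typical /boot, answer "no" to all fixes
        *)     echo "no read-only check sketched for '$fstype'" >&2; return 1 ;;
    esac
}
```

Run `lsblk -f` first to identify the partitions, and check them while they are unmounted. Actual repairs are a separate step and are best attempted only after the output of these checks has been looked at.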

FUBAR. Waste of time to attempt to recover this installation…

Reviewing my notes, every single system corruption I have experienced was the result of running dnf. Outside of this consistent fault, Fedora is excellent and remains my favorite distro, hands down.

Ubuntu is out due to snaps. Debian is dubious due to slow app updates. Mint has some strengths, but has never been a favorite.

With trepidation, I’ll go with F40 again. However, I will look into disk image backups and will use yum (not dnf) for updates.

Many thanks for your help!

I have a pull request open in Pagure to update this documentation, as it is hard to follow, although I have had no luck hearing back from the team on it.

There is no separate yum on Fedora; where the command exists, it is just an alias for dnf, which is the update tool. If you are still having issues, I can guide you through these steps to unlock the root account if you have a Live USB available. From there we can also inspect the update on the machine.


Yes, just discovered that about yum.

IME, much faster and more reliable to simply perform a clean install, than to attempt a repair. Never keep data on the system drive, so it’s not much of an issue to start clean. Already done.

Fedora fan since F28. However, starting with F39 and F40, have had chronic problems with dnf crashing, sometimes resulting in corruption.

Hopefully, this issue will be corrected soon…


When you do, I suggest you assign a password to root with sudo passwd root. Then, if the boot hangs again, you can enter the root password and run journalctl to find out what went wrong. Having this many problems with dnf updates is not normal.


Thanks for the tip! Done. However, in this case, would that have helped, given that the system console was not available for login?

In the picture from message 5, instead of having root locked, you can now enter the root password, and then do further analysis. You can’t fix things if you don’t know what the real problem is.

It would have given you access, since it was asking you to enter the root password.
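
Once the root password is accepted at that prompt, a first look could go something like this. It is only a sketch: `emergency_triage` is a hypothetical name, and the two commands are simply the usual starting points, not a complete diagnosis.

```shell
# Hypothetical helper: first checks from the emergency shell, run as root.
emergency_triage() {
    # List units that failed to start during this boot.
    systemctl --failed --no-pager
    # Error-level journal messages from the current boot, with explanatory text (-x).
    journalctl -xb -p err --no-pager
}
```

The failed units usually point at which mount or service stopped the boot, and the journal lines around them say why.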

Dnf works well for many users. However, on most systems dnf is also responsible for the majority of storage operations, so if the storage device is failing, it is dnf that will appear to be unreliable.

Although SMART indicated no problems (other than a large number of error log entries), replaced the system SSD. Still observed intermittent issues with DNF.

However, ran MemTest86 and hundreds of memory errors were reported. Replaced system memory and, so far, no issues with DNF. Also, app crashes have virtually disappeared.

Never had memory failures before, but now it will be one of the first things I test…


So the dnf issues can be traced back to failing memory. Interesting.

Could you post the make/model of your memory here? It would be useful in the future for anyone who wants to avoid those models.

Kingston Hyper X Fury

I’ve been fairly brand loyal to Kingston for over 20 years. Until now, NEVER had a failure, which I can’t say for other brands I’ve tried.

This memory is over 7 years old and has seen a lot of number-crunching stress. Still definitely consider Kingston to be a reliable brand.


I had a pair of Kingston Hyper X Fury for a Z97-A motherboard. Currently in a storage unit. . .

The system SSD (Samsung) mentioned was replaced with a Kingston SSD, so I see no reason to give up on them yet…

This does not 100% verify the source of memory failure. There are a lot of contacts on the memory (240+) and even one that corrodes or otherwise becomes interrupted (even intermittent) can cause errors in memory.

My first action would be to remove the DIMMs, clean all the contacts with an eraser, then reseat them and test again. I would also use air to blow any dust or lint out of the socket while the DIMM is removed.

Contacts seem the most common failure with memory, though the chips do sometimes fail.
