Repair GRUB *and* boot partition after partial disk rewrite

The Issue

Despite my attempts at reinstalling GRUB and regenerating its config, GRUB won’t boot Fedora and instead drops to its command line. I can boot manually from the GRUB command line (that was a trip to learn how to do…), but then systemd fails to start initrd-switch-root.

This is a default Fedora Server install on a three-partition disk: the first partition is /boot/efi, the second is /boot (xfs), and the third is / (an LVM volume group called fedora with a single logical volume called root, so /dev/mapper/fedora-root). It is Fedora 40, though it wasn’t installed as that version. I definitely did some shenanigans along the way, so check the next heading for more details.

[FAILED] Failed to start initrd-switch-root.service - Switch Root.
See 'systemctl status initrd-switch-root.service' for details.
[    4.849253] random: crng init done

Generating "/run/initramfs/rdsosreport.txt"

Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.

The Backstory

I have this server running on a Dell PowerEdge with 16 drive bays (14 filled). In the original install I had RAID 1 pairs that were combined into a single logical volume. I wanted to change that to RAID 6, so I backed up my data and then, to try to make the setup process easier, made a copy of the root partition (/dev/mapper/fedora-root) using dd. I do not remember the exact process I used, but I left the raw image that dd created in a state where I could still mount it in case the restore failed.

Unfortunately I did not think to also just make a copy of /dev/sda1 and /dev/sda2 (/boot/efi and /boot), which would have been trivial to do…

Anyway, my idea was to recreate the partitions once the disks were re-RAIDed by starting a new Fedora install. This did in fact create the 3 partitions successfully, though I seem to recall some other strangeness going on, and I ended up recreating them myself and copying the data the installer had made into the partitions I made.

So we’re at the point where I have what should have been (at some point) a working /boot/efi and /boot for Fedora Server 39 (the only ISO I had on hand). I then restored my root logical volume and verified the files were there (everything looked fine). I knew I would have to remake the GRUB files, since I was trying to load a different kernel and whatnot.

Long story short, I have booted into the system with the repair mode from the live ISO a couple of times (this time the correct F40 live ISO) and have both (re?)installed the necessary shim- and grub- packages and rebuilt the config with grub2-mkconfig. At some point I tried grub2-install as well. I even did something with dracut, which is a tool I’m not familiar with.

Either way, when the system reboots it drops me into GRUB’s command line, as detailed in the first section. While the GRUB thing bothers me, it seems not to be the primary issue (unless I’m using the command line wrong), since the system does begin to boot but then fails to start a certain service. I tried looking up the error, but everything I found just involved rebuilding the GRUB config or initramfs, all of which I seem to have already done without making a difference. This system is so close to being bootable again…

Misc

I don’t have easy physical access to this machine. I get to work on it once or twice a week, and only for a few hours at a time. I can get data/files that would help (such as the rdsosreport.txt that the emergency shell suggested saving), but it may take me several days to get them.

Also, the post title is based on the fact that I have reason to believe the files in /boot may have issues, but I could be wrong.

It is likely that your root filesystem did not get mounted at the /sysroot mount point (in the initramfs). If you can manually mount what should end up being / at /sysroot, then you should be able to exit the emergency shell and the boot should finish.
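As a rough sketch of that manual mount, assuming the volume group and logical volume names from the post (fedora and root) and that the lvm binary made it into the initramfs. The mount_sysroot helper is hypothetical, and by default it only prints the commands so you can check them before typing them at the emergency prompt.

```shell
# Hypothetical helper: print (DRY_RUN=1, the default) or run (DRY_RUN=0)
# the commands that activate an LVM volume group and mount its root LV
# at /sysroot. Names are taken from the post; adjust them to your system.
mount_sysroot() {
    vg=$1
    lv=$2
    run=""
    [ "${DRY_RUN:-1}" = 1 ] && run="echo"
    $run lvm vgscan                            # scan for volume groups
    $run lvm vgchange -ay "$vg"                # activate the volume group
    $run mount "/dev/mapper/$vg-$lv" /sysroot  # mount the real root
}

mount_sysroot fedora root
```

If lvm is not found, or mount complains about an unknown filesystem type, that points at missing tools or drivers in the initramfs rather than at the kernel command line. After a successful mount, typing exit leaves the shell and lets the boot continue.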

There are any number of reasons why the system might have failed to mount /sysroot. You might be missing the needed filesystem drivers in your initramfs. (I don’t think XFS is built into the kernel. You might have to specify that you want that driver included in your initramfs image when you run dracut to build it.) It is also possible that you are missing some kernel parameters needed to activate an LVM volume. Or you might be missing the LVM management utilities in your initramfs.
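Two of those cases can be narrowed down with a couple of small helpers. This is only a sketch: check_cmdline and dracut_rebuild_cmd are hypothetical names, the parameter values assume the fedora/root naming from the post, and the dracut line assumes the root filesystem is xfs (the post only says that about /boot). A missing rd.lvm.lv= is a hint to look closer, not proof of the problem.

```shell
# Hypothetical helper: report which root-related parameters are absent
# from a kernel command line passed in as a single string.
check_cmdline() {
    cmdline=" $1 "
    missing=""
    for param in root= rd.lvm.lv=; do
        case "$cmdline" in
            *" $param"*) ;;                      # parameter is present
            *) missing="$missing $param" ;;      # remember what is absent
        esac
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
    else
        echo "ok"
    fi
}

# Hypothetical helper: print (not run) a dracut invocation that forces the
# lvm module and the xfs driver into the initramfs for kernel version $1.
dracut_rebuild_cmd() {
    kver=$1
    echo "dracut --force --add lvm --add-drivers xfs /boot/initramfs-$kver.img $kver"
}

check_cmdline "root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root"
```

At the emergency prompt you could feed it the real thing with check_cmdline "$(cat /proc/cmdline)". When rebuilding from a rescue chroot, pick the kernel version from ls /lib/modules rather than uname -r, since uname -r there reports the live image’s kernel. Running lsinitrd on the resulting image and searching the listing for xfs and lvm shows whether they actually made it in.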

You might want to look into configuring Serial over LAN. I’ve been in similar situations before where I wanted to roll back a rootfs snapshot remotely. With Serial over LAN configured and an SSH jump host, you can remotely monitor and interact with the console, including the boot menu and the dracut emergency shell.
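For the GRUB side of that, something along these lines in /etc/default/grub is a common starting point. The serial unit and speed are assumptions on my part (second serial port at 115200 baud); the right values depend on how the BMC and BIOS are set up on your machine.

```shell
# Fragment for /etc/default/grub (the file is sourced as shell).
# Assumption: SOL is wired to the second serial port (GRUB unit 1,
# Linux ttyS1) at 115200 baud; check the BMC/BIOS settings for yours.
GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --unit=1 --speed=115200"
# For kernel and initramfs messages, the kernel command line also needs
# something like: console=tty0 console=ttyS1,115200n8
```

After regenerating the GRUB config, a session can be opened from the jump host with ipmitool -I lanplus -H <bmc-address> -U <user> sol activate, where the address and user are placeholders for your BMC’s details.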