I could do with some help troubleshooting an issue I encountered after I performed dnf system-upgrade, upgrading from 37 to 39 today. The system no longer boots due to missing MD RAID devices.
I already tried booting into an old kernel (f37) still present on the system. The result is the same. I assume that tells me the issue is not with f39, but with the MD RAID itself.
I have three disks in the system. The disks are partitioned identically and the RAID is set up using partitions.
sda
|-- sda1 swap
|-- sda2 linux_raid_member
|-- sda3 linux_raid_member
|-- sda4 linux_raid_member
The remaining SATA drives are the same. The MD devices are /dev/md1, /dev/md5 and /dev/md54, where:
/dev/md1 raid1 sda3[S] sdb3 sdc3
/dev/md5 raid5 sda2 sdb2 sdc2
/dev/md54 raid5 sda4 sdb4 sdc4
While /dev/md54 is assembled just fine, the other two are not. Examining the members with mdadm --examine, I see all the information regarding the RAID devices. One difference I noticed (not sure if that matters): /dev/md54 is using metadata version 1.2, while the two missing devices use version 1.1.
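For anyone wanting to compare the members the same way, the mdadm --examine output can be narrowed to the superblock fields that matter. This is only a sketch: the helper name examine_summary is made up for illustration, and the field names are the ones mdadm --examine prints for version 1.x metadata.

```shell
# Keep only the superblock fields useful for comparing array members.
# Reads `mdadm --examine` output on stdin (helper name is hypothetical).
examine_summary() {
    grep -E '^[[:space:]]*(Version|Array UUID|Raid Level|Events|Device Role|Array State) :'
}

# Example use (needs root and the real devices, so it is left commented out):
# for part in /dev/sda2 /dev/sdb2 /dev/sdc2; do
#     echo "== $part"
#     mdadm --examine "$part" | examine_summary
# done
```

Members of the same array should report the same Array UUID and, if they are in sync, the same Events count.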
During boot I’m thrown into recovery mode. After doing some investigation I tried to assemble /dev/md5 myself. First I tried with mdadm --assemble --scan --no-degraded. A line is printed telling me the array has been assembled. However, at the same time the console freezes. Input is no longer possible. CTRL+C does not return me to the prompt. No further information is printed and I don’t see any disk activity looking at the disk LEDs. CTRL+ALT+DEL lets me reboot the system, though.
In another attempt I tried assembling with mdadm --assemble /dev/md5 /dev/sda2 /dev/sdb2 /dev/sdb3. The result was the same. After the message regarding assembly the system freezes.
Notably, the root partition is not on any of the MD RAID devices. It resides on a small SSD. So, I suppose I could edit /etc/fstab and comment out a bunch of lines, create a temporary /home and get to boot at least into the desktop.
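Commenting out the affected mounts could be sketched roughly as below. It assumes the fstab entries reference the arrays by /dev/md* path; entries using UUID= or LABEL= would need a different pattern. The function name comment_out_md_mounts is hypothetical.

```shell
# Comment out every active fstab entry whose device field is an MD device.
# Reads fstab text on stdin and writes the edited text to stdout.
# Assumption: arrays are referenced as /dev/md*, not by UUID= or LABEL=.
comment_out_md_mounts() {
    sed -E 's|^([[:space:]]*/dev/md[^[:space:]]*[[:space:]])|#\1|'
}

# Example use (review the result before replacing the real file):
# cp /etc/fstab /etc/fstab.bak
# comment_out_md_mounts < /etc/fstab.bak > /etc/fstab.new
```

Writing to a separate file first keeps the original /etc/fstab untouched until the result has been checked.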
Some more background for completeness’ sake. One of the disks was replaced recently after failure. The new disk is larger. But the partition layout (copied using dd from another drive) is the same. The RAID arrays were rebuilt successfully after that and raid-check ran two nights ago not finding any issues.
My questions are:
- Has anyone else experienced a system freeze attempting manual assembly using mdadm? If so, what was the cause?
- What should I do next in an attempt to recover from the situation?