I have been using RAID 5 with mdadm on a server for many years.
After a recent update the system refuses to boot, and the failure appears to be that the RAID array is never activated.
I can boot from the F41 live media, either the original with the 6.11.4 kernel or today's respin (both Workstation), and it sees the RAID array and assembles it for access.
I have 7 different kernels installed, and none of them will activate the RAID array during boot. I have no idea what happened, nor where to look for the errors. The /run/initramfs/rdsosreport.txt file does not give many hints either.
Perusing that file, I see this first, so it is apparent that the RAID member devices are recognized:
/dev/sdd: UUID="c66f241a-545c-e2fa-d200-68419423bfe0" UUID_SUB="e55d4d3b-088c-9983-68d6-14d55c55c089" LABEL="eagle.home.domain:fedora_raid" TYPE="linux_raid_member"
/dev/sdb: UUID="c66f241a-545c-e2fa-d200-68419423bfe0" UUID_SUB="f8808c9e-e414-d28d-6665-af003b9cb8cd" LABEL="eagle.home.domain:fedora_raid" TYPE="linux_raid_member"
/dev/sdc: UUID="c66f241a-545c-e2fa-d200-68419423bfe0" UUID_SUB="070cc792-735b-441d-6df1-b863cc8023ff" LABEL="eagle.home.domain:fedora_raid" TYPE="linux_raid_member"
/dev/sda: UUID="c66f241a-545c-e2fa-d200-68419423bfe0" UUID_SUB="be5831bc-f771-e94b-bce4-1ddcde32ac4b" LABEL="eagle.home.domain:fedora_raid" TYPE="linux_raid_member"
The following stanza is repeated for each of those four drives:
P: /devices/pci0000:00/0000:00:01.2/0000:02:00.1/ata1/host0/target0:0:0/0:0:0:0/block/sda
M: sda
U: block
T: disk
D: b 8:0
N: sda
L: 0
S: disk/by-id/ata-WDC_WD30EZRZ-00GXCB0_WD-WCC7K6EU0Y70
S: disk/by-diskseq/1
S: disk/by-path/pci-0000:02:00.1-ata-1.0
S: disk/by-id/wwn-0x50014ee20e58c2bc
S: disk/by-path/pci-0000:02:00.1-ata-1
Q: 1
E: DEVPATH=/devices/pci0000:00/0000:00:01.2/0000:02:00.1/ata1/host0/target0:0:0/0:0:0:0/block/sda
E: SUBSYSTEM=block
E: DEVNAME=/dev/sda
E: DEVTYPE=disk
E: DISKSEQ=1
E: MAJOR=8
E: MINOR=0
E: USEC_INITIALIZED=3728515
E: ID_ATA=1
E: ID_TYPE=disk
E: ID_BUS=ata
E: ID_MODEL=WDC_WD30EZRZ-00GXCB0
E: ID_MODEL_ENC=WDC\x20WD30EZRZ-00GXCB0\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
E: ID_REVISION=80.00A80
E: ID_SERIAL=WDC_WD30EZRZ-00GXCB0_WD-WCC7K6EU0Y70
E: ID_SERIAL_SHORT=WD-WCC7K6EU0Y70
E: ID_ATA_WRITE_CACHE=1
E: ID_ATA_WRITE_CACHE_ENABLED=1
E: ID_ATA_FEATURE_SET_HPA=1
E: ID_ATA_FEATURE_SET_HPA_ENABLED=1
E: ID_ATA_FEATURE_SET_PM=1
E: ID_ATA_FEATURE_SET_PM_ENABLED=1
E: ID_ATA_FEATURE_SET_SECURITY=1
E: ID_ATA_FEATURE_SET_SECURITY_ENABLED=0
E: ID_ATA_FEATURE_SET_SECURITY_ERASE_UNIT_MIN=65906
E: ID_ATA_FEATURE_SET_SECURITY_ENHANCED_ERASE_UNIT_MIN=65906
E: ID_ATA_FEATURE_SET_SECURITY_FROZEN=1
E: ID_ATA_FEATURE_SET_SMART=1
E: ID_ATA_FEATURE_SET_SMART_ENABLED=1
E: ID_ATA_FEATURE_SET_PUIS=1
E: ID_ATA_FEATURE_SET_PUIS_ENABLED=0
E: ID_ATA_DOWNLOAD_MICROCODE=1
E: ID_ATA_SATA=1
E: ID_ATA_SATA_SIGNAL_RATE_GEN2=1
E: ID_ATA_SATA_SIGNAL_RATE_GEN1=1
E: ID_ATA_ROTATION_RATE_RPM=5400
E: ID_WWN=0x50014ee20e58c2bc
E: ID_WWN_WITH_EXTENSION=0x50014ee20e58c2bc
E: ID_ATA_PERIPHERAL_DEVICE_TYPE=0
E: ID_PATH=pci-0000:02:00.1-ata-1.0
E: ID_PATH_TAG=pci-0000_02_00_1-ata-1_0
E: ID_PATH_ATA_COMPAT=pci-0000:02:00.1-ata-1
E: ID_FS_UUID=c66f241a-545c-e2fa-d200-68419423bfe0
E: ID_FS_UUID_ENC=c66f241a-545c-e2fa-d200-68419423bfe0
E: ID_FS_UUID_SUB=be5831bc-f771-e94b-bce4-1ddcde32ac4b
E: ID_FS_UUID_SUB_ENC=be5831bc-f771-e94b-bce4-1ddcde32ac4b
E: ID_FS_LABEL=eagle.home.domain:fedora_raid
E: ID_FS_LABEL_ENC=eagle.home.domain:fedora_raid
E: ID_FS_VERSION=1.2
E: ID_FS_TYPE=linux_raid_member
E: ID_FS_USAGE=raid
E: DEVLINKS=/dev/disk/by-id/ata-WDC_WD30EZRZ-00GXCB0_WD-WCC7K6EU0Y70 /dev/disk/by-diskseq/1 /dev/disk/by-path/pci-0000:02:00.1-ata-1.0 /dev/disk/by-id/wwn-0x50014ee20e58c2bc /dev/disk/by-path/pci-0000:02:00.1-ata-1
E: TAGS=:systemd:
E: CURRENT_TAGS=:systemd:
+ ls -l /dev/disk/by-diskseq /dev/disk/by-id /dev/disk/by-label /dev/disk/by-partlabel /dev/disk/by-partuuid /dev/disk/by-path /dev/disk/by-uuid
/dev/disk/by-diskseq:
total 0
lrwxrwxrwx 1 root root 9 Feb 14 19:17 1 -> ../../sda
lrwxrwxrwx 1 root root 9 Feb 14 19:17 2 -> ../../sdb
lrwxrwxrwx 1 root root 9 Feb 14 19:17 3 -> ../../sdc
lrwxrwxrwx 1 root root 9 Feb 14 19:17 5 -> ../../sr0
lrwxrwxrwx 1 root root 9 Feb 14 19:17 6 -> ../../sdd
lrwxrwxrwx 1 root root 13 Feb 14 19:17 7 -> ../../nvme0n1
lrwxrwxrwx 1 root root 15 Feb 14 19:17 7-part1 -> ../../nvme0n1p1
lrwxrwxrwx 1 root root 15 Feb 14 19:17 7-part2 -> ../../nvme0n1p2
lrwxrwxrwx 1 root root 15 Feb 14 19:17 7-part3 -> ../../nvme0n1p3
lrwxrwxrwx 1 root root 9 Feb 14 19:17 9 -> ../../sde
lrwxrwxrwx 1 root root 10 Feb 14 19:17 9-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Feb 14 19:17 9-part2 -> ../../sde2
lrwxrwxrwx 1 root root 10 Feb 14 19:17 9-part3 -> ../../sde3
/dev/disk/by-id:
total 0
lrwxrwxrwx 1 root root 9 Feb 14 19:17 ata-HL-DT-ST_DVD-RAM_GHC0N_K9SG1AI1403 -> ../../sr0
lrwxrwxrwx 1 root root 9 Feb 14 19:17 ata-WDC_WD30EZRZ-00GXCB0_WD-WCC7K1NJRX1H -> ../../sdb
lrwxrwxrwx 1 root root 9 Feb 14 19:17 ata-WDC_WD30EZRZ-00GXCB0_WD-WCC7K5PF586Y -> ../../sdd
lrwxrwxrwx 1 root root 9 Feb 14 19:17 ata-WDC_WD30EZRZ-00GXCB0_WD-WCC7K6EU0Y70 -> ../../sda
lrwxrwxrwx 1 root root 9 Feb 14 19:17 ata-WDC_WD30EZRZ-00Z5HB0_WD-WCC4N0EFT5A4 -> ../../sdc
/dev/disk/by-path:
total 0
lrwxrwxrwx 1 root root 9 Feb 14 19:17 pci-0000:02:00.1-ata-1 -> ../../sda
lrwxrwxrwx 1 root root 9 Feb 14 19:17 pci-0000:02:00.1-ata-1.0 -> ../../sda
lrwxrwxrwx 1 root root 9 Feb 14 19:17 pci-0000:02:00.1-ata-2 -> ../../sdb
lrwxrwxrwx 1 root root 9 Feb 14 19:17 pci-0000:02:00.1-ata-2.0 -> ../../sdb
lrwxrwxrwx 1 root root 9 Feb 14 19:17 pci-0000:02:00.1-ata-3 -> ../../sdc
lrwxrwxrwx 1 root root 9 Feb 14 19:17 pci-0000:02:00.1-ata-3.0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Feb 14 19:17 pci-0000:02:00.1-ata-6 -> ../../sdd
lrwxrwxrwx 1 root root 9 Feb 14 19:17 pci-0000:02:00.1-ata-6.0 -> ../../sdd
and the same continues for each of the remaining ls listings.
It is quite apparent that all four RAID devices are seen and recognized as RAID members.
Yet the listings show no sign that the array was ever assembled and activated: it should appear as /dev/md127, but does not.
As a result, the LVM on that array is not found and cannot be used.
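For assembling the array by hand from the emergency shell, mdadm wants the array UUID in its colon notation rather than blkid's dashed form. A small sketch of the conversion, using the UUID from the blkid lines above:

```shell
# blkid reports the md superblock UUID with dashes; mdadm groups the same
# 32 hex digits as 8:8:8:8.
blkid_uuid="c66f241a-545c-e2fa-d200-68419423bfe0"
hex="${blkid_uuid//-/}"                                  # strip the dashes
md_uuid="${hex:0:8}:${hex:8:8}:${hex:16:8}:${hex:24:8}"
echo "$md_uuid"    # c66f241a:545ce2fa:d2006841:9423bfe0

# With that, assembly can be attempted explicitly (not run here):
#   mdadm --assemble /dev/md127 --uuid="$md_uuid" /dev/sd[abcd]
```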
At about 3.5 seconds into the boot, each of those four devices is set up like this:
[ 3.474117] raptor.home.domain kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3.474121] raptor.home.domain kernel: ata1.00: ATA-10: WDC WD30EZRZ-00GXCB0, 80.00A80, max UDMA/133
[ 3.474125] raptor.home.domain kernel: ata1.00: 5860533168 sectors, multi 16: LBA48 NCQ (depth 32), AA
[ 3.474129] raptor.home.domain kernel: ata1.00: configured for UDMA/133
[ 3.474236] raptor.home.domain kernel: scsi 0:0:0:0: Direct-Access ATA WDC WD30EZRZ-00G 0A80 PQ: 0 ANSI: 5
[ 3.474332] raptor.home.domain kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 3.474427] raptor.home.domain kernel: sd 0:0:0:0: [sda] 5860533168 512-byte logical blocks: (3.00 TB/2.73 TiB)
[ 3.474521] raptor.home.domain kernel: sd 0:0:0:0: [sda] 4096-byte physical blocks
[ 3.474615] raptor.home.domain kernel: sd 0:0:0:0: [sda] Write Protect is off
[ 3.474709] raptor.home.domain kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 3.474807] raptor.home.domain kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 3.474901] raptor.home.domain kernel: sd 0:0:0:0: [sda] Preferred minimum I/O size 4096 bytes
[ 3.474996] raptor.home.domain kernel: sd 0:0:0:0: [sda] Attached SCSI disk
Finally, the errors start like this:
[ 4.452427] raptor.home.domain systemd[1]: Mounted sys-kernel-config.mount - Kernel Configuration File System.
[ 4.452726] raptor.home.domain systemd[1]: Reached target sysinit.target - System Initialization.
[ 4.452773] raptor.home.domain systemd[1]: Reached target basic.target - Basic System.
[ 4.475546] raptor.home.domain dracut-initqueue[703]: fedora_raptor/root linear
[ 4.492842] raptor.home.domain systemd[1]: Found device dev-mapper-fedora_raptor\x2droot.device - /dev/mapper/fedora_raptor-root.
[ 4.492881] raptor.home.domain systemd[1]: Reached target initrd-root-device.target - Initrd Root Device.
...
[ 130.432583] raptor.home.domain dracut-initqueue[578]: Warning: dracut-initqueue: timeout, still waiting for following initqueue hooks:
[ 130.433561] raptor.home.domain dracut-initqueue[578]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2ffedora_raid\x2fhome.sh: "[ -e "/dev/fedora_raid/home" ]"
which continues for an additional 60 seconds:
[  191.787267] raptor.home.domain dracut-initqueue[578]: Warning: dracut-initqueue: starting timeout scripts
[ 192.301399] raptor.home.domain dracut-initqueue[578]: Warning: dracut-initqueue: timeout, still waiting for following initqueue hooks:
[ 192.302408] raptor.home.domain dracut-initqueue[578]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2ffedora_raid\x2fhome.sh: "[ -e "/dev/fedora_raid/home" ]"
[ 192.303565] raptor.home.domain dracut-initqueue[578]: Warning: dracut-initqueue: starting timeout scripts
[ 192.303644] raptor.home.domain dracut-initqueue[578]: Warning: Could not boot.
[ 192.312126] raptor.home.domain systemd[1]: Starting dracut-emergency.service - Dracut Emergency Shell...
[ 192.338357] raptor.home.domain systemd[1]: Received SIGRTMIN+21 from PID 586 (plymouthd).
[ 192.348366] raptor.home.domain systemd[1]: Received SIGRTMIN+21 from PID 586 (plymouthd).
and dumps me to an emergency shell.
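From that emergency shell, a few checks usually narrow this kind of failure down. A sketch, guarded so a missing tool does not abort it, since I cannot be sure what this initramfs contains:

```shell
# State of the md driver: an assembled array would show up as md127 here.
if [ -r /proc/mdstat ]; then
    cat /proc/mdstat
else
    echo "/proc/mdstat not present (md driver not loaded)"
fi

# If mdadm is in the initramfs, examine a member and try assembly by hand;
# -v explains why any member is skipped.
if command -v mdadm >/dev/null 2>&1; then
    mdadm --examine /dev/sda || true
    mdadm --assemble --scan -v || true
else
    echo "mdadm not found in this initramfs"
fi
checked=yes   # marker: all diagnostics were attempted
```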
It seems quite obvious that something on the installed system has been damaged and no longer activates the RAID array and the LV it contains, while booting from live media does activate the array.
I don't want to do a complete new installation, since I have many apps running on this server that would be a pain to reinstall, but I am lost as to what actions might solve the problem.
Using the latest respin (dated 20250214), I tried recreating the initramfs in a chroot to see if that would help, but made no progress. Dracut did create the new initramfs for kernel 6.12.11, and lsinitrd on that image shows that the RAID kernel modules are included. Yet the array is still not activated during boot, and every attempt ends with the messages shown above, regardless of which installed kernel I boot.
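One avenue I have not yet exhausted: making the array explicit for dracut, since a hostonly initramfs build can miss an array that is not assembled at build time. A sketch, with the UUID taken from the blkid output above (the kernel version string is an example, not verified):

```shell
# Build the mdadm.conf ARRAY line the initramfs needs for explicit assembly
# (UUID in mdadm's colon form, name/label as reported by blkid above).
md_uuid="c66f241a:545ce2fa:d2006841:9423bfe0"
array_line="ARRAY /dev/md127 metadata=1.2 name=eagle.home.domain:fedora_raid UUID=$md_uuid"
echo "$array_line"   # append this line to /etc/mdadm.conf in the chroot

# Then rebuild the initramfs, forcing in the mdraid module and mdadm.conf
# even if hostonly detection misses the array (not run here):
#   dracut --force --add mdraid --mdadmconf /boot/initramfs-6.12.11.img 6.12.11
# Alternatively, dracut can be told on the kernel command line to assemble it:
#   rd.md.uuid=c66f241a:545ce2fa:d2006841:9423bfe0
```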
Are there any suggestions on what to try next before I use the nuclear option and reinstall fresh?