This link gives a lot of info on managing and replacing a failed drive.
https://raid.wiki.kernel.org/index.php/Replacing_a_failed_drive
Essentially you need to both fail and remove the bad drive, as well as add the replacement.
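As a rough sketch of that sequence (the device names below are only placeholders, adjust to whatever mdadm reports for your array and the failing disk; with an imsm container the add may need to go to the container device instead of the volume):
sudo mdadm /dev/md126 --fail /dev/sdd     # mark the bad disk as failed
sudo mdadm /dev/md126 --remove /dev/sdd   # take it out of the array
sudo mdadm /dev/md126 --add /dev/sde      # add the replacement; the rebuild starts automatically
cat /proc/mdstat                          # watch the rebuild progress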
That’s not what I would have expected.
I added another spare and failed (and removed) the other disk with many errors. After the array finished syncing, it still automatically starts in auto-read-only, but when I mount it the auto-read-only state ‘disappears’.
Before mounting:
Personalities : [raid6] [raid5] [raid4]
md126 : active (auto-read-only) raid5 sdc[2] sdb[1] sda[0]
1953519616 blocks super external:/md127/0 level 5, 64k chunk, algorithm 0 [3/3] [UUU]
md127 : inactive sdb[2](S) sdc[1](S) sda[0](S)
8328 blocks super external:imsm
unused devices: <none>
After mounting:
Personalities : [raid6] [raid5] [raid4]
md126 : active raid5 sdc[2] sdb[1] sda[0]
1953519616 blocks super external:/md127/0 level 5, 64k chunk, algorithm 0 [3/3] [UUU]
md127 : inactive sdb[2](S) sdc[1](S) sda[0](S)
8328 blocks super external:imsm
unused devices: <none>
Hmmm
I suspect that is due to the duplicate RAID arrays. The configs may be confused since it was originally built as md127. I would try stopping both arrays, then power off and wait a few minutes before rebooting (while they are deactivated). Then see which array is now seen with cat /proc/mdstat.
You can start and activate the one you want to use (hopefully the original designation).
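Something along these lines should work (volume first, then the container; the mount point is just an example):
sudo umount /mnt/raid           # unmount the filesystem first if it is mounted
sudo mdadm --stop /dev/md126    # stop the RAID5 volume
sudo mdadm --stop /dev/md127    # stop the imsm container
# ...power off, wait a few minutes, boot again...
cat /proc/mdstat                # check which array names come back
sudo mdadm --assemble --scan    # reassemble from the on-disk metadata if nothing started automatically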
According to this source (Linux software RAID array is in auto-read-only mode) it isn’t usually an issue:
This most commonly happens after a restart/power-event and isn’t usually an issue - MD arrays will be auto-read-only until they’re first written to. It happens to try and help make array assembly a bit safer - nothing’s written to disk until it actually needs to be.
I guess this behaviour is normal. I mean, if I mount the array I don’t get an error and the active array (md126) ‘automagically’ becomes writable (not read-only).
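Mounting (or any first write) is apparently enough to flip it. If you ever wanted to clear the flag by hand instead of mounting, something like this should do the same thing (I haven’t needed to try it here):
sudo mdadm --readwrite /dev/md126   # switch the array from auto-read-only to read-write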
If I go way back in my boot history (before the issues started) with journalctl, I see lines like:
okt 17 11:38:37 kernel: md/raid:md126: device sda operational as raid disk 0
okt 17 11:38:37 kernel: md/raid:md126: device sdc operational as raid disk 1
okt 17 11:38:37 kernel: md/raid:md126: device sdd operational as raid disk 2
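For reference, older boots can be listed and searched roughly like this (the boot offset is just an example):
journalctl --list-boots                      # show the available boot logs with their offsets
journalctl -k -b -5 | grep 'md/raid:md126'   # kernel messages from five boots ago, filtered on the array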
Having both an active (md126) and an inactive (md127) array probably comes from my use of BIOS RAID, aka fake RAID, which has no real RAID controller and is therefore software RAID. In the end everything is controlled by mdadm, which somehow makes use of the RAID configuration the BIOS stores via Intel Matrix Storage Manager (imsm).
Detailed information on both MD devices is shown below. The inactive array (md127) has raid level ‘container’, holds the disks, and at startup assembles the active array (md126) with raid level raid5.
I guess that’s how it works in my desktop.
bash-5.2$ ls -l /dev/md/
total 0
lrwxrwxrwx. 1 root root 8 15 mrt 18:09 imsm0 -> ../md127
lrwxrwxrwx. 1 root root 8 15 mrt 18:09 RAID5_2TB_0 -> ../md126
bash-5.2$ sudo mdadm --detail --scan --verbose
ARRAY /dev/md/imsm0 level=container num-devices=3 metadata=imsm UUID=fcaaa905:3813afd6:892f86ab:83424a41
devices=/dev/sda,/dev/sdb,/dev/sdc
ARRAY /dev/md/RAID5_2TB_0 level=raid5 num-devices=3 container=/dev/md/imsm0 member=0 UUID=5aca63f5:865b38a5:fec0f38e:a41e95cb
devices=/dev/sda,/dev/sdb,/dev/sdc
bash-5.2$ sudo mdadm --detail /dev/md127
/dev/md127:
Version : imsm
Raid Level : container
Total Devices : 3
Working Devices : 3
UUID : fcaaa905:3813afd6:892f86ab:83424a41
Member Arrays : /dev/md/RAID5_2TB_0
Number Major Minor RaidDevice
- 8 32 - /dev/sdc
- 8 0 - /dev/sda
- 8 16 - /dev/sdb
bash-5.2$ sudo mdadm --detail /dev/md126
/dev/md126:
Container : /dev/md/imsm0, member 0
Raid Level : raid5
Array Size : 1953519616 (1863.02 GiB 2000.40 GB)
Used Dev Size : 976759808 (931.51 GiB 1000.20 GB)
Raid Devices : 3
Total Devices : 3
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Layout : left-asymmetric
Chunk Size : 64K
Consistency Policy : resync
UUID : 5aca63f5:865b38a5:fec0f38e:a41e95cb
Number Major Minor RaidDevice State
2 8 32 0 active sync /dev/sdc
1 8 16 1 active sync /dev/sdb
0 8 0 2 active sync /dev/sda
I guess that may be due to using hardware (BIOS) RAID as noted, but to me having 2 different arrays shown, both with the same devices, would be extremely confusing.
Personally I would rather have only one array defined and use mdadm instead of dealing with the confusion caused by the BIOS RAID, particularly since it seems you use mdadm for management anyway.
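For what it’s worth, a pure-mdadm array with native metadata would be set up roughly like this. Only do this on disks with no data you care about, since --create builds a new array from scratch (device names are just an example):
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf   # record the array so it assembles at boot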
I agree! That’s why there was confusion in the beginning. This way of getting an array working is also not well documented. But luckily, with your help and many sources, I began (with trial and error) to understand how my array was set up. Especially the errors caused many aha-moments.
Many thanks @computersavvy !