Thanks in advance for looking.
Fedora Server 38.
I have built a RAID6 array of SSDs; this is the mdadm --detail output:
/dev/md127:
Version : 1.2
Creation Time : Sun Nov 19 07:54:31 2023
Raid Level : raid6
Array Size : 19534423040 (18.19 TiB 20.00 TB)
Used Dev Size : 3906884608 (3.64 TiB 4.00 TB)
Raid Devices : 7
Total Devices : 7
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Sun Nov 19 15:36:27 2023
State : active
Active Devices : 7
Working Devices : 7
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : zone9:store2-md (local to host zone9)
UUID : ccfd514d:7e68060e:530338f7:a0e0a9e3
Events : 3593
Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
1 8 49 1 active sync /dev/sdd1
2 8 65 2 active sync /dev/sde1
3 8 81 3 active sync /dev/sdf1
4 8 97 4 active sync /dev/sdg1
5 8 113 5 active sync /dev/sdh1
6 8 129 6 active sync /dev/sdi1
Every device in the array (sdc through sdi) supports TRIM/DZAT (deterministic read zeroes after TRIM):
# lsblk -Dd
NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda 0 0B 0B 0
sdb 0 0B 0B 0
sdc 0 512B 128M 0
sdd 0 512B 128M 0
sde 0 512B 128M 0
sdf 0 512B 128M 0
sdg 0 512B 128M 0
sdh 0 512B 128M 0
sdi 0 512B 128M 0
sdj 0 0B 0B 0
sdk 0 0B 0B 0
sdl 0 0B 0B 0
sdm 0 0B 0B 0
sdn 0 512B 128M 0
sdo 0 0B 0B 0
zram0 0 4K 2T 0
nvme0n1 0 512B 2T 0
# hdparm -I /dev/sd[cdefghi] | grep -i trim
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM
* Data Set Management TRIM supported (limit 8 blocks)
* Deterministic read ZEROs after TRIM
The raid456 module parameter devices_handle_discard_safely is enabled:
# cat /sys/module/raid456/parameters/devices_handle_discard_safely
Y
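(For reference, this can be made persistent with a standard modprobe.d fragment; the filename below is arbitrary, any .conf name under /etc/modprobe.d works:)

```shell
# /etc/modprobe.d/raid456.conf  (filename is arbitrary)
# options raid456 devices_handle_discard_safely=Y

# Verify the running value once the module is loaded:
cat /sys/module/raid456/parameters/devices_handle_discard_safely
```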
The kernel log shows problems with the drives during fstrim operations:
[ 6391.089300] sd 6:0:1:0: [sdd] tag#4386 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[ 6391.089302] scsi target6:0:1: handle(0x001a), sas_address(0x300062b203e56bc2), phy(2)
[ 6391.089305] scsi target6:0:1: enclosure logical id(0x500062b203e56bc0), slot(0)
[ 6391.089308] scsi target6:0:1: enclosure level(0x0000), connector name( )
[ 6391.089310] sd 6:0:1:0: No reference found at driver, assuming scmd(0x0000000074119868) might have completed
[ 6391.089312] sd 6:0:1:0: task abort: SUCCESS scmd(0x0000000074119868)
[ 6391.596882] sd 6:0:6:0: Power-on or device reset occurred
[ 6391.597892] sd 6:0:1:0: Power-on or device reset occurred
[ 6426.358689] sd 6:0:3:0: attempting task abort!scmd(0x00000000f67e2fd4), outstanding for 30110 ms & timeout 30000 ms
[ 6426.358694] sd 6:0:3:0: [sdf] tag#5166 CDB: Write(16) 8a 08 00 00 00 00 00 00 08 10 00 00 00 08 00 00
[ 6426.358695] scsi target6:0:3: handle(0x001c), sas_address(0x300062b203e56bc4), phy(4)
[ 6426.358697] scsi target6:0:3: enclosure logical id(0x500062b203e56bc0), slot(7)
[ 6426.358698] scsi target6:0:3: enclosure level(0x0000), connector name( )
[ 6426.388344] sd 6:0:3:0: task abort: SUCCESS scmd(0x00000000f67e2fd4)
[ 6426.388386] sd 6:0:3:0: attempting task abort!scmd(0x0000000083d2ef3a), outstanding for 30140 ms & timeout 30000 ms
[ 6426.388394] sd 6:0:3:0: [sdf] tag#5161 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[ 6426.388398] scsi target6:0:3: handle(0x001c), sas_address(0x300062b203e56bc4), phy(4)
[ 6426.388405] scsi target6:0:3: enclosure logical id(0x500062b203e56bc0), slot(7)
[ 6426.388410] scsi target6:0:3: enclosure level(0x0000), connector name( )
[ 6426.388415] sd 6:0:3:0: No reference found at driver, assuming scmd(0x0000000083d2ef3a) might have completed
[ 6426.388419] sd 6:0:3:0: task abort: SUCCESS scmd(0x0000000083d2ef3a)
[ 6427.097362] sd 6:0:3:0: Power-on or device reset occurred
[ 6462.709675] sd 6:0:3:0: attempting task abort!scmd(0x000000007420dd31), outstanding for 30408 ms & timeout 30000 ms
[ 6462.709688] sd 6:0:3:0: [sdf] tag#597 CDB: Write(16) 8a 08 00 00 00 00 00 00 08 10 00 00 00 08 00 00
[ 6462.709692] scsi target6:0:3: handle(0x001c), sas_address(0x300062b203e56bc4), phy(4)
[ 6462.709699] scsi target6:0:3: enclosure logical id(0x500062b203e56bc0), slot(7)
[ 6462.709703] scsi target6:0:3: enclosure level(0x0000), connector name( )
[ 6462.739637] sd 6:0:3:0: task abort: SUCCESS scmd(0x000000007420dd31)
[ 6462.739678] sd 6:0:3:0: attempting task abort!scmd(0x00000000d7fa046e), outstanding for 30438 ms & timeout 30000 ms
[ 6462.739686] sd 6:0:3:0: [sdf] tag#592 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[ 6462.739690] scsi target6:0:3: handle(0x001c), sas_address(0x300062b203e56bc4), phy(4)
[ 6462.739698] scsi target6:0:3: enclosure logical id(0x500062b203e56bc0), slot(7)
[ 6462.739702] scsi target6:0:3: enclosure level(0x0000), connector name( )
[ 6462.739707] sd 6:0:3:0: No reference found at driver, assuming scmd(0x00000000d7fa046e) might have completed
[ 6462.739711] sd 6:0:3:0: task abort: SUCCESS scmd(0x00000000d7fa046e)
[ 6463.347964] sd 6:0:3:0: Power-on or device reset occurred
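The sas_address lines above suggest the drives sit behind a SAS HBA, so one diagnostic avenue (results not shown here) would be checking what the SCSI layer actually reports for thin-provisioning support on a member. sg_vpd is from the sg3_utils package; the exact output will vary by drive:

```shell
# Query the Logical Block Provisioning and Block Limits VPD pages
# for one array member:
sg_vpd --page=lbpv /dev/sdf   # LBPU bit = UNMAP command supported
sg_vpd --page=bl /dev/sdf     # maximum unmap LBA count, descriptor count

# How the sd driver is issuing discards for this device
# (unmap, writesame_16, disabled, ...):
cat /sys/block/sdf/device/scsi_disk/*/provisioning_mode
```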
I’m out of ideas for how to fix this, though I am also wondering why DISC-GRAN
is larger for the md device than for the underlying hardware:
NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sdc 0 512B 128M 0
└─sdc1 0 512B 128M 0
└─md127 0 4M 128M 0
└─store2--vg-store2--lv 0 4M 128M 0
sdd 0 512B 128M 0
└─sdd1 0 512B 128M 0
└─md127 0 4M 128M 0
└─store2--vg-store2--lv 0 4M 128M 0
sde 0 512B 128M 0
└─sde1 0 512B 128M 0
└─md127 0 4M 128M 0
└─store2--vg-store2--lv 0 4M 128M 0
sdf 0 512B 128M 0
└─sdf1 0 512B 128M 0
└─md127 0 4M 128M 0
└─store2--vg-store2--lv 0 4M 128M 0
sdg 0 512B 128M 0
└─sdg1 0 512B 128M 0
└─md127 0 4M 128M 0
└─store2--vg-store2--lv 0 4M 128M 0
sdh 0 512B 128M 0
└─sdh1 0 512B 128M 0
└─md127 0 4M 128M 0
└─store2--vg-store2--lv 0 4M 128M 0
sdi 0 512B 128M 0
└─sdi1 0 512B 128M 0
└─md127 0 4M 128M 0
└─store2--vg-store2--lv 0 4M 128M 0
How do I fix this, and what are my options?
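One workaround I could try, in case anyone can confirm whether it helps: capping the per-command discard size on each member, so that a single UNMAP can't outrun the 30 s SCSI command timeout seen in the logs. discard_max_bytes is writable on reasonably recent kernels (clamped to discard_max_hw_bytes); the 32 MiB figure below is a guess, not a tested value:

```shell
# Runtime-only cap on discard size per array member (resets on reboot;
# a udev rule would be needed to make it persistent). 32 MiB is a guess.
for d in sdc sdd sde sdf sdg sdh sdi; do
    echo $((32 * 1024 * 1024)) > /sys/block/$d/queue/discard_max_bytes
done

# Confirm the new limits:
cat /sys/block/sd[cdefghi]/queue/discard_max_bytes
```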