Fedora 41 - Unable to Access 4TB HDD - "Superblock Error" and I/O Errors (DID_BAD_TARGET)

Hello,

I am encountering a persistent issue in Fedora 41 Workstation where I am unable to access my 4TB HDD. When I try to mount the Btrfs partition on this drive, I get an “unable to read superblock” error. The dmesg output shows repeated “FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK” and I/O errors when Fedora 41 tries to access the drive.

This same 4TB HDD works perfectly fine in PopOS 24.04 and also worked without any issues in Fedora 40 live USB. This leads me to believe the problem is specific to Fedora 41’s environment or configuration.

Here’s a summary of the situation:

  • Operating System: Fedora 41 Workstation (problem OS), PopOS 24.04 (working OS), Fedora 40 (previously working OS, and Fedora 40 Live USB works).
  • Hard Drive: 4TB HDD, Model Family: Seagate BarraCuda 3.5 (SMR), Model: ST4000DM004-2CV1, formatted with Btrfs.
  • Problem: Cannot mount Btrfs partition on the 4TB HDD in Fedora 41. “Superblock error” during mount. dmesg shows DID_BAD_TARGET and I/O errors. lsblk shows 3.6T but still no mount. smartctl -a /dev/sda, parted /dev/sda print, and gdisk -l /dev/sda all fail with errors in Fedora 41.
  • Working Environments: The drive mounts and works correctly in PopOS 24.04 and Fedora 40 Live USB. smartctl reports “PASSED” health in PopOS and Fedora 40 Live.

Troubleshooting Steps Already Attempted in Fedora 41 (without success):

  • Disabled SELinux (for testing).
  • Tried explicitly mounting with sudo mount -t btrfs /dev/sda1 /mnt/test.
  • Booted Fedora 41 with older Fedora 41 kernel (kernel-6.8.0-63.fc41) installed via koji.
  • Booted Fedora 41 with kernel parameters libata.force=1.5:sda and libata.force=3.0:sda.
  • Changed SATA ports and potentially SATA controllers on the motherboard.

Key Command Outputs (Errors in Fedora 41):

  • dmesg (errors excerpt):
    [ ... ] sd 0:0:0:0: [sda] tag#14 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
    [ ... ] I/O error, dev sda, sector 2176 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
    [ ... ] Buffer I/O error on dev sda1, logical block 16, async page read
    [ ... ] Read Capacity(16) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
    [ ... ] Sense not available.
    [ ... ] sd 0:0:0:0: [sda] 0 512-byte logical blocks: (0 B/0 B)
    [ ... ] sda: detected capacity change from 7814037168 to 0
  • smartctl -a /dev/sda: Smartctl open device: /dev/sda failed: INQUIRY failed
  • parted /dev/sda print: Warning: Error fsyncing/closing /dev/sda1: Input/output error
  • gdisk -l /dev/sda: Warning! Read error 5; strange behavior now likely!
  • mount command: mount: /mnt/test: can't read superblock on /dev/sda1.

Successful Outputs (Fedora 40 Live USB):

  • dmesg (excerpt - showing successful mount):
    [ ... ] BTRFS: device label backup devid 1 transid 17 /dev/sda1 scanned by mount (3672)
    [ ... ] BTRFS info (device sda1): first mount of filesystem ...
    [ ... ] BTRFS info (device sda1): using crc32c (crc32c-intel) checksum algorithm
    [ ... ] BTRFS info (device sda1): using free-space-tree
  • smartctl -a /dev/sda: SMART overall-health self-assessment test result: PASSED
  • parted /dev/sda print: Shows correct partition table.

The same issue persists with the ext4 format as well. Let me know if you need any further information. I would be grateful for any suggestions.

I think your disk is failing. You can check/replace cables to eliminate that as the cause of the errors.

You could try sudo smartctl -x /dev/sda to see if you can get the SMART counters.

I tried sudo smartctl -x /dev/sda on PopOS 24.04 (as this HDD gives an I/O error on Fedora 41). The output is as follows:

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      4211         -

The HDD is not older than 1.5 years and also not used often. I am sure that the HDD is not failing, as it works perfectly on Fedora 40 Live USB and PopOS 24.04.

I have also tried changing SATA ports and cables. But it didn’t help.

You will need to report the problem to the linux kernel in that case.
They will want to know kernel versions that work and the versions that break.

FYI a fully updated f40 will have the same kernel as f41.
The live image of f40 just has an old kernel.

I am not sure if it is a kernel issue completely.

I had installed Fedora 40 and updated it to kernel version 6.13.4, and Fedora 41 also had the same kernel version at the time. But even then, the HDD worked on Fedora 40 and not on Fedora 41. The HDD doesn’t work on Fedora 41 Live USB as well, which comes with kernel version 6.11.3.

Should I report this to https://bugzilla.redhat.com/ ?

That’s a good place to file bugs against fedora.

But I’m not sure which component you would file the bug against, given you’re sure it is not a kernel issue.
The error messages are really convincing that it is a kernel issue, though, given you are sure it is not a hardware issue.

I am only sure about the hardware, as the HDD previously worked on f40 and currently works on f40 Live USB, PopOS 24.04, and also on Windows (I have not installed Windows on the same system).

As you said, and as I have also observed, a fully updated f40 will have the same kernel version as f41, so I can’t file a bug against the Linux kernel itself. Even though f40 and f41 have the same kernel version, they might be configured differently, as the exact kernel builds are kernel-6.13.4-100.fc40 and kernel-6.13.4-200.fc41, which are specific to f40 and f41. So I will file the bug report under kernel for the time being.

I am wondering if the culprit is some power-saving mechanism deployed for F41 that doesn’t play nicely with the sata interface…
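
One quick way to compare the two installs, assuming the mechanism in question is SATA link power management (that is a guess, nothing in this thread confirms it): print the policy each SATA host is running with on F40 and on F41 and see whether they differ. The function name and its optional directory argument are just for illustration.

```shell
# Print the SATA link power management policy of every SCSI/SATA host.
# The base directory is a parameter only so the function can be tried
# against a test directory; by default the real sysfs path is used.
show_lpm_policies() {
    base=${1:-/sys/class/scsi_host}
    for f in "$base"/host*/link_power_management_policy; do
        [ -r "$f" ] || continue   # skip when the glob matched nothing
        host=$(basename "$(dirname "$f")")
        echo "$host: $(cat "$f")"
    done
}

show_lpm_policies
```

If the values differ between the working and the non-working system, that would be worth including in a bug report.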

Given that the issue appears not to be widespread, this might be UEFI/BIOS or specific hardware. It could be useful to have the inxi -Fzxx output, and journalctl may show some power management details that differ between F40 and F41.

F41 switched to tuned, which may have different power profiles?

@barryascott Thank you for pointing out “tuned”. I replaced “tuned” with the older power profile tool “power-profiles-daemon”. The HDD works perfectly after this change.

@augenauf Thank you very much for pointing in the right direction.

Let me know if I need to add any additional information for future reference.

Can you raise a bug against tuned in the Fedora bug tracker with the details of what tuned broke, please?
They will want hardware details; lspci should provide a good starting point.

This seems to indicate a kernel issue, though the fact it does not work with an f41 live usb muddies it up a bit. The fact that it does not work with tuned and does work with power-profiles-daemon suggests it may be firmware related.

Have you tried using fwupdmgr to see if it may be the firmware on the hdd that is the issue? There are many more things in f41 that changed from what was originally in f40, not just the kernel. Many of those same software bits were also backported to f40. The original release iso for f40 would not have any of those changes.

Some of those changes also affected what hardware is supported (or at least the firmware versions on the hardware)

The posted dmesg is filtered so I can’t be sure, but this looks like an uncorrectable read error, i.e. a bad sector.

There are a couple of ways to deal with this.

  1. Issue a write to the full physical sector.
  2. Partition the drive to exclude this sector.

1a. Sector 2176 is within 2MiB of the start of the drive. If you don’t care about the data on the drive at all, you can overwrite everything in the first 5MiB. The overwrite will either fix the problem without any messages, or it might need to remap the LBAs for this bad sector to a reserve physical sector. As long as there aren’t any errors reported to the kernel indicating this failed, then you can probably assume it worked. And then repartition and reformat as you were planning.

This command is dangerous: if you point it to the wrong drive, it will cause irreversible data loss before you can cancel it.

dd if=/dev/zero of=/dev/sdX bs=128K count=40 oflag=direct

OR

1b. Overwrite only the single block. blockdev can help determine the physical and logical sector sizes. If they aren’t the same, this can be confusing. 512 logical and 4096 physical is common for hard drives; the problem is that the sector value in the error message is based on 512-byte sectors. If you attempt to write over that single logical sector, the drive firmware first has to read the whole 4096-byte physical sector (a read-modify-write), because it’s not possible to write only 512 bytes within a 4096-byte sector, and that read runs into the same bad sector.

Therefore, if you use dd to overwrite, you need to tell it to use 4096-byte blocks, convert the 512-byte logical block address in the error message to its 4096-byte block address, and then limit the count to overwriting just this one 4096-byte block.
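
A rough sketch of that conversion, assuming 512-byte logical and 4096-byte physical sectors (check with blockdev --getss --getpbsz on the real device) and using sector 2176 from the dmesg excerpt earlier in the thread. The function name is made up for illustration, /dev/sdX is a placeholder, and the dd line is only printed, not run.

```shell
# Convert a 512-byte logical sector number (as reported in dmesg) into the
# index of the 4096-byte physical block that contains it.
# Assumed sizes; verify with: blockdev --getss --getpbsz /dev/sdX
LOGICAL=512
PHYSICAL=4096

phys_block_index() {
    # Integer division: with these sizes, 8 logical sectors per physical block.
    echo $(( $1 * LOGICAL / PHYSICAL ))
}

BAD_LBA=2176
BLOCK=$(phys_block_index "$BAD_LBA")

# Only print the command; /dev/sdX is a placeholder for the real device.
echo "dd if=/dev/zero of=/dev/sdX bs=$PHYSICAL seek=$BLOCK count=1 oflag=direct"
```

With these numbers the block index comes out as 272, so the printed command uses seek=272. Double-check the device name before running anything like it for real.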

OR

  2. You can create a small partition with LBA start and end that includes the bad sector LBA. And then create a 2nd partition for the volume you want to create, and then format that partition normally.
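
For the partitioning route, something along these lines could work. It is only a sketch: it assumes 512-byte logical sectors, 1 MiB alignment, a fresh GPT label, and the bad sector 2176 from the dmesg excerpt; the partition names and helper functions are made up, /dev/sdX is a placeholder, and the parted commands are printed rather than executed.

```shell
# Compute 1 MiB-aligned bounds (in 512-byte sectors) for a small throwaway
# partition that contains the bad sector, plus the start of the data partition.
# Note: for a bad sector below 2048 the computed start would be 0, which
# collides with the partition table itself; this sketch does not handle that.
ALIGN=2048   # 1 MiB expressed in 512-byte sectors (assumed sector size)

quarantine_start() { echo $(( $1 / ALIGN * ALIGN )); }
quarantine_end()   { echo $(( $1 / ALIGN * ALIGN + ALIGN - 1 )); }

BAD_LBA=2176
Q_START=$(quarantine_start "$BAD_LBA")
Q_END=$(quarantine_end "$BAD_LBA")
DATA_START=$(( Q_END + 1 ))

# Only print the commands; mklabel wipes the existing partition table.
echo "parted -s /dev/sdX mklabel gpt"
echo "parted -s /dev/sdX mkpart badsector ${Q_START}s ${Q_END}s"
echo "parted -s /dev/sdX mkpart data ${DATA_START}s 100%"
```

The first partition is simply never formatted or mounted; only the second one gets the filesystem.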