Disk is likely to fail soon, but all SMART assessments are OK

I get the warning in red: DISK IS LIKELY TO FAIL SOON

This is a screenshot of gnome-disk-utility 46.1:

I’m surprised that all individual Assessments are OK.

What could be the root cause? Why are there no non-OK Assessments?

Any advice? (backup is running)

1 Like

I’ve started to get these warnings too; it might be related to this udisks2 update: FEDORA-2025-6ef0c40f95 — security update for udisks2 — Fedora Updates System

Not sure if it’s a bug that it now shows “might fail soon” warnings when the disk is actually OK, or if that warning should have been shown all along and the fact that it wasn’t was a bug that is now fixed.

4 Likes

Can you post the output of this (replace DISK with your disk’s name) as pre-formatted text, not a screenshot?

sudo smartctl -x /dev/DISK

With that info we can see if there is data to back up the fail-soon warning.
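If you’re not sure of the device name, listing the drives first helps (sda below is only an example; use whatever name lsblk shows for your disk):

lsblk -d -o NAME,MODEL,SIZE,TYPE
sudo smartctl -x /dev/sda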

1 Like

Yes, this looks the same as: 2374194 – During dnf update to udisks2-2.10.90-3.fc42, gnome-disks Overall Assessment changes to "DISK IS LIKELY TO FAIL SOON" despite no indication of new disk problems

I will attach the outcome of sudo smartctl -x /dev/DISK to that bug, and here.

1 Like

Here are two logs; in between I ran a SMART test:

log from smartctl -x at 16:31

log from smartctl -x at 16:46

Having the same issue, but only after I did an update today to Fedora Silverblue 42 after not using the computer for about a week:

$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/42/x86_64/silverblue
                  Version: 42.20250622.0 (2025-06-22T00:38:53Z)
               BaseCommit: 8dbd8496cbd83c9b811f321653ed314846edb4939d593b20e8e8d04ed58fc17f
             GPGSignature: Valid signature by B0F4950458F69E1150C6C5EDC8AC4916105EF944
      RemovedBasePackages: firefox firefox-langpacks 139.0.4-1.fc42
          LayeredPackages: htop syncthing tmux

  fedora:fedora/42/x86_64/silverblue
                  Version: 42.20250616.0 (2025-06-16T01:23:48Z)
               BaseCommit: b4e3d94d084834d874fdb655981a5458f943ad8472c64c119ae52a082b17e504
             GPGSignature: Valid signature by B0F4950458F69E1150C6C5EDC8AC4916105EF944
      RemovedBasePackages: firefox firefox-langpacks 139.0.4-1.fc42
          LayeredPackages: htop syncthing tmux

The disk in question is the second drive in my laptop, which is used for OpenBSD. It’s a SATA SSD.

1 Like

I can confirm that the “disk likely to fail soon” warning only comes up in the newer deployment. The older one doesn’t give the warning.
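For anyone else who wants to compare: on Silverblue you can boot the older deployment from the boot menu, or make it the default again with a rollback (just a sketch; check rpm-ostree status first):

rpm-ostree rollback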

Disks says, “Disk is OK, 17 bad sectors”:

2 Likes

It seems you are getting close to failure:

202 Percent_Lifetime_Remain ----CK   089   089   001    -    11

That seems to be saying that your SSD has only 11% of its life left.

And these seem to indicate issues reading and writing:

  1 Raw_Read_Error_Rate     POSR-K   100   100   000    -    16
  5 Reallocate_NAND_Blk_Cnt -O--CK   100   100   010    -    7
206 Write_Error_Rate        -OSR--   100   100   000    -    4

And you have had the drive powered up for about 2.5 years:

  9 Power_On_Hours          -O--CK   100   100   000    -    21672
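For reference, that figure is just the raw hours divided by the hours in a year: 21672 / 8760 ≈ 2.5 years of power-on time.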
1 Like

Even then, I do find it strange that at least dozens of people (looking at this thread, bugzilla, the udisks2 bodhi update, and reddit) started getting the “DISK LIKELY TO FAIL SOON” warning only after a recent update, even though SMART self-tests seem to indicate everything is still OK or at least within thresholds.

The only change in udisks2 that was pushed recently was this one:
https://src.fedoraproject.org/rpms/udisks2/c/8ac11b153fea9edfa138b746c97456222d9d19ef?branch=f42

Which does not look like it could cause this to happen at all, so I’m very confused.

1 Like

Any chance it was the recent update to libblockdev?

The detail is well beyond my grasp really, but did the latest commit remove a SMART-related patch?

- # https://issues.redhat.com/browse/RHEL-80620
- Patch1:      libatasmart-overall_drive_self-assessment.patch 

That removed patch seems to have been specifically designed to prevent reporting of failure warnings by libatasmart which (in the patch author’s opinion) were overly sensitive:

The libatasmart attribute overall status differs slightly from
the drive SMART self-assessment and is very sensitive for particular
status values. Such status should fit more like a pre-fail warning,
no reason to fail hard the global assessment. Even a single reallocated
sector would cause a warning, while the drive could be quite healthy
otherwise.

If “even a single reallocated sector” can cause a warning in the absence of the patch, it would explain why the disks in the screenshots in this thread (@janvlug’s with 7 reallocated sectors and @passthejoe’s with 17) started to report imminent failure with the latest (unpatched) libblockdev, whereas they hadn’t done so with the previous patched version.
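If you want to see where your own drive stands on those counters, something like this should work (attribute names vary by vendor, so the grep pattern is only a rough guess):

sudo smartctl -A /dev/sda | grep -i -E 'realloc|pending'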

4 Likes

Without the SMART data it is not possible to know for sure.

At least in this case the warning is correct.

2 Likes

It looks like libblockdev 3.3.1-2 (currently in Rawhide) restores the patch and presumably will restore the previous reporting behaviour: Commit - rpms/libblockdev - c3a88ad70a91b2ed89ebdd6c0d727c7d45ba7c8a - src.fedoraproject.org
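To check which libblockdev and udisks2 builds are currently installed, a plain rpm query should work (on Silverblue this queries the booted deployment):

rpm -q libblockdev udisks2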

1 Like

I have three Crucial/Micron SSDs, and the last number, 11 here, indicates the percentage of life used so far, not remaining. In other words, it increases starting from 0. Therefore, 89% of life remains. Maybe it’s particular to Crucial, I don’t know, but I can assure you it’s meant to increase, not decrease. Here’s my new SSD’s data:

202 Percent_Lifetime_Remain 0x0030   100   100   001    Old_age   Offline      -       0

https://www.perplexity.ai/search/how-to-interpret-this-ssd-data-sbaEYK0kQXaELZjFONeiKA?login-new=true&login-source=visitorGate
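Read that way, the two outputs are consistent: the older drive above shows a normalized value of 089 with a raw value of 11 (11% used, 89% left), while this new one shows 100 with a raw value of 0 (essentially unworn).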

2 Likes

So what you’re saying is either there is a miscommunication somewhere in the stack concerning the intent of the field, or somehow Crucial has created Benjamin Button drives: drives that age backwards in time.

A short search for a SMART spec turned up nothing recent.
My SSDs do not have the 202 attribute.

It used to be necessary to sign an NDA to get the true meaning of a manufacturer’s SMART attributes.

1 Like

Exactly the same on my Crucial drive (which is 3 years old or so).

Wow, it’s like new. My Windows SSD, around 6 years old, is at 83%, and my Fedora SSD, purchased in October 2023, is at 80% (it ages faster due to the twice-yearly upgrades since F38).

This one isn’t my OS drive but it’s had a decent amount of data written to it.

So if I trust the data and assume that 0 is valid to the nearest integer percentage, then the drive could have gone through up to 0.5% of its life… it should last another 597 years!!
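(Rough arithmetic, assuming a constant wear rate: at most 0.5% used over roughly 3 years is about 0.17% per year, so a full 100% would take around 600 years, i.e. about 597 more.)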

Did you turn off atime updating in the mount options?
That removes a lot of metadata writes to an SSD that are often not useful.

In fact I’ve been using relatime. It seems to be the Fedora default; correct me if I’m wrong.

For the LUKS btrfs partition → btrfs rw,relatime,compress=zstd:1,ssd,discard=async,space_cache=v2,commit=300,subvolid=405,subvol=/root

For the boot → ext4 rw,relatime

What didn’t help is that I restored from backup using dd because of LUKS. I stopped doing that and now use partclone.btrfs since I learned about it. But prior to that I had several full-SSD dd restores on my account due to failed upgrades or snapshots that went wrong.

I use commit=300 because this PC is immune to power failure.

Here’s what Grok has to say about relatime vs atime, if that info can be trusted →

atime vs. relatime

atime (Access Time)

  • Definition: Updates the access time (last read time) of a file or directory every time it is accessed (e.g., read, executed).
  • Mount Option: Historically the default full-atime behaviour; on current kernels it has to be requested explicitly (strictatime).
  • Impact on SSD Longevity: Increases write operations because each file access updates the inode on disk, leading to more wear on SSDs.
  • Use Case: Useful for tracking file access (e.g., auditing, backups), but rarely needed in modern systems.
  • Performance: Higher I/O overhead, slightly reducing performance and accelerating SSD wear.

relatime (Relative Access Time)

  • Definition: Updates the access time only if the previous atime is older than the modification time (mtime) or change time (ctime), or if the previous atime is more than 24 hours old. It’s a compromise between atime and noatime.
  • Mount Option: Default on most modern Linux distributions (including Fedora 42) unless overridden.
  • Impact on SSD Longevity: Significantly reduces write operations compared to atime, as it avoids frequent updates, making it SSD-friendly.
  • Use Case: Balances access tracking with performance, suitable for most general-purpose systems.
  • Performance: Lower I/O overhead than atime, improving longevity and efficiency.

Comparison

Feature          atime                      relatime
Access update    Every read                 Only if older than mtime or 24h
SSD wear         Higher                     Lower
Performance      Slower due to more I/O     Faster, less I/O
Use case         Auditing, specific apps    General use, defaults
Default          Rarely used now            Common default

Recommendation

  • Use relatime for SSDs (and most systems) to minimize wear while retaining some access tracking.
  • Switch to noatime if you don’t need access time at all, offering the best longevity (e.g., for servers or read-heavy workloads).
  • Avoid atime on SSDs unless required for specific compliance or debugging purposes.
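
For anyone who wants to follow the noatime suggestion above, the change is just the mount option in /etc/fstab. A minimal sketch based on the btrfs line earlier in the thread (the UUID is a placeholder; keep your own UUID, subvolume, and any other options):

UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /  btrfs  noatime,compress=zstd:1,ssd,discard=async,space_cache=v2,commit=300,subvol=/root  0 0

A reboot applies it, or you can test first with mount -o remount,noatime /.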