A question about a faulty disk

this question might seem silly, but I’m a computer newbie, so :slightly_smiling_face:
I purchased my laptop (and the hdd in it) back in 2018, and I guess that can be enough to wear out an hdd.

the issue is, just 2 weeks after installing Fedora, the system and the apps get noticeably slower, lags now happen a lot more often, sometimes some apps crash. At the same time, first few days after install, everything is very fast, no issues.

someone said it’s probably because, some time after install, I reach my hdd’s faulty sectors, after I told them that my hdd has tons of read and/or write errors (and some other errors too). This was found out through a SMART scan (which somehow still gave it a “Disk is OK” overall assessment). Sending the results of it here.

I know you’d advise me to replace it with a good ssd. I would totally do that, if there was an opportunity right now. Yes, my situation is that difficult. Perhaps someday I can. We’ll see.

so while it hasn’t happened yet, my question is:
is there a way to determine which exact sectors of my disk are faulty, where they are, and then somehow reinstall my system and lay it out while avoiding them? My hdd is around 900 GB, and my system uses from 20 to 50 GB on average, so… perhaps it’s somehow possible?

Hi Blind,

Looking at the report from SMART, your HDD is fine … so, let’s see if we can figure out the actual problem :slight_smile:

Here are a couple of things to look at:

  1. open a terminal / command line
  2. type “top” at the prompt
    How much MiB Mem do you have? (RAM)
    How much MiB Swap (total, free,used)
  3. when things seem to be slowing down, hit 1 on the keyboard in the top window
    Here is what to look for:
  4. look at the CPU utilization per CPU — are any pegged at 100%?
  5. are there any processes hitting 100% or greater?
  6. in the column with “n.n wa,” what values are you seeing there?

Post the results here and I’ll have a look to see if I can help :slight_smile:
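If interactive top is awkward to copy from, the same snapshot can be captured non-interactively for pasting (a quick sketch; `-b` is top’s batch mode):

```shell
# One batch-mode snapshot of top: the summary lines show the load average,
# tasks, the %Cpu(s) line (including "wa", I/O wait), memory, and swap.
top -b -n 1 | head -n 5
```

Inside interactive top, pressing 1 toggles the single %Cpu(s) summary line into one line per core, which makes a single pegged core easy to spot.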


The drive will attempt to replace bad sectors itself, but only in response to a failure to read data (not sure a write will work).

I am not sure what the best way is to force the reallocation.
Maybe try reading the whole disk with dd?
Others may have better suggestions.
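For what it’s worth, a sketch of that idea (the device name /dev/sda below is an assumption, check yours with lsblk; the runnable demo at the end reads an ordinary file instead, so no real disk is touched):

```shell
# On a real drive you would read every sector, ignoring errors, e.g.:
#   sudo dd if=/dev/sda of=/dev/null bs=1M conv=noerror status=progress
# or log unreadable blocks with badblocks and hand the list to ext4
# (N is your partition number):
#   sudo badblocks -b 4096 -sv /dev/sda > bad-blocks.txt
#   sudo e2fsck -f -l bad-blocks.txt /dev/sdaN
# Harmless demo of the same read pattern on a plain file:
f=$(mktemp)
dd if=/dev/urandom of="$f" bs=4096 count=256 status=none   # 1 MiB sample
dd if="$f" of=/dev/null bs=4096 conv=noerror status=none && echo "read scan completed"
rm -f "$f"
```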

Hi Barry,

Most HDD/SSD will relocate data automatically when a sector is marked bad by the on-board disk controller … IF there are any good sectors available … Per the report, the Reallocated Sector Count is 0 … so, not likely bad sectors

hmm, I’m surprised, considering that it does show that at least 1 sector is unfixable

just in case I misunderstood the questions, here are the results:
(also, keep in mind, this is results with Firefox playing a video, having this forum open, and the Discord app too)

top - 23:44:08 up  4:18,  2 users,  load average: 3,58, 3,25, 3,16
Tasks: 329 total,   1 running, 327 sleeping,   0 stopped,   1 zombie
%Cpu(s): 15,2 us,  4,9 sy,  0,0 ni, 77,5 id,  0,0 wa,  1,8 hi,  0,6 si,  0,0 st 
MiB Mem :   6877,2 total,    570,7 free,   4148,8 used,   2449,0 buff/cache     
MiB Swap:   6877,0 total,   6013,0 free,    864,0 used.   2728,4 avail Mem 

hit 1 in the top window?.. I’m sorry, I don’t quite get that
also, now that you say “when things seem to be slowing down.”
it does (obviously) happen when opening a lot of tabs. But there were also several times when the system was very slow right after launch. I would only open the terminal, and even that would take minutes to launch. Also, every first-launch-per-session of Firefox is really slow, the app becomes unresponsive a lot, until a few minutes later.

is that visible from the result I sent, or…?

not any that I could see, no

… to be honest with you, I can’t find that column anywhere lol

you mean, to just run dd (this exact command, with nothing added to it)?

oh right, I forgot about another issue that I suppose I should mention.
I use ethernet, and sometimes, when opening a lot of tabs at the same time (usually with at least one video playing), my internet connection stops loading anything, and its tray icon goes “?”. My connections go up to 5000 or 10000 (I know this through an app that shows active connections). It is fixed easily by closing Firefox and waiting a minute. Then everything loads again, the tray icon looks normal, and I can reopen Firefox. This happens at least once or twice a day.
I don’t experience any of this on any other devices connected to the same network.

Hi Blind,

%Cpu(s): 15,2 us,  4,9 sy,  0,0 ni, 77,5 id,  **0,0 wa**,  1,8 hi,  0,6 si,  0,0 st 
MiB Mem :   6877,2 total,    570,7 free,   4148,8 used,   2449,0 buff/cache     
MiB Swap:   6877,0 total,   6013,0 free,    864,0 used.   2728,4 avail Mem 

Ok, you have 6.8GB of RAM installed with 4GB consumed
and
You are starting to use a little swap (0.86GB)
You are not waiting on disk I/O (0,0 wa) at the moment

Ok, next small test, let’s see how fast the disk is …

  1. open a terminal
  2. type cd – let’s make certain we are in your home directory
  3. dd if=/dev/zero of=TEST.DD bs=4096 count=1M — this will create a file called TEST.DD and fill it with zeros just as fast as the disk can take it
  4. do #3 about 10 times in fast succession — we want to fill the buffer so that we can get the buffer-to-platter transfer rate
  5. now do dd if=TEST.DD of=/dev/null — this gives us an idea of how fast we can read from the same disk
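Scaled down so it finishes quickly, steps 2–5 look roughly like this (bump count back up to 1M for the real measurement):

```shell
cd "$HOME" || cd /tmp
# write: 100 MiB of zeros, as fast as the disk will take them
dd if=/dev/zero of=TEST.DD bs=4096 count=25600
# read it back from the same disk
dd if=TEST.DD of=/dev/null bs=4096
ls -l TEST.DD      # 4096 * 25600 = 104857600 bytes
rm -f TEST.DD
```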

What are your results?

… Hmmm …
Model and brand of your computer?

1048576+0 records in
1048576+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 29,2285 s, 147 MB/s

… I hope I understood this correctly, and didn’t make a bad mistake.
I opened a lot of terminal tabs, running this same command in each one (simultaneously). I thought I should do that, since you said “fast succession”, but I can’t get fast because each command takes like a minute to finish.
some of the tabs would take a few seconds to actually make the blind@linux: prompt, so it wasn’t a perfect fast succession. But well.
I stopped the video, thinking it would just freeze. My computer was responsive at first. After a few minutes, it all froze, the connection went “?” again. A few minutes later, it mostly unfroze. The commands started finishing gradually, took about 10-15 minutes for all of them to finish. Here are the results of the last one (the rest of them are pretty similar):

1048576+0 records in
1048576+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 1215,94 s, 3,5 MB/s

I really hope that doing all those commands at the same time didn’t break anything about my system. Please let me know if that’s the case.

8388608+0 records in
8388608+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 199,936 s, 21,5 MB/s

ASUS X550IK
more info just in case:
CPU: AMD FX-9830P RADEON R7, 12 COMPUTE CORES 4C+8G (4) @ 3.00 GHz
GPU 1: AMD Radeon RX 560 Series [Discrete]
GPU 2: AMD Radeon R7 Graphics [Integrated]

Hi Blind,

1048576+0 records in
1048576+0 records out
4294967296 bytes (4,3 GB, 4,0 GiB) copied, 29,2285 s, 147 MB/s

Ok, this tells me that the maximum transfer rate is likely about 1.2 Gbps (147 MB/s × 8 bits) bus to buffer

Yeah, you can’t do a good test running the writes in parallel … but we have enough info to give at least a good guess …
So, the disk is not very fast: about 21 MB/s read and roughly 5 MB/s ~ 10 MB/s write
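As a sanity check on the units (megabytes per second to megabits per second):

```shell
# 147 MB/s * 8 bits per byte = megabits per second
echo $((147 * 8))   # prints 1176, i.e. roughly 1.2 Gbps
```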

The “stalls” are likely being caused by a combination of things:

  1. slow disk — this manifests as a system-wide stall when severe – everything is blocked waiting for I/O

  2. not much RAM — this makes things even worse – writing to swap

  3. CPU not exactly a rocket-ship … and because of #1 and #2 CPU is consumed with spin-locks / blocked waiting for the I/O to disk …

I am willing to bet that while you were running these small tests that the CPU was maxed and the I/O Wait (wa) was also quite high …

Oddly enough … I have the same laptop (the one I usually lend to friends when theirs go kaput for some reason) … It’s not very powerful to begin with :slight_smile:
But, it’s reliable and does not overheat when loaded up with work … on the down side, like yours, it stalls a lot when loaded up with a lot of open browser windows … or multiple applications running concurrently …

What can you do without buying a new machine?

  1. put an SSD in it – this will help with the slow program loads
  2. See if you have one open memory slot … mine has just one so … you could add 8GB~16GB of RAM — this will keep you from using swap when you have a lot of concurrent apps running

This will help “some” but it is still going to be a relatively slow machine as compared to something a couple of years newer …

Good news though … your HDD is healthy :slight_smile:


Your CPU is definitely under high load.

(It could be caused by processes waiting for I/O, disk or network).
Best would be to identify which process is taking up that much CPU queue time.
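One quick way to do that, as a sketch (iotop is an assumption here; it may need installing first, e.g. with dnf):

```shell
# top 5 CPU consumers right now, sorted by %CPU
ps -eo pid,comm,%cpu --sort=-%cpu | head -n 6
# per-process disk I/O, showing only processes actually doing I/O:
#   sudo iotop -o
```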


hmm… shouldn’t we count the first result though (before all the simultaneous commands)?
because it said 29,2285 s, 147 MB/s

well, perhaps? I did mention that I had Discord and Firefox open, with a video playing, and well, this forum open. If all this counts, then yeah.

yeah, I guess so. I wasn’t expecting A LOT from it. Just sometimes the slowness really seemed out of place. Sometimes it would take the terminal a few minutes to start (after a fresh computer start), and a regular sudo dnf upgrade --refresh would take several minutes to just list the packages to update and prompt me “Is this ok? [yes/no]”. Again, both those things would work much faster first few days after Fedora install. Everything would, in fact, be much more responsive and fast. But after 2 weeks, all this. It just feels weird. And sort of impacts my day-to-day computer usage sometimes

yeah… perhaps someday, if I ever get an option to do that. Same about more RAM.

are you completely sure? Taking a look at every parameter from the SMART scan?
(sorry, I guess I just got tired from reinstalling my system, and don’t wanna do that again for a while)

also… you still didn’t say. Could anything about my system break, after running those commands simultaneously?

what would be the correct way to do that?

Hi Blind,

No, this will not break/damage your machine … UNLESS it overheats … and that machine you have is pretty good about staying cool under heavy load unless you plug/cover the fan vents …


good to know, thanks
and thank you a lot for helping me in general


@einer sorry, I also forgot one question. Someone was suggesting defragmentation of the disk. Would that help? Especially considering I still have a Windows 10 partition on this drive, which idk if I ever defragmented (but I don’t use it anyway).
or does a newly installed Fedora need no defragmentation at all?

Fedora/Linux don’t generally have disk fragmentation issues … and defragging Windows will have no effect on Linux/Fedora UNLESS you are running Fedora/Linux as a virtual machine within Windows, with a virtual disk on a Windows filesystem … :slight_smile:

Maybe the drive’s firmware can map the bad sectors away if you invoke the sanitize operation.

BTW in my experience, /dev/zero is not going to produce a useful test, try /dev/urandom (or some other source of more varied data).

Hi Stephen,

The idea behind using /dev/zero is that it takes minimal CPU, where /dev/random or /dev/urandom has to generate pseudo-random data in the kernel … either will work and produce about the same results IF the CPU and RAM are pushing data to the disk at about the same rate :slight_smile:
Where /dev/urandom and /dev/random are most fun is when you are pushing the data through a compressor … you get a bit more real world results vs /dev/zero :slight_smile:

@einer,

With RLL encoding, on-disk cache, and HDD firmware optimizations, writing zeros is not a valid test. Then add in OS buffering and optimizations and it becomes an even less valid test. Some of this can be worked around: a common technique is to dd from one storage device populated with representative data to the device being tested, as that puts no load on the CPU, and having the source and destination storage devices on different busses pretty much eliminates contention as well. If you only have the one storage device, /dev/urandom (some implementations have little impact on CPU utilization) is better than /dev/zero. Using /dev/zero can provide a little information, but in this case, where many seek errors are being counted by the drive firmware, a better test is needed. Adding seeks to different parts of the drive could be helpful as well, but this additional complexity can be added later.

There are some good articles on the topic which will take some time to track down again. Rereading them will likely cause me to have to adjust my current understanding somewhat as well. Nevertheless the testing I have done showed quite starkly that /dev/zero test results were generally unhelpful.
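A rough way to compare the cost of the two sources on a given machine (writing to a temp file at small sizes, not a raw device; note /dev/urandom has to generate its data in the kernel, so on a slow CPU the source itself can become the bottleneck):

```shell
f=$(mktemp)
echo "from /dev/zero:"
dd if=/dev/zero of="$f" bs=1M count=64 2>&1 | tail -n 1     # dd reports the rate on stderr
echo "from /dev/urandom:"
dd if=/dev/urandom of="$f" bs=1M count=64 2>&1 | tail -n 1
rm -f "$f"
```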

Hi Stephen,

All good points … that’s why I try to remember to qualify this type of test as “quick-and-dirty” … the object is to get a general idea of what the device can do. Also, going from one storage device to another adds the limitation that the device being read may not be capable of saturating the writing device … so, not such a good test … :slight_smile:
Are there better more comprehensive tests that can be done? … you bet!
Are those tests as available as using dd? … not usually :slight_smile:
