Bad performance on mechanical disks

Hello. This is my first post. I am a new Linux and Fedora user.

TL;DR: I am getting very bad performance in Fedora 42. Please help me diagnose it.

A few months ago the whole W11 thing finally pushed me out of the MS ecosystem. I looked around and chose the Fedora KDE spin as my distro.
Everything looks nice (sometimes the functionality is rough around the edges, but that is fine). The performance, on the other hand, is abysmal on my system. I will use W10 as a benchmark, not to disparage, but because it is the only comparison I have access to.
The first week using the system was hell (by modern standards). Every single operation took ages and my HDDs worked non-stop. Investigating with System Monitor, I disabled the file indexer and things became manageable. The performance improved a lot, but it is still VERY slow.
Comparing boot times and application load times (for programs that I have on both W10 and F42, like LibreOffice, GIMP, Firefox and Audacity), the F42 apps (RPM, Flatpak and/or AppImage) all take 2x to 6x longer to open. And the 2x case is GIMP with a ton of plugins in W10 and none in F42.
When actually using the programs the performance is fine, except when there is I/O involved, so I think it might be related to how F42 is managing my HDDs.
I have an all-HDD setup and I was hoping to buy an NVMe drive to try to solve this, but I have read lots of people saying that on Linux we should try to diagnose first instead of buying new hardware, so here I am.
I am not trying to achieve SSD performance on HDDs, I am just aiming for a W10-like level of performance if possible.
Is F42 optimized for SSDs? Is there any configuration I should have done? I have tried reducing the system's swappiness and disabling SELinux, but those did not produce significant gains like disabling indexing did.
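For reference, the tweaks I mentioned can be done with commands roughly like these (a sketch; balooctl6 is the Plasma 6 name, it is plain balooctl on older versions, and the swappiness/SELinux changes shown here only last until reboot):

$ balooctl6 status                       # check the KDE file indexer (Baloo)
$ balooctl6 disable && balooctl6 purge   # stop indexing and delete the existing index
$ sudo sysctl vm.swappiness=10           # lower swappiness temporarily
$ sudo setenforce 0                      # put SELinux in permissive mode temporarily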
I have seen reports of slowdowns in F42 for certain users, but that looked like an AMD problem.

Operating System: Fedora Linux 42
KDE Plasma Version: 6.4.5
KDE Frameworks Version: 6.19.0
Qt Version: 6.9.2
Kernel Version: 6.17.4-200.fc42.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 6 × Intel® Core™ i5-8400 CPU @ 2.80GHz
Memory: 16 GiB of RAM (15,5 GiB usable)
Graphics Processor: Intel® UHD Graphics 630
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: H310M S2H

Model Family: Seagate BarraCuda 3.5 (SMR)
Device Model: ST8000DM004-2CX188
Rotation Rate: 5425 rpm

Model Family: Seagate Exos X16
Device Model: ST14000NM001G-2KJ103
Rotation Rate: 7200 rpm

Model Family: Seagate Desktop HDD.15
Device Model: ST4000DM000-2AE166
Rotation Rate: 5980 rpm

Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd3       3,7T  3,0T  693G  82% /
devtmpfs        7,8G     0  7,8G   0% /dev
tmpfs           7,8G   12K  7,8G   1% /dev/shm
tmpfs           3,1G  1,7M  3,1G   1% /run
tmpfs           7,8G  4,2M  7,8G   1% /tmp
/dev/sdd2       974M  403M  504M  45% /boot
/dev/sdd1       599M   20M  580M   4% /boot/efi
/dev/sdd3       3,7T  3,0T  693G  82% /home

I am new to this whole Linux way of doing things, so if I need to include something else to help you guys help me, please ask.

Try using the ext4 filesystem instead of the default Btrfs.


You can install and use atop to see just how busy your disks are. It defaults to a wall of text, which tells you much of what's going on, but for an overview hit B.

For a look at what is causing this IO you can use iostat (sudo iostat -ao seems to be most useful for this). That will show you which processes are doing what to which filesystems.

Get them installed, as I don’t believe they are installed by default, and see if anything leaps out at you as being particularly egregious. I understand you may be unfamiliar with them but if you want to know what kwin is for example, just shout.
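On Fedora installing them should be a one-liner, something like this (the iotop package may be named iotop or iotop-c depending on the release):

$ sudo dnf install atop iotop-c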

I can post a snapshot of my system, which is entirely NVMe, for comparison purposes if you wish - not that it will improve your situation, beyond giving you an idea of whether the upgrade from spinning rust to solid state is worth it (it is).


Any SSD is going to be great; even the oldest, slowest, smallest one you can get is a worthwhile investment - for any OS.

As Leigh said, ext4 may work much better than Btrfs. You have to reinstall to change it; you will find the option in the installer.


@theprogram @leigh123linux

I thought of that before installing, but the benchmarks I saw did not show differences as large as what I am getting. I went with Btrfs because it is the default and it seems to be better against bit rot.

I thought / was on ext4 and /home was on Btrfs, but looking more closely now I see that only /boot is ext4.

I will keep trying to configure things, but if I only hit dead ends, then on the next install (possibly on an NVMe drive, though I am a bit unsure about its long-term durability) I will try putting the system on ext4 and keeping /home on Btrfs.

Thanks for the input.

@anothermindbomb

Thanks. I will give them a spin when I get home.

NVMe drives are really good. I use only them these days; no problems after 3 years, and I would expect 10+ years unless they are in some kind of extreme server environment.

I have SSDs over 10 years old.


NVMe drives are very good for typical user workloads, but they do not last in workloads that repeatedly fill the drive with new content. On servers that happens with things like image processing, where new data is processed, then moved to long-term storage and replaced with a new batch. On workstations, distro hopping or just reinstalling every time Windows screws up wears out SSDs. I use SSDs for workstations and spinning disks for backups.


@anothermindbomb

atop disk usage varies between 0 and 33% when “idle” and from 70% to 100% when opening a program.

I could not find the information you asked for with iostat. The options you indicated were not available. The default and extended (-x) outputs were:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          62,77    0,00    2,13    0,75    0,00   34,35

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
sda               0,01         0,55         0,00         0,00      79256          1          0
sdb               0,00         0,06         0,00         0,00       9024          0          0
sdc               0,00         0,06         0,00         0,00       9121          0          0
sdd               7,50       389,71       583,05         0,00   55860960   83575377          0
zram0             0,00         0,02         0,00         0,00       3272         80          0

Device            r/s     rkB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wkB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dkB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sdd              1,74    389,03     0,12   6,28   18,10   223,36    5,79    582,84     1,36  19,07   16,84   100,69    0,00      0,00     0,00   0,00    0,00     0,00    0,25   48,47    0,14   3,01

Typo on my part - I meant iotop -oa.

I had iostat on my mind as I also use it and muscle memory took over, but iotop should look something like this:

  Total DISK READ:    0.00 B   ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ |   Total DISK WRITE:  280.00 K   ⣿⣿⣤⣤⣤⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
Current DISK READ:    0.00 B/s ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ | Current DISK WRITE:    0.00 B/s ⠀⢀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
TID     PRIO USER      DISK READ   DISK WRITE  GRAPH[R+W]▽                                                  COMMAND                                                                                                                                                                                       [T](18:28:51)
  660   be/4 root         0.00 B     60.00 K   ⠀⠀⠀⠀⠀⢰⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀  systemd-journald                                                                                                                                                                                         ▲
14840   be/4 root         0.00 B     32.00 K   ⠀⠀⠀⠀⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀  kworker/u65:50-flush-btrfs-1                                                                                                                                                                             █
18869   be/4 root         0.00 B     16.00 K   ⠀⢀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀  kworker/u65:38-flush-btrfs-1                                                                                                                                                                             █
 1560   be/4 root         0.00 B     16.00 K   ⠀⠀⠀⠀⠀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ╭rsyslogd                                                                                                                                                                                                 █
 1565   be/4 root         0.00 B      4.00 K   ⠀⠀⠀⠀⠀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ╰rsyslogd                                                                                                                                                                                                 █
 3110   be/4 steve        0.00 B    100.00 K   ⠀⢸⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ╭zen                                                                                                                                                                                                      █
 3034   be/4 steve        0.00 B     32.00 K   ⠀⢠⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ┊zen                                                                                                                                                                                                      █
 3006   be/4 steve        0.00 B     20.00 K   ⠀⠀⠀⠀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ╰zen

You'll see that the graph in the middle is a rolling display of what used what, and when. It will show you which specific threads in which programs were causing that load on the disk, which you've already ascertained ticks over at (say) 15% and spikes to flat out when you load a program, as you'd expect. With SSD or NVMe drives that spike is very brief: the same work is just compressed into a smaller window, meaning other stuff gets a chance to hog the drive for an equally small time slice.

If you can afford to, I'd treat yourself and drop in an NVMe drive (if you have the slots) or an SSD - I'm fairly confident you won't regret the outlay.


I fully zero/discard my NVMes twice before OS reinstalls, probably 1-2 times a week, and have for years :stuck_out_tongue: I'm thinking the average user isn't doing anything that intensive on SSDs long-term.


Even the worst SSDs are rated for something like 500-1,000 program/erase cycles per cell.

I did have a 1 TB Samsung 970 Evo die a death at about 2 years of age (and about 60 hours of use).

I was using it in Linux at the time it died - nothing complained, things just ground to a halt: no response when opening programs, web pages timing out, just general weird behaviour. I assumed it was a kernel issue, rebooted, and the drive was never seen again by the BIOS. Out of warranty too, much to my irritation.

I could understand it if the drive had been hammered or was a cheap no-name of dubious provenance, but it was primarily plugged in and not used, as it was an alternative drive with Qubes installed, which I rarely booted. As such, it was two years old but had barely been used.

Replaced it with a 2TB Samsung which is in use as I type… Going back to spinning rust would be inconceivable.

I have an SSD I use for Fedora. Purchased in October 2023, installed Fedora 38.
Model Family: Crucial/Micron Client SSDs
Device Model: CT500MX500SSD1

After 2 years, 74% lifetime remaining. Usage: mostly browsing the net. Browser cache is in tmpfs. File indexing disabled. ext4 instead of Btrfs for the last 2 months.

$ sudo smartctl --all /dev/sda |grep Percent_Lifetime_Remain
202 Percent_Lifetime_Remain 0x0030   074   074   001    Old_age   Offline      -       26

$ sudo hdparm -t --direct /dev/sda

/dev/sda:
 Timing O_DIRECT disk reads: 1274 MB in  3.00 seconds = 424.56 MB/sec

I also have an nvme drive for Debian. It could be twice as fast, but not with that old motherboard.

Disk model: CT500P3SSD8

$ sudo hdparm -t --direct /dev/nvme0n1

/dev/nvme0n1:
 Timing O_DIRECT disk reads: 1638 MB in  3.00 seconds = 546.00 MB/sec

ST8000DM004
That drive uses SMR (Shingled Magnetic Recording) technology, and its write performance will degrade greatly over a very short time due to the way data is written and read.

ST14000NM001G
This drive is an enterprise drive and, though I could not locate a datasheet showing the recording technology used, I would suspect it is CMR (Conventional Magnetic Recording), which does not degrade in this way.

ST4000DM000
This drive is a Barracuda-family drive (just like the ST8000DM004) and, though I could not find a datasheet for it, my experience with Seagate is that almost all Barracuda drives use SMR recording, with the results noted above.

The difference is that with SMR the very first write to an area is fine. The shingling means that adjacent tracks overlap like roof shingles, so rewriting data later requires reading the overlapping neighbouring tracks, writing the new data, and then rewriting those neighbours. This cascades across the overlapped tracks in each shingled zone. Write performance thus degrades enormously once the drive gets to more than about 1/3 of capacity (or sooner).

CMR (Conventional Magnetic Recording), on the other hand, writes each track with no overlap of its neighbours, so rewrites carry no such penalty.

In your case, if you installed Fedora on one of the drives that use SMR (sdd, which appears to be the ST4000DM000) and used Btrfs, then the fact that Btrfs uses CoW (copy on write) for every write means drive performance will slow down drastically and quickly, since the file system does a large amount of writing. Your reported usage of that drive's partitions for / and /home is already at 82%, which makes it even worse at present.
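If you want to double-check which physical drive is behind each sdX name before deciding, something like this will show it (lsblk ships with util-linux and is installed by default):

$ lsblk -o NAME,MODEL,SIZE,ROTA,MOUNTPOINTS   # ROTA=1 means a rotational (spinning) drive
$ sudo smartctl -i /dev/sdd                   # full identify info for a single drive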

My suggestion, in order to keep using the HDDs you already have, would be to do one of two things: either install Fedora on the ST14000NM001G drive to avoid the SMR issues, or, as you already considered, get an SSD (preferably an M.2 NVMe drive) and install Fedora on it. Most modern SSDs have a proven history of reliability, unlike the earlier generations that did not have the tech improvements in place today.

The one I use on my daily driver (though there are many choices) is

Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: CT2000P5PSSD8                           
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

@anothermindbomb

Nothing egregious comes up - just the normal processes for the programs/KDE/Wayland as far as I can tell. There are constant spurts of btrfs-transaction even when idle, which I suspect are normal, but I do not really know.

TID     PRIO USER      DISK READ   DISK WRITE  GRAPH[R+W]▽   COMMAND      
433     be/4 root      5,25 M      48,05 M   ⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀btrfs-transaction                            
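(For completeness, one way to check whether any scheduled Btrfs maintenance, such as scrub or balance timers, is behind that idle activity, assuming systemd timers are used; the output may simply be empty:)

$ systemctl list-timers --all | grep -i btrfs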

@computersavvy

That was very in-depth. Thanks a lot.

So if I understood correctly, the combination of SMR and Btrfs is not a very good one.
I think I will buy an NVMe PCIe adapter (all SATA ports are occupied) and an NVMe drive to host the system.
Do you think it is advisable to keep those drives (as data drives, not system drives) on Btrfs, or should I change them to another FS? I am mainly concerned with durability now. Does Btrfs negatively impact drive lifespan?

Not as Btrfs, in my opinion.
Those drives with SMR tech may work well as data storage (mostly reads), but they do not work well with frequent writing. They work reasonably well with an ext4 file system, which does not do a copy-on-write for every write to the drive.
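As a rough sketch of what reformatting one of them for data use would look like later (assuming a hypothetical /dev/sdb1, and that everything on it has already been copied elsewhere, since mkfs destroys the contents):

$ sudo mkfs.ext4 -L data1 /dev/sdb1    # WARNING: erases the partition
$ sudo blkid /dev/sdb1                 # note the new UUID
$ sudo mkdir -p /mnt/data1
# then add a line like this to /etc/fstab:
# UUID=<uuid-from-blkid>  /mnt/data1  ext4  defaults,noatime  0 2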


Thanks, Jeff, for actually giving some context to the ext4 recommendation. To elaborate: the issue isn't that Btrfs is a bad filesystem (though there are threads here where people claim that), it's that Btrfs is a CoW (copy on write) filesystem. This means that changes to a file require writing the modified blocks to a new location instead of overwriting them in place. Mechanical hard disks aren't suited to that kind of workload; the constant seeking severely hurts performance. And this is made even worse with SMR.
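If the data drives do stay on Btrfs, one partial mitigation is that CoW can be switched off per directory for heavy-rewrite data (it only affects files created afterwards and also disables checksumming for them), for example:

$ mkdir ~/vm-images
$ chattr +C ~/vm-images    # new files created in here will be NoCOW
$ lsattr -d ~/vm-images    # should now show the 'C' attribute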


If your motherboard has an M.2 PCIe slot this would not be required.

Another option, if you have no M.2 PCIe slot, would be to get an NVMe SSD in a USB enclosure and boot from that USB device.

If you use a PCIe adapter card with an M.2 slot, it is also possible that the NVMe device will not be bootable, since that depends on whether the motherboard firmware supports booting from an NVMe drive behind the adapter.
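If you do go the adapter route, one way to check after installing is whether the firmware actually registered a boot entry for the NVMe device (assuming a UEFI system, which your /boot/efi partition suggests):

$ efibootmgr -v    # lists UEFI boot entries and the devices they point to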

It may be better to use a PCIe SATA card and move some of the SATA devices to that card, rather than putting the NVMe drive on an adapter.

My motherboard has 6 SATA ports and 2 M.2 PCIe slots, plus one M.2 SATA slot (which would take up 2 of the 6 SATA ports and remains unused).