This has happened very frequently on both ext4 & btrfs.
EXT4-fs error (device nvme0n1p4): __ext4_find_entry:1524: inode #130563: comm thermald: reading directory lblock 0
Maybe the NVMe drive is going bad. Does the following command show anything interesting?
$ sudo smartctl -x /dev/nvme0n1
The drive doesn’t appear to be going bad. Here is the output from smartctl:
=== START OF INFORMATION SECTION ===
Model Number: Samsung SSD 980 PRO 500GB
Serial Number: S5NYNG0NA02654A
Firmware Version: 1B2QGXA7
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 500,107,862,016 [500 GB]
Unallocated NVM Capacity: 0
Controller ID: 6
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 500,107,862,016 [500 GB]
Namespace 1 Utilization: 70,020,763,648 [70.0 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 002538 ba01501988
Local Time is: Mon May 10 15:30:21 2021 EDT
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0057): Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f): S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size: 128 Pages
Warning Comp. Temp. Threshold: 82 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Supported Power States
St Op      Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     8.49W        -        -    0  0  0  0        0       0
 1 +     4.48W        -        -    1  1  1  1        0     200
 2 +     3.18W        -        -    2  2  2  2        0    1000
 3 -   0.0400W        -        -    3  3  3  3     2000    1200
 4 -   0.0050W        -        -    4  4  4  4      500    9500
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 33 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 676,804 [346 GB]
Data Units Written: 1,384,027 [708 GB]
Host Read Commands: 5,720,123
Host Write Commands: 6,824,250
Controller Busy Time: 19
Power Cycles: 352
Power On Hours: 17
Unsafe Shutdowns: 143
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 33 Celsius
Temperature Sensor 2: 38 Celsius
Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged
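For anyone skimming a report this long, here is a minimal sketch that filters it down to the health counters that usually matter when a drive is suspected. The function name summarize_smart is made up for illustration, and the list of fields is only my assumption about which ones are most relevant. It does nothing beyond grep on the text smartctl prints.

```shell
# Hypothetical helper (not part of smartmontools): reads `smartctl -x`
# text on stdin and prints only a handful of health counters.
# The field list is an assumption about what is worth checking first.
summarize_smart() {
    grep -E '^(Critical Warning|Available Spare|Percentage Used|Power Cycles|Unsafe Shutdowns|Media and Data Integrity Errors|Error Information Log Entries):'
}

# Usage (needs root to query the device):
#   sudo smartctl -x /dev/nvme0n1 | summarize_smart
```

Applied to the output above, this would leave the lines for Critical Warning (0x00), Available Spare (100%), Percentage Used (0%), Power Cycles (352), Unsafe Shutdowns (143), Media and Data Integrity Errors (0), and Error Information Log Entries (0).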
Another possibility is that some software is accessing the device directly, bypassing the file system drivers, and corrupting the file system. Do you know of anything that might be accessing /dev/nvme0n1 or /dev/nvme0n1p4 directly? It is extremely rare, but some boot loaders can write directly to the drive and cause file system corruption. I’ve seen this with grub and zfs, but I don’t think I’ve ever heard of it happening with btrfs. If you have a dual-boot setup and you are using a non-Fedora bootloader, that is a remote possibility. Other than that, the only thing I can think of is bad hardware; maybe a bad cable could do it. I’m not sure.
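If you want to check whether any running process has the raw device node open, one rough way on Linux is to scan the /proc/<pid>/fd symlinks. This is only a sketch: find_openers is a made-up helper name, it needs root to see other users' processes, and it only catches processes that have the device open at the moment it runs. It would not catch a boot loader writing before the system is up.

```shell
# Hypothetical helper: print the PIDs that currently hold the given
# path open, by scanning the /proc/<pid>/fd symlinks (Linux only).
# Run as root to see processes owned by other users.
find_openers() {
    target=$(readlink -f "$1") || return 1
    for fd in /proc/[0-9]*/fd/*; do
        # Each fd entry is a symlink to the file the process has open.
        if [ "$(readlink "$fd" 2>/dev/null)" = "$target" ]; then
            pid=${fd#/proc/}
            echo "${pid%%/*}"
        fi
    done | sort -un
}

# Usage (as root):
#   find_openers /dev/nvme0n1
#   find_openers /dev/nvme0n1p4
```

An empty result just means nothing had the device node open at that instant. A mounted file system does not show up here, because the kernel's use of the partition is not a process file descriptor.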
I appreciate your ideas. The only software I’ve installed has been PuTTY and lftp. I started with Fedora 33/GNOME and recently upgraded to 34. I am thinking of swapping SSDs, maybe going down to a smaller 256 GB drive.