I am having a hard btrfs failure. The symptoms started when my laptop locked up for about 12 hours while downloading, compiling, and juggling a ton of Chromium/Firefox tabs.
In situations like this I usually just hit hard reset and continue on my way. This usually works, so I tried it. However, on reboot the system dropped me to the emergency console, saying that the btrfs partition with everything on it could not be mounted.
I have a Fedora 40 KDE thumb drive handy, and I get the following set of errors when running almost any btrfs command:
liveuser@localhost-live:~$ sudo mount /dev/nvme0n1p3 /mnt/orig
mount: /mnt/orig: can't read superblock on /dev/nvme0n1p3.
dmesg(1) may have more information after failed mount system call.
[23013.297352] BTRFS: device label fedora devid 1 transid 420465 /dev/nvme0n1p3 scanned by pool-udisksd (14419)
[23013.301458] BTRFS info (device nvme0n1p3): first mount of filesystem e40e2cfb-83f5-48ff-a481-bf9f0cc22543
[23013.301475] BTRFS info (device nvme0n1p3): using crc32c (crc32c-intel) checksum algorithm
[23013.301479] BTRFS info (device nvme0n1p3): using free-space-tree
[23013.303686] BTRFS error (device nvme0n1p3): bad tree block start, mirror 1 want 823396630528 have 0
[23013.303789] BTRFS error (device nvme0n1p3): bad tree block start, mirror 2 want 823396630528 have 0
[23013.303800] BTRFS warning (device nvme0n1p3): couldn't read tree root
[23013.304150] BTRFS error (device nvme0n1p3): open_ctree failed
[27001.727157] BTRFS: device label fedora devid 1 transid 420465 /dev/nvme0n1p3 scanned by mount (16522)
[27001.728732] BTRFS info (device nvme0n1p3): first mount of filesystem e40e2cfb-83f5-48ff-a481-bf9f0cc22543
[27001.728745] BTRFS info (device nvme0n1p3): using crc32c (crc32c-intel) checksum algorithm
[27001.728748] BTRFS info (device nvme0n1p3): using free-space-tree
[27001.729786] BTRFS error (device nvme0n1p3): bad tree block start, mirror 1 want 823396630528 have 0
[27001.729887] BTRFS error (device nvme0n1p3): bad tree block start, mirror 2 want 823396630528 have 0
[27001.729891] BTRFS warning (device nvme0n1p3): couldn't read tree root
[27001.730126] BTRFS error (device nvme0n1p3): open_ctree failed
checksum verify failed on 823396630528 wanted 0x02000000 found 0xabe960d0
checksum verify failed on 823396630528 wanted 0x00000000 found 0x8b095422
checksum verify failed on 823396630528 wanted 0x02000000 found 0xabe960d0
bad tree block 823396630528, bytenr mismatch, want=823396630528, have=0
Couldn't read tree root
ERROR: cannot open file system
I spent a good day looking for commands to diagnose and fix the issue, to no avail.
Below is a series of btrfs commands whose output does not seem to give me anything I can use:
checksum verify failed on 823396630528 wanted 0x02000000 found 0xabe960d0
checksum verify failed on 823396630528 wanted 0x00000000 found 0x8b095422
Couldn't read tree root
ERROR: could not open ctree
liveuser@localhost-live:~$ sudo btrfs rescue super-recover /dev/nvme0n1p3
All supers are valid, no need to recover
liveuser@localhost-live:~$ sudo btrfs inspect-internal rootid /dev/nvme0n1p3
ERROR: not a btrfs filesystem: /dev/nvme0n1p3
liveuser@localhost-live:~$ sudo btrfs-find-root /dev/nvme0n1p3
Couldn't read tree root
Superblock thinks the generation is 420465
Superblock thinks the level is 0
Well block 822913368064(gen: 419921 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 822850994176(gen: 419920 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 822769500160(gen: 419917 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 822661070848(gen: 419903 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 822652043264(gen: 419902 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 822608560128(gen: 419901 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 822536175616(gen: 419890 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 822557655040(gen: 419889 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 335032057856(gen: 419888 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334896906240(gen: 419873 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334756773888(gen: 419858 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334651310080(gen: 419853 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334631600128(gen: 419852 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334524530688(gen: 419839 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334506164224(gen: 419838 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334489812992(gen: 419837 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334387347456(gen: 419826 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334370078720(gen: 419825 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334333640704(gen: 419824 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334192967680(gen: 419821 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334104657920(gen: 419818 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334067269632(gen: 419817 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
<< snip 190 more entries>>
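To make the list above easier to scan, here is a minimal Python sketch that pulls the candidate blocks out of the btrfs-find-root output and sorts them newest generation first. The helper name `parse_candidates` and the `SAMPLE` text are my own, not part of btrfs-progs, and the sample only reuses three lines from the output above. My assumption (unverified on this filesystem) is that the newest candidate, here gen 419921, which is 544 generations behind the superblock's 420465, would be the first one worth handing to a tool that accepts an alternate tree root bytenr, such as `btrfs restore -t`.

```python
import re

# Matches lines such as:
# "Well block 822913368064(gen: 419921 level: 0) seems good, ..."
CANDIDATE_RE = re.compile(r"Well block (\d+)\(gen: (\d+) level: (\d+)\)")


def parse_candidates(text):
    """Return (generation, bytenr, level) tuples, newest generation first."""
    found = []
    for line in text.splitlines():
        match = CANDIDATE_RE.search(line)
        if match:
            bytenr, gen, level = (int(g) for g in match.groups())
            found.append((gen, bytenr, level))
    return sorted(found, reverse=True)


# Three lines copied from the btrfs-find-root output above (deliberately unsorted).
SAMPLE = """\
Superblock thinks the generation is 420465
Well block 822850994176(gen: 419920 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 822913368064(gen: 419921 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
Well block 334067269632(gen: 419817 level: 0) seems good, but generation/level doesn't match, want gen: 420465 level: 0
"""

if __name__ == "__main__":
    for gen, bytenr, level in parse_candidates(SAMPLE):
        print(f"gen={gen} level={level} bytenr={bytenr}")
```

In practice the full output would be saved to a file and passed to `parse_candidates` instead of `SAMPLE`.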
liveuser@localhost-live:~$ sudo btrfs check /dev/nvme0n1p3
Opening filesystem to check...
checksum verify failed on 823396630528 wanted 0x02000000 found 0xabe960d0
checksum verify failed on 823396630528 wanted 0x00000000 found 0x8b095422
checksum verify failed on 823396630528 wanted 0x02000000 found 0xabe960d0
bad tree block 823396630528, bytenr mismatch, want=823396630528, have=0
Couldn't read tree root
ERROR: cannot open file system
Whenever I try any recovery step, I only get the same set of “Couldn’t read tree root” errors shown above.
Checking the NVMe drive for hardware failure shows nothing:
liveuser@localhost-live:~$ sudo smartctl -x /dev/nvme0
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.5-301.fc40.x86_64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: SOLIDIGM SSDPFKKW020X7
Serial Number: SSC6N492010506M4K
Firmware Version: 001C
PCI Vendor/Subsystem ID: 0x025e
IEEE OUI Identifier: 0xace42e
Controller ID: 0
NVMe Version: 1.4
Number of Namespaces: 1
Namespace 1 Size/Capacity: 2,048,408,248,320 [2.04 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: aca32f 036500880f
Local Time is: Wed May 1 16:23:33 2024 EDT
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x00df): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp Verify
Log Page Attributes (0x1e): Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
Maximum Data Transfer Size: 64 Pages
Warning Comp. Temp. Threshold: 86 Celsius
Critical Comp. Temp. Threshold: 87 Celsius
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 7.50W - - 0 0 0 0 5 305
1 + 3.9000W - - 1 1 1 1 30 330
2 + 1.5000W - - 2 2 2 2 100 400
3 - 0.0500W - - 3 3 3 3 500 1500
4 - 0.0050W - - 4 4 4 4 1000 9000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 42 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 385,133,057 [197 TB]
Data Units Written: 20,969,617 [10.7 TB]
Host Read Commands: 2,179,118,293
Host Write Commands: 434,322,982
Controller Busy Time: 30,833
Power Cycles: 32
Power On Hours: 3,786
Unsafe Shutdowns: 18
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 36 Celsius
Temperature Sensor 2: 37 Celsius
Thermal Temp. 1 Transition Count: 5
Thermal Temp. 1 Total Time: 508
Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged
Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
No Self-tests Logged
I do not have a backup of the critical files (about 10 GB) on this system. There is also about 1 TB of data I would like to keep in place. Is there a way to rebuild the missing file tree? Can I copy (part of) the file system off the drive?