Kernel 6.10.9 Causes System to Boot to Read-Only Mode for BTRFS

I’m not super sure where to begin here, but my system (running Fedora 40 KDE Spin) took the update to kernel version 6.10.9 via Discover a few days ago, and I haven’t been able to get it to run successfully. Booting back to 6.10.8 works perfectly fine, so I’ve just been using that. After reinstalling the 6.10.9 kernel, I was able to make it to the desktop without any (visible) errors, but the filesystem is in read-only mode, so pretty much everything is broken and/or completely unusable.

Because the filesystem is in read-only, I can’t save any logs or anything, but I was able to run dmesg, take a photo of the output, and extract the text with my phone. So apologies if this is a bit messy, but here’s what appears to be the relevant stuff:

[ 36.299250] memcg:ffff8d618094e000
[ 36.299253] aops:btree_aops ino:1
[ 36.299259] flags: 0x17ffffc0004000(private|node=0|zone=2|lastcpupid=0x1fffff)
[ 36.299265] raw: 0017ffffc0004000 0000000000000000 dead000000000122 ffff8d61940f9458
[ 36.299268] raw: 0000000001845098 ffff8d62347082d0 00000004ffffffff ffff8d618094e000
[ 36.299270] page dumped because: eb page dump
[ 36.299273] BTRFS critical (device nvme0n1p6): corrupt leaf: block=104237465600 slot=179 extent bytenr=7010725888 len=4096 invalid data ref objectid value 45883135623424
[ 36.299281] BTRFS error (device nvme0n1p6): read time tree block corruption detected on logical 104237465600 mirror 1
[ 36.299465] page: refcount:4 mapcount:0 mapping:00000000130170d2 index:0x1845098 pfn:0x2981fb
[ 36.299469] memcg:ffff8d618094e000
[ 36.299471] aops:btree_aops ino:1
[ 36.299474] flags: 0x17ffffc0004000(private|node=0|zone=2|lastcpupid=0x1fffff)
[ 36.299478] raw: 0017ffffc0004000 0000000000000000 dead000000000122 ffff8d61940f9458
[ 36.299481] raw: 0000000001845098 ffff8d62347082d0 00000004ffffffff ffff8d618094e000
[ 36.299482] page dumped because: eb page dump
[ 36.299484] BTRFS critical (device nvme0n1p6): corrupt leaf: block=104237465600 slot=179 extent bytenr=7010725888 len=4096 invalid data ref objectid value 45883135623424
[ 36.299491] BTRFS error (device nvme0n1p6): read time tree block corruption detected on logical 104237465600 mirror 2
[ 36.299504] BTRFS error (device nvme0n1p6 state A): Transaction aborted (error -5)
[ 36.299507] BTRFS: error (device nvme0n1p6 state A) in __btrfs_free_extent:3219: errno=-5 IO failure
[ 36.299512] BTRFS info (device nvme0n1p6 state EA): forced readonly
[ 36.299515] BTRFS error (device nvme0n1p6 state EA): failed to run delayed ref for logical 7010590720 num_bytes 4096 type 178 action 2 ref_mod 1: -5
[ 36.299520] BTRFS: error (device nvme0n1p6 state EA) in btrfs_run_delayed_refs:2207: errno=-5 IO failure
[ 36.308722] systemd-journald[797]: /var/log/journal/86edf9fd07424419be7946a5cb27feca/system.journal: Journal file corrupted, rotating.
[ 36.308749] systemd-journald[797]: Failed to rotate /var/log/journal/86edf9fd07424419be7946a5cb27feca/system.journal: Read-only file system
[ 36.308760] systemd-journald[797]: Failed to rotate /var/log/journal/86edf9fd07424419be7946a5cb27feca/user-1000.journal: Read-only file system
[ 36.313886] systemd-journald[797]: Failed to write entry to /var/log/journal/86edf9fd07424419be7946a5cb27feca/system.journal (10 items, 339 bytes) despite vacuuming, ignoring: Bad message
[ 36.313917] systemd-journald[797]: /var/log/journal/86edf9fd07424419be7946a5cb27feca/system.journal: Journal file corrupted, rotating.
[ 36.313930] systemd-journald[797]: Failed to rotate /var/log/journal/86edf9fd07424419be7946a5cb27feca/system.journal: Read-only file system
[ 36.317197] systemd-journald[797]: /var/log/journal/86edf9fd07424419be7946a5cb27feca/system.journal: Journal file corrupted, rotating.
[ 38.681742] systemd-journald[797]: Failed to write entry to /var/log/journal/86edf9fd07424419be7946a5cb27feca/system.journal (28 items, 767 bytes) despite vacuuming, ignoring: Bad message (Dropped 14 similar message(s))
[ 39.730674] systemd-journald[797]: Failed to write entry to /var/log/journal/86edf9fd07424419be7946a5cb27feca/system.journal (21 items, 608 bytes) despite vacuuming, ignoring: Bad message (Dropped 7 similar message(s))
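
For anyone comparing this against their own logs, here is a minimal sketch for pulling just the Btrfs messages out of a kernel log read on standard input. The function name btrfs_only is made up for illustration, and the pattern only assumes the message prefixes that show up in this dmesg output (BTRFS critical, BTRFS error, BTRFS: error, BTRFS info); other kernels may word things differently.

```shell
# Hypothetical helper: keep only Btrfs kernel messages from a log on stdin.
# The pattern is an assumption based on the prefixes seen in this thread;
# the optional colon covers the "BTRFS: error (...)" variant.
btrfs_only() {
    grep -E 'BTRFS:? (critical|error|warning|info) '
}

# Example use on a running system:
#   sudo dmesg | btrfs_only
```

On a read-only root the filtered output is short enough to photograph, or it can be redirected to a file on some other writable filesystem such as a USB stick.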

The errors seem to point to some kind of issue with the drive, but since this ONLY happens with this kernel version, and I can boot back into 6.10.8 where everything works perfectly fine (I’m typing this on it now), I am very skeptical of that being the case.

I would continue to run 6.10.8 but I know that’s not a solution forever, and I know 6.10.10 is in testing right now, but I’m also concerned this problem isn’t going to just go away with new kernel versions.

So, at this point, I’m completely stumped. I can’t seem to find anyone having the same issue online. Any ideas? Any help would be greatly appreciated.

I suggest running sudo btrfsck --readonly from the working kernel boot to see whether it finds any issues.

I have the same issue. Also, just booting into 6.10.9 caused filesystem corruption which I was fortunately able to fix after booting with an older kernel. Currently I’m on 6.10.7, no problem so far.

@chrismurphy Do you have a moment to comment on this topic?

While booted into the working kernel, right?

stadsport@fedora:~$ sudo btrfsck --readonly
[sudo] password for stadsport: 
btrfs check: exactly 1 argument expected, 0 given

Oof, good to know. I think I’ll stop booting into 6.10.9 to prevent the same from happening. Been running happily on 6.10.8, I’m just more concerned that the issue won’t be resolved by 6.10.10 or 6.11 if no one else is reporting it.

Use lsblk -f to find out which device the Btrfs file system is on.
Then you can issue the command like this example (my disk was /dev/sda3):

$ sudo btrfsck --readonly --force /dev/sda3
Opening filesystem to check...
WARNING: filesystem mounted, continuing because of --force
Checking filesystem on /dev/sda3
UUID: 93c96a90-cc18-4b79-a2cf-72cd79f970bf
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 8784674816 bytes used, no error found
total csum bytes: 7902772
total tree bytes: 391413760
total fs tree bytes: 352927744
total extent tree bytes: 25264128
btree space waste bytes: 80887396
file data blocks allocated: 11267547136
 referenced 14447071232
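
If lsblk prints a long list, a small filter can narrow it down to the Btrfs partitions. This is only a sketch: btrfs_devices is a hypothetical helper name, and it assumes two-column input (device path, then filesystem type) of the kind produced by lsblk -l -n -o PATH,FSTYPE.

```shell
# Hypothetical helper: read "PATH FSTYPE" pairs on stdin and print the
# paths whose filesystem type is btrfs. Lines without a type are skipped.
btrfs_devices() {
    awk '$2 == "btrfs" { print $1 }'
}

# Example use (assumes an lsblk that supports the PATH column):
#   lsblk -l -n -o PATH,FSTYPE | btrfs_devices
```

The path it prints is what goes at the end of the btrfsck command line.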

Thank you, here is the output from this check:

stadsport@fedora:~$ sudo btrfsck --readonly --force /dev/nvme0n1p6
Opening filesystem to check...
WARNING: filesystem mounted, continuing because of --force
Checking filesystem on /dev/nvme0n1p6
UUID: 1ebf9e47-73a9-4138-b305-ee9f8f395cf3
[1/7] checking root items
[2/7] checking extents
data extent[7010725888, 4096] referencer count mismatch (root 256 owner 1664566 offset 1110016) wanted 0 have 1
data extent[7010725888, 4096] bytenr mimsmatch, extent item bytenr 7010725888 file item bytenr 0
data extent[7010725888, 4096] referencer count mismatch (root 45883135623424 owner 0 offset 1110016) wanted 1 have 0
backpointer mismatch on [7010725888 4096]
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space tree
[4/7] checking fs roots
        unresolved ref dir 12447 index 130 namelen 40 name 7F8B43944C8834393A41C49938D21738EB36BFD0 filetype 1 errors 40, index mismatch
ERROR: errors found in fs roots
found 517543223296 bytes used, error(s) found
total csum bytes: 500910964
total tree bytes: 3688448000
total fs tree bytes: 2749153280
total extent tree bytes: 340705280
btree space waste bytes: 679314322
file data blocks allocated: 6814367195136
 referenced 598742298624

You have errors which may explain why the file system goes read-only.

I’m not an expert on btrfs and note that btrfsck says that the --repair option can be dangerous. You need advice from an expert, hopefully @chrismurphy may be able to help you.

I’ll update the title to include btrfs in it.

Added btrfs

Let me complete my thought. I think the new kernel has better detection of errors in the file system, and that is why the old kernel does not get upset.

That makes sense, and to be honest I was completely ready to accept that there were filesystem errors, I just couldn’t find any explanation as to why it was only happening under a certain kernel version, and that would explain it. It also likely confirms my suspicions that it’s not a problem likely to go away with later versions.

I’m also hesitant to run btrfsck --repair, as I’ve read the same warning, so I may wait for Chris to respond. If nothing else, I would imagine the safest method would be to boot from a live USB and run it from there, with the filesystem unmounted.

Fixing while it’s mounted is a no-no!

It will almost certainly kill the file system.
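
One cautious way to act on that is to check whether the partition is mounted before running anything that writes to it. The sketch below is not part of btrfs-progs: is_mounted is a hypothetical helper that looks for the device in the first column of /proc/mounts. The optional second argument (a different mounts file) is only there so the function can be tried out safely. It will not catch every case, for example a multi-device file system that is mounted through a different member device.

```shell
# Hypothetical helper: exit 0 if the device appears as a mount source in
# the mounts table (default /proc/mounts), exit 1 otherwise.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example use before a repair attempt (device name taken from this thread):
#   if is_mounted /dev/nvme0n1p6; then
#       echo "still mounted, do not run btrfs check --repair" >&2
#   fi
```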

6.10.9 does include the patch "btrfs: tree-checker: validate dref root and objectid" (commit fbaafe4c8f7904afa7fe413d2f0f465c5b878aea).

So that’s why this is being caught by 6.10.9 and not 6.10.8 or older.

Checking with upstream about the btrfs check.

File system is mounted so I’m not entirely certain the btrfs check is reliable.

Can you either boot with the parameter

rd.break=pre-mount

and then at the prompt

btrfs check /dev/nvme0n1p6

(Or boot from a Fedora Live USB stick, and run that command.)

Yup, understood. That’s why I said that if anything I would do it from a live USB with the fs unmounted. I’ll try running btrfs check on it from a Live USB, but it’ll probably have to be tonight, as this machine is my primary work computer 🙂

Thank you both for your help with this, will report back with what I find

I suggest freshening backups while you can, and limiting how long you run this file system read-write.

The purpose of the read time tree checker is to force the file system read only to prevent further confusion from making it to disk. Continuing to run the file system read write risks the problem getting worse. I don’t know how fast that will happen.

Also if you reboot with rd.break=pre-mount boot parameter (edit the boot entry in GRUB), this will permit you to run the check with the Btrfs file system unmounted. But you’ll need to take a cell phone photo of it since you won’t have copy paste.

btrfs check --mode=lowmem /dev/nvme0n1p6

The lowmem mode runs through some different code for the check (it isn’t only for lowmem situations) and is slower, but might give more relevant results. So use that and take a photo and put that up somewhere, if you can.

This is what I did after I realized there is a problem. The exact steps were:

  1. Booting into 6.10.9, the filesystem goes into readonly after a few minutes. Panic…
  2. Booting into 6.10.7 (this was the latest working kernel on my system) and running btrfs check --readonly on the mounted fs, it showed the same type of problems you shared in your other reply (referencer count, bytenr and whatnot)
  3. After a lot of consideration, googling and doing a backup, I booted into a livecd and did btrfs check --repair. Fortunately, it could fix the corruption without causing further damage
  4. Booted into 6.10.7, the system worked without any problems
  5. Booted again into 6.10.9, the fs again changed into readonly mode after a few minutes
  6. Repeated steps 2-3, the system was good again if I used 6.10.7

Sorry for the long post, but this is why I think @barryascott was probably not right when he wrote in the other reply that this might be a sign of fs corruption that is not detected while booted into older kernels: the problem is clearly detected while booted into 6.10.7, gets corrected from the LiveCD (iirc 6.8.something), and reappears when booted into 6.10.9.

BTW, I just updated to 6.10.10 and it seems to be working fine. It has been up for more than an hour now, with no BTRFS-related errors during boot, and I’ve been running dmesg -W since login and it has shown nothing so far.