I booted up today and was dropped directly into the emergency shell.
My device is a Tuxedo Pulse 14 (Ryzen with onboard graphics),
running Fedora 41 with kernel 6.15.4.
I've attached the rdsosreport, but I can't get a clue from it: rdosreport - Pastebin.com
Looking into journalctl the most likely cause is this error:
BTRFS error (device dm-0 state E): open_ctree failed: -5
…
Failed to mount sysroot.mount - sysroot
So I guess I somehow need to fix the partition then? Any suggestions on how to proceed?
This command will clear the filesystem log tree. This may fix a specific set of problems when the filesystem mount fails due to the log replay. See below for sample stack traces that may show up in the system log.
The common case where this happens was fixed a long time ago, so it is unlikely that you will see this particular problem, but the command is kept around.
Note
Clearing the log may lead to loss of changes that were made since the last transaction commit. This may be up to 30 seconds (default commit period) or less if the commit was implied by other filesystem activity.
One can determine whether zero-log is needed according to the kernel backtrace:
If the errors are like above, then zero-log should be used to clear the log and the filesystem may be mounted normally again. The keywords to look for are ‘open_ctree’ which says that it’s during mount and function names that contain replay, recover or log_tree.
On which device should I apply this? How can I determine this from the error message?
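One way to approach this, as a rough sketch: the kernel message names the device in parentheses, and a `dm-N` name is a device-mapper node (for example an unlocked LUKS volume). The helper name `btrfs_dev_from_msg` below is made up for illustration; it only pulls the device name out of a message shaped like the one quoted above.

```shell
# Extract the device name from a kernel message such as
# "BTRFS error (device dm-0 state E): open_ctree failed: -5".
# btrfs_dev_from_msg is a hypothetical helper, not part of btrfs-progs.
btrfs_dev_from_msg() {
    printf '%s\n' "$1" | sed -n 's/.*(device \([^ )]*\).*/\1/p'
}

dev=$(btrfs_dev_from_msg 'BTRFS error (device dm-0 state E): open_ctree failed: -5')
echo "$dev"
```

With the kernel name in hand, `lsblk -o NAME,KNAME,TYPE,FSTYPE,UUID` should show which `/dev/mapper/...` entry and filesystem UUID correspond to it (the KNAME column holds the `dm-N` name); the output depends on the machine.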
btrfs rescue zero-log /dev/disk/by-uuid/d5705694-cd96-4560-a6f3-54cfd2fdc3b3
This seems consistent with how I read the log, but I cannot run it because the resource is busy:
ERROR cannot open device … resource busy
ERROR could not open ctree
So dm-0 is not something the system can work with in this state. I'm now trying to run zero-log from a live system; maybe this will do the trick. Googling suggested that if zero-log errors out in the dracut emergency shell, the problem might be so severe that I can only fix it from another system.
btrfs rescue zero-log “path to device” did the trick.
In my case this did not work from the dracut emergency shell. I created a Fedora Workstation live USB stick, booted into it, unlocked my LUKS device, and executed the command from there.
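For anyone repeating this from a live system, the sequence can be sketched roughly as below. Since zero-log discards data, the sketch only prints the commands for review rather than running them. The function name `zero_log_plan`, the partition `/dev/nvme0n1p3`, and the mapper name `rescue_root` are all placeholders, not values from this thread.

```shell
# Print the rescue commands for review instead of running them blindly.
# $1: the LUKS partition (placeholder), $2: a mapper name of your choice.
zero_log_plan() {
    luks_part=$1
    mapper_name=$2
    echo "cryptsetup open $luks_part $mapper_name"          # unlock the LUKS container
    echo "btrfs rescue zero-log /dev/mapper/$mapper_name"   # clear the log tree (unmounted fs)
    echo "cryptsetup close $mapper_name"                    # lock it again before rebooting
}

zero_log_plan /dev/nvme0n1p3 rescue_root
```

The printed commands need root and must be run while the filesystem is not mounted, which is why the live system works where the emergency shell reported the device as busy.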
Fedora 39 has been EOL for about a year, and AFAIK the 6.15.4 kernel was never released for that version. Are you actually using F39, or are you using F41 or F42 (both of which are still supported and both of which have the 6.15.4 kernel)?
I’m confused. At the top of this discussion it says kernel 6.15.4, but the pastebin shows 6.5.6 which is quite old by Fedora standards. So I’d like to understand if it was maybe 6.15.4 doing the writes and 6.5.6 was used to try to do the recovery?
We need a 6.15 series kernel to do a normal mount, and see the dmesg for that failure (if it fails to mount) and get this to upstream developers.
I understand folks want to help every day ordinary users get up and running again. However the only way we can understand if this is a one off, or a kernel bug, is to get a complete dmesg of the failure to mount. Ideally we’d also get the dmesg at the time the writes were happening…but that is hard in this case because the only reason I can think of why we see a failure to replay the tree log is a) the file system was actively writing, with fsync; b) there was a power failure or a crash at the time.
The existence of the log tree is evidence of both. I’m not sure how else it happens.
We need to gather as much information as possible before zeroing the evidence of the problem. Thanks.
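A minimal sketch of how that information could be collected into one directory before running zero-log. The device path in `DEV`, the output directory in `OUT`, and the helper `save` are assumptions for illustration; each command is allowed to fail so the rest still get captured.

```shell
# Collect diagnostics before touching the log tree.
# DEV and OUT are placeholders; point DEV at the unlocked Btrfs device.
DEV=${DEV:-/dev/mapper/rescue_root}
OUT=${OUT:-./btrfs-report}
mkdir -p "$OUT"

# Run a command, keep stdout and stderr in a file, and continue even if it fails.
save() {
    name=$1; shift
    "$@" > "$OUT/$name.txt" 2>&1 || echo "note: '$*' failed, see $OUT/$name.txt"
}

save uname        uname -r                                   # kernel doing the recovery
save btrfs-progs  btrfs --version                            # userspace tool version
save dmesg        dmesg                                      # kernel log incl. the mount failure
save dump-super   btrfs inspect-internal dump-super "$DEV"   # superblock, read-only
```

Run it as root right after a failed mount attempt so that the dmesg capture contains the failure.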
It can be drive firmware bugs, but there are also Btrfs bugs. The tree log code is complex, and all it's there for is to make fsync faster. Instead of writing a bunch of metadata right away, the metadata writes are delayed for performance, and a smaller amount of writes goes into a subvolume-specific btree. Normally the writes complete and everything calms down: Btrfs writes the full metadata, then deletes the tree log btree. But if there is a crash before this, on the next mount the kernel sees the log tree and replays it. Ordinarily that works, and the metadata is then written to the correct, persistent location.
But if the log is corrupt or didn't get written correctly (bugs), then it can't be replayed, and by default the file system doesn't mount, because skipping the replay does mean losing the last seconds of whatever was writing and fsyncing.
But unlike the journal on journaled file systems, on Btrfs we don't need to replay the log in order to make the file system consistent, just to not lose the last seconds of writes. The whole point of fsync is saying these are important writes: we need to flush the file system and write them out in the proper order, etc. On XFS, zeroing the log is not a good idea; you really need to find a kernel that will replay it, or else you'll need to do an xfs_repair. Btrfs, by contrast, doesn't care if the tree log is dropped. The file system is OK, but some files will be lost.
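Whether a log tree is waiting to be replayed can be read from the superblock without mounting. The sketch below assumes that `btrfs inspect-internal dump-super` prints a `log_root` line followed by a byte address, with 0 meaning no log tree; `has_log_tree` is a made-up helper, and the sample input stands in for real output.

```shell
# Read dump-super output on stdin; succeed if log_root is non-zero.
# has_log_tree is a hypothetical helper, not part of btrfs-progs.
has_log_tree() {
    awk '$1 == "log_root" { found = ($2 != 0) } END { exit !found }'
}

# Sample input imitating two dump-super lines (values are illustrative).
printf 'log_root\t\t30572544\nlog_root_transid\t0\n' | has_log_tree \
    && echo "log tree present"
```

On a real system the input would come from `btrfs inspect-internal dump-super <device> | has_log_tree`, run as root against the unmounted device.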
Can this affect older files or just last-minute unsaved modifications and new files?
Is there a way to identify the lost files to make sure this is not something important?
This assumes we have regular backups set up but might not notice the lost files until they are actually needed at some point later.
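If a backup exists, one partial check is to list files that are in the backup but no longer on the repaired filesystem. This is only a sketch with a made-up helper name, and it has an obvious limit: anything created after the last backup run cannot be detected this way. File names containing newlines are not handled.

```shell
# List files present under the backup directory ($1) but missing under
# the live directory ($2). Both paths are placeholders.
missing_since_backup() {
    ( cd "$1" && find . -type f ) | while IFS= read -r f; do
        [ -e "$2/$f" ] || printf '%s\n' "$f"
    done
}

# Small self-contained demo using temporary directories.
demo=$(mktemp -d)
mkdir -p "$demo/backup" "$demo/live"
echo a > "$demo/backup/kept.txt"; echo a > "$demo/live/kept.txt"
echo b > "$demo/backup/gone.txt"
missing_since_backup "$demo/backup" "$demo/live"
```

The live directory should be given as an absolute path, as in the demo, so the existence check does not depend on the current directory.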