Can't update my Fedora, every update makes the kernel crash =/ (ASUS ZenBook)

I get btrfs errors.

I am new to Fedora, so I have no clue how to fix this. I am on kernel-core-6.14.0-63.fc42.x86_64, but every time I update and reboot into the new kernel, it freezes. I think it is btrfs I/O errors and a freeze on the NVMe drive.

So after every update I press Shift during boot, select my 6.14 kernel again, and remove the new update. But there must be something I can do to fix it? I have tried all the new updates, and the latest (from Fedora 42 to 43, kernel 6.17.7 I think) made it longer than the earlier updates, but in the end it crashed like the others, with the same messages.

btrfs

We would need to see the logs from the failed boot.

Can you boot the latest kernel, but remove the rhgb and quiet options from the kernel command line?
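For reference, the usual one-time way to do that from the GRUB menu (a sketch; exact keys can vary by machine):

```shell
# At the GRUB menu, highlight the new kernel entry and press `e`.
# Find the line starting with `linux` and delete the words `rhgb quiet`.
# Then press Ctrl-X (or F10) to boot once with the edited command line.
# The edit is not persistent; the next boot uses the normal entry again.
```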

Is this what you need? That was what showed up at the end when it booted without rhgb and quiet. Or should I get the whole log as text somehow?

After I rebooted, I ran journalctl -b -1 > full_failed_boot_log.txt:

Maybe this is helpful too:
— System Information —
Linux fedora 6.14.0-63.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Mar 24 19:53:37 UTC 2025 x86_64 GNU/Linux
Static hostname: (unset)
Transient hostname: fedora
Icon name: computer-laptop
Chassis: laptop
Machine ID: 86920b76fc6b4c1999fe0baccb3faa04
Boot ID: 18d1ee1906a84dda82894e6af85c09f0
Operating System: Fedora Linux 42 (KDE Plasma Desktop Edition)
CPE OS Name: cpe:/o:fedoraproject:fedora:42
OS Support End: Wed 2026-05-13
OS Support Remaining: 6month 3d
Kernel: Linux 6.14.0-63.fc42.x86_64
Architecture: x86-64
Hardware Vendor: ASUSTeK COMPUTER INC.
Hardware Model: ZenBook UX481FLY_UX481FL
Firmware Version: UX481FLY.302
Firmware Date: Thu 2020-07-02
Firmware Age: 5y 4month 6d
— NVMe Controller/Driver Information —
Kernel driver in use: nvme
Kernel modules: nvme
— Disk/Partition Layout —
NAME FSTYPE SIZE MODEL
loop0 4K
loop1 55,5M
loop2 55,5M
loop3 73,9M
loop4 164,8M
loop5 91,7M
loop6 84,5M
loop7 50,9M
zram0 swap 8G
nvme0n1 476,9G INTEL SSDPEKNW512G8
├─nvme0n1p1 vfat 260M
├─nvme0n1p2 ext4 1G
└─nvme0n1p3 btrfs 475,7G
— CPU/Memory Summary —
model name : Intel(R) Core™ i7-10510U CPU @ 1.80GHz
MemTotal: 16157812 kB

You might want to tag your post with btrfs and add BTRFS to the title to attract the experts.


Thank you. I added the tag, but it seems I can’t edit the title.

@chrismurphy do those kernel logs make sense to you?
I find it odd that the old kernel is happy with the btrfs file system but the new kernel is not happy.

I see in the earlier screenshots that the Btrfs device statistics counter shows 79 prior dropped writes, and increasing read errors. The cause is earlier in the kernel log, so we need to see earlier logs, e.g. journalctl -b -1 -k, or -b -2 -k, etc.

These earlier logs might not contain what we’re looking for because if the file system has gone read-only already, it can’t write to the journal.

These are not btrfs errors per se, they are Btrfs being aware that reads and writes are being dropped from the drive - so they are errors telling us about problems in the storage stack. Since there’s no device mapper target for encryption or LVM we are left with the raw NVMe device and its driver as the possible sources of the problem.

So yeah, full unfiltered dmesg needed.

In the attached file, I do not see any error messages in that boot, which is -b -1, i.e. the immediately previous boot. Again, depending on the file system state, if it's not mounted read/write, it won't be writing to the journal.

It might just be best to do something like:

# journalctl -k --since=-10d --no-hostname > journal.log

And that’ll give us the past 10 days worth of kernel only messages (none of the user space stuff).

I have uploaded journal.log to Fedora boot log - Google Drive; I think it contains what you were asking for.

>nov 03 09:03:56 kernel: BTRFS info (device nvme0n1p3): start tree-log replay

This suggests a prior crash, power fail (or forced power off) during writes with fsync. The good news is the replay succeeds, there are no errors, and the file system mounts ok. There are ten of these in the log provided.

However, it seems no prior boot with the crash/power fail is in this log. Chances are it didn’t make it to persistent media because of the problem you’re having. The kernel messages related to the problem are only in volatile journal, and therefore it’s gone upon reboot.

Here’s an example from the provided log:

nov 10 08:18:07 systemd[1]: Started systemd-journald.service - Journal Service.
nov 10 09:20:38 kernel: Linux version 6.14.0-63.fc42.x86_64 (mockbuild@d5701c6d040c430c8283c8c9847dc93f) (gcc (GCC) 15.0.1 20250228 (Red Hat 15.0.1-0), GNU ld version 2.44-3.fc42) #1 SMP PREEMPT_DYNAMIC Mon Mar 24 19:53:37 UTC 2025

nov 10 09:20:40 kernel: BTRFS info (device nvme0n1p3): start tree-log replay

The first line, time stamp 08:18:07, is the last line for that boot. Nothing happens after that, so we have no idea what happened. And the very next boot has tree-log replay, which only happens if a log tree is present, and a log tree is only present during tree logging :) which only happens during writes with fsync. It's an fsync performance optimizer, and it is crash safe. So it just gets replayed at the next boot.

There are no btrfs or nvme problems in the log. So whatever is happening doesn’t offhand seem to be file system related.

It might be storage device related, maybe it’s wigging out at some point during the boot, the device itself goes read only (?) and therefore no more journal entries. That would be consistent with the earlier screenshot showing dropped reads and writes. The nvme device is just hung. (speculation)

What we need is the full, complete text version of that screenshot. But that log is only in volatile memory. If the drive is having firmware or kernel driver issues (it seems more likely this is a kernel bug, since it works OK with the 6.14 kernel), that likely prevents the journal from being written to persistent media, and that is why we aren't seeing the issue in the provided log - only the tree-log replay, which tells us, yeah, there was a crash/power fail (or the whole drive just got reset, or dropped off the PCI bus, or ... who knows).


OK so now what?

I suggest booting the new kernel (I’m sorta guessing this is the problem kernel) with this boot parameter:

rd.systemd.debug-shell=1

That will provide a root shell on tty9. So when the system hangs at boot, you should still be able to get to tty9, and use the root shell there to extract the journal for this boot so we can see what’s failing.

How do you extract the journal in this situation? The nvme drive may not be available at all here to save the journal to.

You will need some other drive, like a USB stick drive, but anything will do.

You can use blkid to find the device node for this stick, e.g. /dev/sda1 and mount it somewhere, anywhere, doesn’t really matter. It can be somewhere in /tmp or /run - those are volatile so you can’t hurt anything there, it’ll all go away at next boot anyway.

mkdir /tmp/mnt
mount /dev/sda1 /tmp/mnt
journalctl -k --no-hostname > /tmp/mnt/journal.log
umount /tmp/mnt
reboot

So let’s see if this reveals what’s going on.


Alternative ideas to also check in the meantime, since apparently my turnaround time is 20+ hours.

Make sure both the logic board firmware (UEFI) and the drive firmware are up to date as provided by the system manufacturer. I just ran into another user having an issue with an nvme drive that was intermittently disappearing and a firmware update purports to fix that problem specifically.

But since it seems to cooperate with kernel 6.14 and not 6.17... we probably need to test 6.18, and possibly even 6.16, to see if it's working in the newer kernel. And if not, whether it stopped working in kernels before 6.17, or in 6.17 itself.

Introducing the Koji build system. This is the link for kernels.

Example: kernel-6.18.0-0.rc5.44.fc44 is the current upstream 6.18-rc5 kernel. Click on that and you'll see a list of RPMs by arch. Find x86_64 and download four files for this version: kernel, kernel-core, kernel-modules, and kernel-modules-core. Then install them using either dnf install or rpm -iv, your choice.
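As a sketch, the download can also be scripted with the koji client (from the koji package); the build name is the example above:

```shell
# Download the x86_64 RPMs of one kernel build from Koji,
# then install the four needed packages from the current directory.
koji download-build --arch=x86_64 kernel-6.18.0-0.rc5.44.fc44
sudo dnf install ./kernel-6.18.0*.rpm ./kernel-core-6.18.0*.rpm \
                 ./kernel-modules-6.18.0*.rpm ./kernel-modules-core-6.18.0*.rpm
```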

It’s possible the bug is fixed in 6.18 and just hasn’t been backported yet to stable. It’s also possible it hasn’t been fixed which would be good to know now because it might mean no one has reported it yet.

Ergo, this is now a bug hunt.

It might also be a good idea to get a Red Hat Bugzilla account or a Fedora account - either can be used to log into bugzilla.redhat.com and file bugs against the kernel and provide the dmesg you hopefully capture and save to your USB stick. And then we can start asking around for a fix, most likely search upstream and if no one has reported it there, report it and it should get fixed pretty quickly once they’re aware there’s been a regression.

But they will need dmesg showing the failure.

Whew!

Once you boot the working kernel, did you try sudo dnf reinstall kernel kernel-* ?

FYI, reinstalling on Linux is rarely needed. But I'm happy to be proved wrong.

So I have uninstalled all other kernels and updated again, so now I have:
torbjoern@fedora:~$ sudo dnf list installed kernel*
[sudo] password for torbjoern:
Updating and loading repositories:
Repositories loaded.
Installed packages
kernel.x86_64 6.14.0-63.fc42 781e4eb56ba449a5876af2cc08
kernel.x86_64 6.17.7-300.fc43 <unknown>
kernel-core.x86_64 6.14.0-63.fc42 781e4eb56ba449a5876af2cc08
kernel-core.x86_64 6.17.7-300.fc43 <unknown>
kernel-devel.x86_64 6.17.7-200.fc42 <unknown>
kernel-devel.x86_64 6.17.7-300.fc43 <unknown>
kernel-headers.x86_64 6.17.4-300.fc43 <unknown>
kernel-modules.x86_64 6.14.0-63.fc42 781e4eb56ba449a5876af2cc08
kernel-modules.x86_64 6.17.7-300.fc43 <unknown>
kernel-modules-core.x86_64 6.14.0-63.fc42 781e4eb56ba449a5876af2cc08
kernel-modules-core.x86_64 6.17.7-300.fc43 <unknown>
kernel-modules-extra.x86_64 6.14.0-63.fc42 781e4eb56ba449a5876af2cc08
kernel-modules-extra.x86_64 6.17.7-300.fc43 <unknown>
kernel-tools-libs.x86_64 6.17.7-300.fc43 <unknown>

Available packages
kernel-cross-headers.x86_64 6.17.4-300.fc43 updates
kernel-debug.x86_64 6.17.7-300.fc43 updates
kernel-debug-core.x86_64 6.17.7-300.fc43 updates
kernel-debug-devel.x86_64 6.17.7-300.fc43 updates
kernel-debug-devel-matched.x86_64 6.17.7-300.fc43 updates
kernel-debug-modules.x86_64 6.17.7-300.fc43 updates
kernel-debug-modules-core.x86_64 6.17.7-300.fc43 updates
kernel-debug-modules-extra.x86_64 6.17.7-300.fc43 updates
kernel-debug-modules-internal.x86_64 6.17.7-300.fc43 updates
kernel-debug-uki-virt.x86_64 6.17.7-300.fc43 updates
kernel-debug-uki-virt-addons.x86_64 6.17.7-300.fc43 updates
kernel-devel-matched.x86_64 6.17.7-300.fc43 updates
kernel-doc.noarch 6.17.7-300.fc43 updates
kernel-headers.i686 6.17.0-63.fc43 fedora
kernel-modules-extra-matched.x86_64 6.17.7-300.fc43 updates
kernel-modules-internal.x86_64 6.17.7-300.fc43 updates
kernel-rpm-macros.noarch 205-27.fc43 fedora
kernel-selftests-internal.x86_64 6.17.7-300.fc43 updates
kernel-srpm-macros.noarch 1.0-27.fc43 fedora
kernel-tools.x86_64 6.17.7-300.fc43 updates
kernel-tools-libs-devel.x86_64 6.17.7-300.fc43 updates
kernel-uki-virt.x86_64 6.17.7-300.fc43 updates
kernel-uki-virt-addons.x86_64 6.17.7-300.fc43 updates
kernelshark.x86_64 1:2.3.1-6.fc43 fedora
torbjoern@fedora:~$

I could not save the log from tty9; I tried, but the file was always gone after reboot. So I filmed it and saved it in the Google Drive folder LOG, or here: https://drive.google.com/file/d/13EKXdXczqBZaGTLh5o6oCa2IUr9t7RXx/view?usp=sharing

The behaviour has changed now though; the only thing I see when pressing Esc while loading the new kernel is:

I will try again to get the text version of the log, but I could not mount anything from tty9; I got responses like "Special device /dev/nvme01n1p3 does not exist". I hope the video works! Thanks.

With a better camera: https://drive.google.com/file/d/146qA4gBurRREXTlH2fBdrutf_Jl3eOwr/view?usp=sharing

(excuse me for the background audio)

Why don’t you upload the log file you’re scrolling through to a pastebin site…

Boot from Live ISO, mount your drive, fpaste your log file.
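A sketch of that, assuming Fedora's default btrfs layout (the device is from the lsblk output above; the file path and username are examples from this thread, so check with lsblk and ls first):

```shell
# Run from a Fedora Live session.
sudo mkdir -p /mnt/sysroot
sudo mount -o subvol=root /dev/nvme0n1p3 /mnt/sysroot  # default Fedora root subvolume
fpaste /mnt/sysroot/home/torbjoern/journal.log         # prints a shareable paste URL
sudo umount /mnt/sysroot
```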

This is still F42, right? Try installing older kernels and find the first kernel that can't boot.

You could also try 6.17.8 or 6.18 from rawhide. If 6.18 has no issues, a fix could be backported to 6.17 once it has been isolated.

Here is a list of kernels available in the updates-archive repo:

$ dnf rq kernel --repo=updates-archive --releasever=42
Updating and loading repositories:
 Fedora 42 - x86_64 - Updates Archive                                                                                              100% |  34.4 KiB/s |   3.4 KiB |  00m00s
Repositories loaded.
kernel-0:6.14.1-300.fc42.x86_64
kernel-0:6.14.11-300.fc42.x86_64
kernel-0:6.14.2-300.fc42.x86_64
kernel-0:6.14.3-300.fc42.x86_64
kernel-0:6.14.4-300.fc42.x86_64
kernel-0:6.14.5-300.fc42.x86_64
kernel-0:6.14.6-300.fc42.x86_64
kernel-0:6.14.8-300.fc42.x86_64
kernel-0:6.14.9-300.fc42.x86_64
kernel-0:6.15.10-200.fc42.x86_64
kernel-0:6.15.3-200.fc42.x86_64
kernel-0:6.15.4-200.fc42.x86_64
kernel-0:6.15.5-200.fc42.x86_64
kernel-0:6.15.6-200.fc42.x86_64
kernel-0:6.15.7-200.fc42.x86_64
kernel-0:6.15.8-200.fc42.x86_64
kernel-0:6.15.9-201.fc42.x86_64
kernel-0:6.16.10-200.fc42.x86_64
kernel-0:6.16.11-200.fc42.x86_64
kernel-0:6.16.12-200.fc42.x86_64
kernel-0:6.16.3-200.fc42.x86_64
kernel-0:6.16.4-200.fc42.x86_64
kernel-0:6.16.5-200.fc42.x86_64
kernel-0:6.16.7-200.fc42.x86_64
kernel-0:6.16.8-200.fc42.x86_64
kernel-0:6.16.9-200.fc42.x86_64
kernel-0:6.17.4-200.fc42.x86_64
kernel-0:6.17.6-200.fc42.x86_64
kernel-0:6.17.7-200.fc42.x86_64

You could try kernel-0:6.14.11-300.fc42.x86_64, kernel-0:6.15.10-200.fc42.x86_64, and kernel-0:6.16.12-200.fc42.x86_64.

Install the repo and disable it:

sudo dnf install fedora-repos-archive.noarch
sudo dnf config-manager setopt updates-archive.enabled=0

Then install a kernel with:

sudo dnf install kernel-<version>.fc42.x86_64 --repo=updates-archive

I didn’t see any signs of an nvme device in LOG.mp4. The PCI device hosting the nvme device (0000:04:00.0) is present, though. The kernel command line contains nvme.noacpi=1. In full_failed_boot_log.txt the kernel was booted without nvme.noacpi=1.

Could that be the reason why there is no nvme device in LOG.mp4?


I mentioned this likelihood, and that you’d need a USB stick to save the journal/log to. So it sounds like this kernel version has a regression with this nvme. But only a complete dmesg for that boot will tell us - you could also just use dmesg > dmesg.log from tty9 instead of journalctl, by the way. But you still need a USB stick, because if the nvme is missing due to the bug, you won’t be able to save the log to the internal drive.

I probably wouldn’t give the 6.17 series any more testing; I’d go to the 6.18 series and see if the problem is there. That is where all current development happens, and any fix would land there first before being backported to older kernels.

This is a good catch.

OP needs to boot 6.17.7 without this boot parameter being set.