Constant random crashing, unable to identify cause

I have been getting constant crashes, either on the desktop or while doing random things, which force me to restart without being able to save my work.

There are no crash reports anywhere. The screen just freezes and becomes unresponsive, and when I once waited a few hours, nothing changed and no message about the crash appeared.

I have been running this system smoothly since Fedora 42, with no issues or instability.

I would greatly appreciate any help in finding out what could be causing this and how to fix it.
I have run sudo dnf distro-sync --refresh --allowerasing to make sure I am up to date in case a bugfix had already solved it, but after a few days of these crashes I still haven’t been able to figure anything out.

My main suspicions are towards kwin-wayland, or plasma-desktop, but I have no way of confirming.

OS: Fedora Linux 43 (KDE Plasma Desktop Edition) x86_64
Kernel: Linux 6.17.12-300.fc43.x86_64
Shell: bash 5.3.0
DE: KDE Plasma 6.5.4
WM: KWin (Wayland)
CPU: AMD Ryzen 7 9700X (16) @ 5.58 GHz
GPU 1: AMD Radeon RX 9070 XT [Discrete]
GPU 2: AMD Radeon Graphics [Integrated]
Memory: 7.73 GiB / 30.50 GiB (25%)
Disk (/): 476.63 GiB / 1.82 TiB (26%) - btrfs

Motherboard is ASUS PRIME B650M-A

If kwin is crashing there will be logs in the user journal.
You may see other logs of interest in the system journal.

See man journalctl for options to access journal logs.
For example

journalctl --user -b 0
journalctl --user -b 0 -g kwin
sudo journalctl -b 0

EDIT: I found the issue. It has something to do with the USB controller shutting down, which causes the SSD to register as unplugged. I had to use netconsole to find it, which took a while to understand and set up.
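For anyone else who ends up needing netconsole: the sketch below is not my exact setup, just one way to build the module parameter in the format the kernel documentation describes (source port, source IP and interface, then target port, IP and optional MAC). The helper name netconsole_param, the interface enp5s0 and the 192.168.1.x addresses are made up for illustration, and 6665/6666 are the documented default ports.

```shell
# Hypothetical helper: print a netconsole module parameter.
# Arguments: <local-ip> <local-interface> <receiver-ip> [receiver-mac]
# Format: netconsole=<src-port>@<src-ip>/<dev>,<tgt-port>@<tgt-ip>/<tgt-mac>
netconsole_param() {
    # An empty MAC is allowed; the kernel then sends to the broadcast address.
    printf 'netconsole=6665@%s/%s,6666@%s/%s\n' "$1" "$2" "$3" "${4:-}"
}

# Example values only: replace with your own addresses and interface name.
netconsole_param 192.168.1.10 enp5s0 192.168.1.20
```

On the crashing machine the printed string would be passed to the module, for example sudo modprobe netconsole "$(netconsole_param 192.168.1.10 enp5s0 192.168.1.20)", while a second machine listens for the UDP packets with something like nc -u -l 6666 (the exact netcat flags depend on which netcat variant is installed).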

I got these logs before every crash

"[6172.429670] xhci_hcd 0000:0d:00.3: Controller not ready at resume -19

[ 6172.429689] xhci_hcd 0000:0d:00.3: PCI post-resume error -19!

[ 6172.429693] xhci_hcd 0000:0d:00.3: HC died; cleaning up

[ 6172.429704] usb 3-1: USB disconnect, device number 2"
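If you want to check for the same pattern in your own kernel log, a small filter like the one below might help. It is only a sketch: the function name xhci_events is made up, and the patterns are based solely on the four lines quoted above, so other failure modes will not match.

```shell
# Hypothetical helper: read kernel log text on stdin and keep only lines that
# look like an xHCI controller failure or a USB disconnect.
xhci_events() {
    grep -E 'xhci_hcd .*(HC died|error -19|not ready at resume)|usb [0-9.-]+: USB disconnect'
}

# Example with one matching and one unrelated line.
printf '%s\n' \
    '[ 6172.429693] xhci_hcd 0000:0d:00.3: HC died; cleaning up' \
    '[ 6172.500000] some unrelated message' | xhci_events
```

It can be fed from a netconsole capture file, or from sudo journalctl -k -b -1 | xhci_events if the previous boot managed to write its kernel messages to disk. In my case the disk itself disappeared, so the journal had nothing.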

doing

sudo grubby --update-kernel=ALL --args="pcie_port_pm=off usbcore.autosuspend=-1 iommu=pt"

sudo grub2-mkconfig -o /boot/grub2/grub.cfg

seems to have fixed it. I don’t understand exactly how this happened, but it seems to be a firmware issue.
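After rebooting, it is worth confirming that the parameters actually reached the running kernel. The snippet below is a rough sketch of how that could be checked. The helper has_karg is a made-up name, not part of grubby or any Fedora tool, and it only does a whole-word match against /proc/cmdline.

```shell
# Hypothetical helper: succeed if the parameter appears as a whole word in a
# kernel command line. Uses the running kernel's /proc/cmdline when no second
# argument is given.
has_karg() {
    local want=$1 cmdline=${2:-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $want "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the state of the three parameters used above.
for arg in pcie_port_pm=off usbcore.autosuspend=-1 iommu=pt; do
    if has_karg "$arg"; then
        echo "$arg: active"
    else
        echo "$arg: missing"
    fi
done
```

Because the match is on whole words, a parameter with a stray trailing comma or quote character shows up as missing, which is a quick way to spot a mistyped --args string.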
