I’m on Fedora 41 using KDE. Yesterday when I started my desktop, it booted into an emergency shell. I burned a live disk, chrooted into my system, and set the root password. Then I rebooted, logged in to the emergency shell, and, if I remember the order of events correctly, found that /lib/modules was missing and reinstalled all of the kernel* packages, which repopulated the directory.
I rebooted again and didn’t get dropped into an emergency shell, but the Plymouth boot screen was much larger than normal, as if it were set for a lower resolution and stretched to fit my monitor. KDE runs extremely slowly and my CPU usage is almost at 100% on all cores, with ~93% of each core used by /usr/bin/kwin_wayland. If I switch to a TTY, the CPU usage drops to a normal idle state (1-3%).
I looked through dmesg and journalctl and the probable cause seems to be this portion of dmesg:
[ 5.706491] [drm] amdgpu kernel modesetting enabled.
[ 5.706682] amdgpu: Virtual CRAT table created for CPU
[ 5.706693] amdgpu: Topology: Add CPU node
[ 5.706880] [drm] initializing kernel modesetting (IP DISCOVERY 0x1002:0x7480 0x148C:0x2421 0xCF).
[ 5.706891] [drm] register mmio base: 0xFBB00000
[ 5.706892] [drm] register mmio size: 1048576
[ 5.710084] [drm] add ip block number 0 <soc21_common>
[ 5.710086] [drm] add ip block number 1 <gmc_v11_0>
[ 5.710088] [drm] add ip block number 2 <ih_v6_0>
[ 5.710089] [drm] add ip block number 3
[ 5.710090] [drm] add ip block number 4
[ 5.710091] [drm] add ip block number 5
[ 5.710092] [drm] add ip block number 6 <gfx_v11_0>
[ 5.710093] [drm] add ip block number 7 <sdma_v6_0>
[ 5.710094] [drm] add ip block number 8 <vcn_v4_0>
[ 5.710095] [drm] add ip block number 9 <jpeg_v4_0>
[ 5.710096] [drm] add ip block number 10 <mes_v11_0>
[ 5.721843] [drm] BIOS signature incorrect 0 0
[ 5.721854] amdgpu 0000:03:00.0: No more image in the PCI ROM
[ 5.721872] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from ROM BAR
[ 5.721874] amdgpu: ATOM BIOS: 113-EXT85100-001
[ 5.721906] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/psp_13_0_7_sos.bin failed with error -2
[ 5.721908] [drm:amdgpu_device_init.cold [amdgpu]] ERROR early_init of IP block failed -19
[ 5.722431] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/smu_13_0_7.bin failed with error -2
[ 5.722433] [drm:amdgpu_device_init.cold [amdgpu]] ERROR early_init of IP block failed -19
[ 5.722801] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/dcn_3_2_1_dmcub.bin failed with error -2
[ 5.722803] [drm:amdgpu_device_init.cold [amdgpu]] ERROR early_init of IP block failed -19
[ 5.723174] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/gc_11_0_2_pfp.bin failed with error -2
[ 5.723175] [drm:amdgpu_device_init.cold [amdgpu]] ERROR early_init of IP block <gfx_v11_0> failed -19
[ 5.723557] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/sdma_6_0_2.bin failed with error -2
[ 5.723561] [drm:amdgpu_device_init.cold [amdgpu]] ERROR early_init of IP block <sdma_v6_0> failed -19
[ 5.723931] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/vcn_4_0_4.bin failed with error -2
[ 5.723933] [drm:amdgpu_device_init.cold [amdgpu]] ERROR early_init of IP block <vcn_v4_0> failed -19
[ 5.724305] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/gc_11_0_2_mes_2.bin failed with error -2
[ 5.724307] amdgpu 0000:03:00.0: amdgpu: try to fall back to gc_11_0_2_mes.bin
[ 5.724324] amdgpu 0000:03:00.0: Direct firmware load for amdgpu/gc_11_0_2_mes.bin failed with error -2
[ 5.724325] [drm:amdgpu_device_init.cold [amdgpu]] ERROR early_init of IP block <mes_v11_0> failed -19
[ 5.724685] amdgpu 0000:03:00.0: amdgpu: Fatal error during GPU init
[ 5.724691] amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device.
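Error -2 is ENOENT, so each failed load means the kernel could not find that file at the time. To check whether the files named in those errors are on disk now, a minimal sketch like the following could be used. It assumes the firmware lives under /lib/firmware/amdgpu, either uncompressed or with an .xz or .zst suffix; the function names are made up for illustration.

```shell
#!/bin/sh
# Assumption: amdgpu firmware is looked up under this directory.
FW_DIR="${FW_DIR:-/lib/firmware/amdgpu}"

# Read kernel log lines on stdin and print the unique firmware file names
# from "Direct firmware load for amdgpu/<name> failed" messages.
missing_from_log() {
    grep 'failed with error' \
        | grep -o 'Direct firmware load for amdgpu/[^ ]*' \
        | sed 's|.*amdgpu/||' \
        | sort -u
}

# Read firmware names on stdin and report whether each exists in directory $1,
# either as-is or with a compression suffix (assumed: .xz or .zst).
check_firmware() {
    fw_dir="$1"
    while read -r fw_name; do
        if [ -e "$fw_dir/$fw_name" ] || [ -e "$fw_dir/$fw_name.xz" ] || [ -e "$fw_dir/$fw_name.zst" ]; then
            echo "present $fw_name"
        else
            echo "MISSING $fw_name"
        fi
    done
}

# Usage (as root): dmesg | missing_from_log | check_firmware "$FW_DIR"
```

If every file is reported as present, the files exist on disk but were not available when the driver initialized, which would point at the initramfs rather than the installed packages.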
I tried various combinations of reinstalling the firmware and kernel packages until I finally reinstalled every package on the system with dnf reinstall $(rpm -qa --qf="%{N}-%{V}\n" | sort) --skip-unavailable as root, but I still get this behavior. The amdgpu module is loaded and the amdgpu driver is shown in the output of lsinitrd. The graphics card is a Radeon RX 7600.
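Seeing the driver in lsinitrd does not by itself show that its firmware is in the image too. A rough way to check is to count the amdgpu firmware entries in the listing. This is only a sketch: the function name is hypothetical, and the pattern assumes the firmware appears under a firmware/amdgpu/ path inside the initramfs.

```shell
#!/bin/sh
# Read an initramfs file listing on stdin and print how many entries look like
# amdgpu firmware blobs (paths containing firmware/amdgpu/ and a .bin name).
# A count of 0 would mean the driver is in the image but its firmware is not.
count_amdgpu_firmware() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'firmware/amdgpu/.*\.bin' || true
}

# Usage (as root): lsinitrd | count_amdgpu_firmware
```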
Possible causes of this issue: I performed an upgrade on 10/15, and on 10/16 I installed a piece of legacy software, WordPerfect 8.1 for Linux, using the Fedora script from “Installing WordPerfect 8.1 for Linux on a distro current in or after 2019”. I don’t remember if I rebooted between those events, but the first entry in my dnf history from my attempts to fix the problem is from 10/17.
For completeness, I tried installing Sway. In Sway, the CPU is at idle, but I have no hardware acceleration if I try to play a video in a browser. Also, in KDE, my brightness control usually has a title for the monitor connected via DisplayPort that says something like “Acer XV370”, but it currently says “Unknown-1”.
I’ve also taken the most recent update for Fedora 41, which had kernel-6.11.4-300 and associated packages in it, but the problem persists even on the new kernel.
Any ideas/info needed?