Fixing amdgpu, radeon missing modules after package install from external repo

I’ve managed to shoot myself in the foot pretty thoroughly by installing (unendorsed? unsanctioned? unsanitary?) packages from a third-party repo. The install in question deployed a kernel module, so I know it touched the kernel configuration, and one of its dependencies was kernel-devel, which in this case should perhaps have raised more red flags than it did.

At any rate, both the amdgpu and radeon modules are completely missing now, and I’m not sure about the best way to get them back. For the sake of illustration, I’m posting some of what I’m finding on my system below:

Running lspci -k | grep -A 3 -E '(VGA|3D)' yields

0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] (rev c1)
	Subsystem: Sapphire Technology Limited Device e448
0a:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller

Running find /lib/modules/$(uname -r) -type f -name '*.ko*' | egrep '(gpu|radeon)' yields

/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/display/drm_display_helper.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/gud/gud.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/hyperv/hyperv_drm.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/qxl/qxl.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/scheduler/gpu-sched.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/solomon/ssd130x-i2c.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/solomon/ssd130x-spi.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/solomon/ssd130x.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/tiny/bochs.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/tiny/cirrus.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/tiny/gm12u320.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/tiny/ili9163.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/tiny/ili9486.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/tiny/panel-mipi-dbi.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/ttm/ttm.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/udl/udl.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/vboxvideo/vboxvideo.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/vgem/vgem.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/virtio/virtio-gpu.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/vkms/vkms.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/vmwgfx/vmwgfx.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/drm_buddy.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/drm_cma_helper.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/drm_mipi_dbi.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/drm_ttm_helper.ko.xz
/lib/modules/6.0.9-200.fc36.x86_64/kernel/drivers/gpu/drm/drm_vram_helper.ko.xz

I’ll take that as evidence that amdgpu and radeon have disappeared into the ether.
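
For anyone wanting a quicker yes/no answer than eyeballing a listing like the one above, here is a minimal sketch. The helper name find_module is made up for illustration (it is not a standard tool), and it assumes modules are stored as NAME.ko with an optional compression suffix such as .xz, as in the output above.

```shell
#!/bin/sh
# find_module is a hypothetical helper, not part of any package.
# Usage: find_module NAME [MODULE_DIR]
# Prints the path of NAME.ko (optionally compressed) and returns 0 if found;
# returns 1 if nothing matches. MODULE_DIR defaults to the running kernel's tree.
find_module() {
    name=$1
    dir=${2:-/lib/modules/$(uname -r)}
    # "|| true" keeps a missing directory from aborting scripts run with set -e.
    found=$(find "$dir" -type f -name "${name}.ko*" 2>/dev/null || true)
    if [ -n "$found" ]; then
        printf '%s\n' "$found"
        return 0
    fi
    return 1
}

# Report on the two modules in question for the running kernel.
for m in amdgpu radeon; do
    if find_module "$m" >/dev/null; then
        echo "$m: present"
    else
        echo "$m: missing"
    fi
done
```

If kmod is installed, modinfo -n amdgpu should give a similar answer by printing the module's file name, or an error when the module cannot be found.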

My question is simply: what might be the least painful way to go about fixing this? My “superuser” skills are pretty rusty; I haven’t needed to dig into the kernel/OS like this in years.

I haven’t upgraded to F37 yet, so I wonder whether I could use the upgrade process to rectify the problem, or whether it would (as my inner paranoiac tends to expect) simply exacerbate it and/or create new problems. Of course, the rogue package install might have damaged more than I’ve noticed so far, which gives me pause. Now that I think about it, I’m also seeing some errors from kvm-amd in the logs. Anyway, I’m just thinking out loud a bit and putting it out there. Any input from the community would be welcome.

EDIT: I’m my own worst grammar critic, so YES OF COURSE I made edits.

My thoughts:

  1. Remove the extra packages if you can.

  2. Run dnf repolist, identify any third-party repos you installed, and disable them. The rpmfusion repos are not normally an issue, but others may be.

  3. Run sudo dnf upgrade --refresh followed by
    sudo dnf distro-sync --refresh --allowerasing to get things as closely in sync with the Fedora repos as possible. This should reinstall everything that is supposed to be there, though it may not restore packages that were removed manually. At the very least, it will straighten out what is still installed and clean up any dependencies.

  4. Once the above is done, consider a version upgrade, but not before.

A version upgrade on a system with package inconsistencies is usually a bad idea: it might break things, if it can be done at all. Fix any problems first.
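
As a rough sketch, steps 2 and 3 could be strung together like this. It is a dry run by default, so it only prints the commands. The repo id example-third-party is a placeholder, not a real repo, and dnf config-manager assumes the dnf-plugins-core package is installed. Step 1 is left out because the package names depend on what was installed.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (the sudo steps will prompt for a password).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder repo id; replace it with one reported by "dnf repolist".
THIRD_PARTY_REPO=${THIRD_PARTY_REPO:-example-third-party}

recovery_steps() {
    # Step 2: list the enabled repos, then disable the suspect one.
    run dnf repolist
    run sudo dnf config-manager --set-disabled "$THIRD_PARTY_REPO"
    # Step 3: refresh metadata and update, then sync installed packages
    # to the versions in the remaining enabled repos.
    run sudo dnf upgrade --refresh
    run sudo dnf distro-sync --refresh --allowerasing
}

recovery_steps
```

Reading the printed commands first and then running them by hand one at a time is probably safer than setting DRY_RUN=0, since --allowerasing may remove packages and dnf will ask for confirmation along the way.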

That did the trick, thank you! It looks like the dnf upgrade --refresh step redownloaded and reinstalled both the kernel and the Radeon kernel modules, which appears to have undone the bulk of the damage. The bizarre kvm-amd error messages telling me that virtualization was disabled in the BIOS (it was not) no longer show up in the new journald logs, and the virtualization apps (snap, docker, etc.), which I’d noticed had broken after posting, are also working again. It’s still amazing that one bad package install could do so much damage.

It’s possible there’s still some lingering breakage that hasn’t made itself known yet, so I’ll probably hold off on the upgrade for the moment. Anyway, thanks again for the save!
