Nvidia kernel module missing - falling back to nouveau - during update

Hello,
I had a working Fedora 38 system running kernel 6.4.10 and nvidia driver 535.98.
During a routine update to kernel 6.4.11, I got a kernel panic message at 97%, so I had to do a hard reset. See the picture.
Later I tried to update from 6.4.10 to the newer 6.4.12, with the same result.
Now my only working configuration is 6.4.10, and I cannot update it.
When I try to boot the newer 6.4.11 or 6.4.12 from GRUB, I immediately get the message “Nvidia kernel module missing - falling back to nouveau”, then the screen blinks and booting stops. It never actually falls back to nouveau.

Can anyone help, please? Thanks.

Wikipedia says: “Machine checks are a hardware problem, not a software problem.” There is a grey area where firmware bugs are detected as hardware problems. You may have firmware that is not compatible with newer kernels, or the newer kernel may be using the hardware differently, so that a latent problem is triggered.

You should check for firmware updates and run memtest86+ and also a GPU test program.

If you need additional help, please provide details of your hardware configuration. Start with the output of sudo lshw -class display -sanitize (as text using the </> button).

Thanks. Here is the output:

  *-display                 
       description: VGA compatible controller
       product: GP108 [GeForce GT 1030]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:01:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
       configuration: driver=nvidia latency=0
       resources: irq:28 memory:fd000000-fdffffff memory:d0000000-dfffffff memory:ce000000-cfffffff ioport:d800(size=128) memory:c0000-dffff

The CPU is an AMD Athlon 64 X2 Dual 6000+.

The strange thing is that the configuration with kernel 6.4.10 and nvidia driver 535.98 works like a charm. But when I select 6.4.11, the message “Nvidia kernel module missing…” appears and the boot stops.

This is what I get when I run dnf list installed \*nvidia\*

kmod-nvidia-6.4.10-200.fc38.x86_64.x86_64         3:535.98-1.fc38           @@commandline                   
nvidia-gpu-firmware.noarch                        20230804-153.fc38         @updates                        
nvidia-settings.x86_64                            3:535.98-1.fc38           @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia.x86_64                        3:535.98-2.fc38           @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-cuda-libs.x86_64              3:535.98-2.fc38           @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-libs.x86_64                   3:535.98-2.fc38           @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-power.x86_64                  3:535.98-2.fc38           @rpmfusion-nonfree-nvidia-driver

dmesg | grep -iE 'secure|nouveau|nvidia'

[    0.000000] Command line: BOOT_IMAGE=(hd0,msdos2)/boot/vmlinuz-6.4.10-200.fc38.x86_64 root=UUID=e19c7b56-d820-4ea8-9ba0-9d0c9ecf14f1 ro resume=UUID=1f4f9c54-9ada-45ff-95b3-a24000a81d5f text initcall_blacklist=simpledrm_platform_driver_init rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[    0.044161] Kernel command line: BOOT_IMAGE=(hd0,msdos2)/boot/vmlinuz-6.4.10-200.fc38.x86_64 root=UUID=e19c7b56-d820-4ea8-9ba0-9d0c9ecf14f1 ro resume=UUID=1f4f9c54-9ada-45ff-95b3-a24000a81d5f text initcall_blacklist=simpledrm_platform_driver_init rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[   10.118825] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:02.0/0000:01:00.1/sound/card1/input11
[   10.118924] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:02.0/0000:01:00.1/sound/card1/input12
[   10.118996] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:02.0/0000:01:00.1/sound/card1/input13
[   10.119064] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:02.0/0000:01:00.1/sound/card1/input14
[   16.650631] nvidia: loading out-of-tree module taints kernel.
[   16.650649] nvidia: module license 'NVIDIA' taints kernel.
[   16.650654] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[   16.650655] nvidia: module license taints kernel.
[   17.411842] nvidia-nvlink: Nvlink Core is being initialized, major device number 237
[   17.414204] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[   17.654973] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  535.98  Tue Aug  1 21:42:05 UTC 2023
[   17.980475] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[   18.516827] nvidia-uvm: Loaded the UVM driver, major device number 235.
[   19.338856] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  535.98  Tue Aug  1 21:40:14 UTC 2023
[   19.358889] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[   19.851207] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0

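This shows that the akmod-nvidia package is missing. The only kernel module package installed is a kmod-nvidia built specifically for 6.4.10, so there is nothing to build the driver for newer kernels when they are installed.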
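
If you want to confirm this on your system, the following is a minimal sketch that lists every kernel under a modules directory and reports whether an nvidia module file exists for it. The function name check_nvidia_modules is made up for illustration, and it assumes the module file is named nvidia.ko (possibly with a compression suffix) somewhere below the per-kernel directory.

```shell
# Hypothetical helper: report, for each kernel directory under a modules
# root, whether any nvidia.ko* file exists below it.
# Assumption: /lib/modules is the modules root (the usual location).
check_nvidia_modules() {
    local root="${1:-/lib/modules}"
    local dir kver
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when the root has no subdirectories.
        [ -d "$dir" ] || continue
        kver=$(basename "$dir")
        # -quit stops at the first match (GNU find, as shipped on Fedora).
        if find "$dir" -name 'nvidia.ko*' -print -quit | grep -q .; then
            echo "$kver: nvidia module present"
        else
            echo "$kver: nvidia module MISSING"
        fi
    done
}
```

Running check_nvidia_modules with no arguments checks /lib/modules. On a system in the state described above, I would expect only the 6.4.10 kernel to be reported as having the module.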

I would suggest that you run a special upgrade procedure to reinstall the nvidia drivers.

  1. sudo dnf install akmod-nvidia
    If this does not work properly, continue with the following steps.

  2. sudo dnf remove '*nvidia*' --exclude nvidia-gpu-firmware

  3. Install the new nvidia drivers with sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda

  4. Wait for the rebuild to complete (about 5 minutes), then reboot. The drivers should now load with the newer kernel.
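
Rather than guessing how long the rebuild takes, you can poll until the module shows up. This is only a sketch: wait_for_nvidia_module is a hypothetical helper name, and the defaults of 30 tries with a 10 second delay are arbitrary. It relies on modinfo, using -k to select a kernel release and -F version to print the module version.

```shell
# Hypothetical helper: poll until modinfo finds an nvidia module for the
# given kernel release, or give up after a number of tries.
# Arguments: kernel release (default: running kernel), tries, delay in seconds.
wait_for_nvidia_module() {
    local kver="${1:-$(uname -r)}"
    local tries="${2:-30}"
    local delay="${3:-10}"
    local i=0 version
    while [ "$i" -lt "$tries" ]; do
        # modinfo exits non-zero while no module exists for that kernel.
        if version=$(modinfo -k "$kver" -F version nvidia 2>/dev/null); then
            echo "nvidia $version is built for kernel $kver"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "no nvidia module for kernel $kver after $tries tries" >&2
    return 1
}
```

Since you are still booted into the old kernel, pass the release string of the new kernel as the first argument, exactly as it appears as a directory name under /lib/modules. Reboot once the function reports a version.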

Note that the newer nvidia drivers are currently at version 535.104.05.

Thanks a lot! It worked. I only needed the first command.

Glad to know it worked.