Nvidia kernel module missing

After the latest kernel update I get “NVIDIA kernel module missing” on a black screen before the login screen appears.

journalctl -b -g nvidia says:

окт 03 19:46:54 fedora kernel: input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card0/input19
окт 03 19:46:54 fedora kernel: input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card0/input20
окт 03 19:46:54 fedora kernel: input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card0/input21
окт 03 19:46:54 fedora kernel: input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card0/input22
окт 03 19:46:55 fedora alsactl[1145]: Found hardware: "HDA-Intel" "Nvidia GPU 9f HDMI/DP" "HDA:10de009f,146212e8,00100100" "0x1462" "0x12e8"
окт 03 19:46:57 fedora systemd[1]: Starting nvidia-fallback.service - Fallback to nouveau as nvidia did not load...
окт 03 19:46:57 fedora kernel: nvidia: loading out-of-tree module taints kernel.
окт 03 19:46:57 fedora kernel: nvidia: module license 'NVIDIA' taints kernel.
окт 03 19:46:57 fedora kernel: nvidia: module license taints kernel.
окт 03 19:46:57 fedora kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 509
окт 03 19:46:57 fedora kernel: nvidia 0000:01:00.0: enabling device (0000 -> 0003)
окт 03 19:46:57 fedora kernel: nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
окт 03 19:46:57 fedora kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  560.35.03  Fri Aug 16 21:39:15 UTC 2024
окт 03 19:46:58 fedora kernel: nvidia 0000:01:00.0: optimus capabilities: enabled, status dynamic power, hda bios codec supported
окт 03 19:46:58 fedora systemd[1]: Finished nvidia-fallback.service - Fallback to nouveau as nvidia did not load.
окт 03 19:46:58 fedora audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nvidia-fallba>
окт 03 19:46:58 fedora kernel: nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
окт 03 19:46:58 fedora kernel: nvidia-uvm: Loaded the UVM driver, major device number 507.
окт 03 19:46:59 fedora kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  560.35.03  Fri Aug 16 21:21:48 UTC 20>
окт 03 19:46:59 fedora kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
окт 03 19:47:01 fedora kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
окт 03 19:47:01 fedora kernel: nvidia 0000:01:00.0: [drm] Cannot find any crtc or sizes
окт 03 19:47:21 fedora systemd[1977]: Starting app-nvidia\x2dsettings\x2duser@autostart.service - nvidia-settings...
окт 03 19:47:21 fedora systemd[1977]: Started app-nvidia\x2dsettings\x2duser@autostart.service - nvidia-settings.
окт 03 19:47:21 fedora systemd[1977]: app-nvidia\x2dsettings\x2duser@autostart.service: Main process exited, code=exited, status=1/FAILURE
окт 03 19:47:21 fedora systemd[1977]: app-nvidia\x2dsettings\x2duser@autostart.service: Failed with result 'exit-code'.

I removed the packages with dnf remove akmod-nvidia xorg-x11-drv-nvidia-cuda
and reinstalled them with dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda, but that did not help.

nvidia-smi
Thu Oct  3 19:54:54 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   38C    P3             18W /   40W |       2MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

According to your log output, this looks like a race: the nvidia module is still being loaded while our fallback script is already running.

As long as nouveau cannot be loaded, it is safe to ignore this error, but we should still find the root cause and why it occurs with current F40 kernels…

Can you show the output of:

systemctl status nvidia-fallback.service

systemctl status nvidia-fallback.service
● nvidia-fallback.service - Fallback to nouveau as nvidia did not load
Loaded: loaded (/usr/lib/systemd/system/nvidia-fallback.service; enabled; preset: disabled)
Drop-In: /usr/lib/systemd/system/service.d
└─10-timeout-abort.conf
Active: active (exited) since Thu 2024-10-03 20:18:14 +05; 9min ago
Process: 1316 ExecStart=/sbin/modprobe nouveau (code=exited, status=0/SUCCESS)
Process: 1367 ExecStartPost=/bin/plymouth message --text=NVIDIA kernel module missing. Falling back to nouveau (code=exited, status=0/SUCCESS)
Main PID: 1316 (code=exited, status=0/SUCCESS)
CPU: 1.073s

окт 03 20:18:13 fedora systemd[1]: Starting nvidia-fallback.service - Fallback to nouveau as nvidia did not load...
окт 03 20:18:14 fedora systemd[1]: Finished nvidia-fallback.service - Fallback to nouveau as nvidia did not load.

So, can you report the issue at https://bugzilla.rpmfusion.org
and attach the archive created by sudo nvidia-bug-report.sh?

Thanks in advance.
We will have to figure out a way to properly delay the script until the nvidia driver is fully loaded…

Can anyone else reproduce this? (Just add :-1: if you have the issue.)
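
One possible shape for such a delay, purely as a sketch: poll /proc/modules until the nvidia module reports the Live state, and only fall back once a timeout has passed. This is not what the RPM Fusion package does (the unit shown above simply runs modprobe nouveau); the function names module_is_live and wait_for_module and the timeout value are made up for illustration.

```shell
# module_is_live NAME [MODULES_FILE]
# Succeeds if NAME is listed in /proc/modules with state "Live".
# Fields in /proc/modules: name size refcount dependencies state address
module_is_live() {
    awk -v m="$1" '$1 == m && $5 == "Live" { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# wait_for_module NAME TIMEOUT_SECONDS [MODULES_FILE]
# Polls once per second; returns 0 as soon as the module is live,
# 1 if it is still not live after TIMEOUT_SECONDS.
wait_for_module() {
    name=$1
    remaining=$2
    modules=${3:-/proc/modules}
    while ! module_is_live "$name" "$modules"; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Hypothetical use in a fallback script (10 seconds is an arbitrary choice):
#   wait_for_module nvidia 10 || modprobe nouveau
```

Checking the state column rather than just the name matters here, because a module that is still initializing is already listed in /proc/modules as Loading.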

https://bugzilla.rpmfusion.org/show_bug.cgi?id=7070

I have the exact same issue. Adding the output of systemctl status nvidia-fallback.service:

○ nvidia-fallback.service - Fallback to nouveau as nvidia did not load
     Loaded: loaded (/usr/lib/systemd/system/nvidia-fallback.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: inactive (dead)
  Condition: start condition unmet at Sat 2024-10-05 13:24:59 CEST; 6min ago
             └─ ConditionKernelCommandLine=rd.driver.blacklist=nouveau was not met

Oct 05 13:24:59 local systemd[1]: nvidia-fallback.service - Fallback to nouveau as nvidia did not load was skipped because of an unmet condition check (ConditionKernelCommandLine=rd.driver.blacklist=nouveau).

If I add rd.driver.blacklist=nouveau to the GRUB startup script I am stuck on a black screen after GRUB starts booting.
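
For anyone comparing their own setup: the status output above shows the unit being skipped because rd.driver.blacklist=nouveau is not on the kernel command line. A quick way to test for a single argument is sketched below; has_karg is a hypothetical helper name, not something shipped by Fedora or RPM Fusion.

```shell
# has_karg ARG [FILE]
# Succeeds if ARG appears as a whole, whitespace-separated word in the
# kernel command line (default: /proc/cmdline of the running system).
has_karg() {
    arg=$1
    file=${2:-/proc/cmdline}
    # Put every argument on its own line, then look for an exact match.
    tr -s '[:space:]' '\n' < "$file" | grep -qxF -- "$arg"
}

# Example:
#   has_karg rd.driver.blacklist=nouveau && echo "nouveau is blacklisted in the initramfs"
```

Passing /etc/kernel/cmdline as the second argument checks the configured command line instead of the one the system was booted with.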

Please post the output of cat /proc/cmdline, cat /etc/kernel/cmdline, lsmod | grep -iE 'nvidia|nouveau' and cat /etc/default/grub so we can see what is properly configured (and what is not).

Does this fix it?

sudo dracut -fvv --add-drivers " nvidia nvidia-drm nvidia-modeset nvidia-uvm "
cat /proc/cmdline
BOOT_IMAGE=(hd1,gpt2)/vmlinuz-6.10.11-200.fc40.x86_64 root=UUID=4dbb50ae-2790-4cb1-a62a-77a502796adc ro rootflags=subvol=root rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau

cat /etc/kernel/cmdline
root=UUID=4dbb50ae-2790-4cb1-a62a-77a502796adc ro rootflags=subvol=root rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau

lsmod | grep -iE 'nvidia|nouveau'
nvidia_drm            135168  2
nvidia_modeset       1650688  1 nvidia_drm
nvidia_uvm           6844416  0
nvidia              72577024  5 nvidia_uvm,nvidia_modeset
video                  81920  4 msi_wmi,xe,i915,nvidia_modeset

cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
GRUB_FONT=/boot/grub2/fonts/notosans24.pf2

sudo dracut -fvv --add-drivers " nvidia nvidia-drm nvidia-modeset nvidia-uvm "

fixed the black screen for me. Thank you!

I saved the output of lsinitrd before and after running the above command. I can post the result/diff if it’s of any use.
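
A shorter alternative to a full diff, as a sketch: filter the lsinitrd listing for nvidia kernel modules. The helper name nvidia_modules_in_listing is made up, and the pattern assumes module files are named nvidia*.ko with an optional compression suffix such as .xz.

```shell
# nvidia_modules_in_listing
# Reads an lsinitrd file listing on stdin and prints the file names of any
# nvidia kernel modules it contains, one per line, without duplicates.
nvidia_modules_in_listing() {
    grep -oE 'nvidia[^/ ]*\.ko(\.[a-z0-9]+)?$' | sort -u
}

# Example (lsinitrd without arguments inspects the initramfs of the running kernel):
#   sudo lsinitrd | nvidia_modules_in_listing
```

No output means the initramfs contains no nvidia modules, so the driver can only be loaded after the root filesystem is mounted.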

This has helped me too, thanks!

Note that this is only a workaround, as you will need to re-create the initramfs manually on each kernel update.

It is also not recommended, because your setup will break if you forget to update the kmod on a driver version upgrade: the kernel module and the userspace driver must always have the same version.
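
If you use the workaround anyway, a check along these lines can catch a stale module after a driver upgrade. It is only a sketch: check_driver_versions is a hypothetical helper, and the example assumes the userspace driver comes from the xorg-x11-drv-nvidia package and that the loaded module exposes its version in /sys/module/nvidia/version.

```shell
# check_driver_versions KMOD_VERSION USERSPACE_VERSION
# Prints a one-line verdict and returns non-zero on a mismatch.
check_driver_versions() {
    kmod_ver=$1
    user_ver=$2
    if [ "$kmod_ver" = "$user_ver" ]; then
        echo "ok: kernel module and userspace driver are both $kmod_ver"
    else
        echo "mismatch: kernel module $kmod_ver, userspace driver $user_ver"
        return 1
    fi
}

# Example, comparing the currently loaded module with the installed package:
#   check_driver_versions "$(cat /sys/module/nvidia/version)" \
#       "$(rpm -q --qf '%{VERSION}' xorg-x11-drv-nvidia)"
```

The loaded module is read from /sys rather than queried with modinfo, because modinfo reports the module installed on disk, which may be newer than the copy baked into the initramfs.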
