Fedora KDE still using nouveau driver after installing akmod-nvidia, no secure boot

Hello all.

A few days ago I installed a fresh Fedora 40 KDE, and now I’m having some issues. This one might be the root cause of the other problems I’ve been having (a 3 second lag when opening applications such as konsole and dolphin).

Basically, I installed the nvidia drivers via rpmfusion, and I think things were behaving well until yesterday, when the kernel was updated. Today I noticed the nvidia drivers are not being used.

Here’s my information. Among the strange things I have noticed: the nouveau driver appears to be blacklisted several times, but it’s still being used.

neofetch

             .',;::::;,'.                mkey@fedora 
         .';:cccccccccccc:;,.            ----------- 
      .;cccccccccccccccccccccc;.         OS: Fedora Linux 40 (KDE Plasma) x86_64 
    .:cccccccccccccccccccccccccc:.       Host: MS-7C91 2.0 
  .;ccccccccccccc;.:dddl:.;ccccccc;.     Kernel: 6.8.9-300.fc40.x86_64 
 .:ccccccccccccc;OWMKOOXMWd;ccccccc:.    Uptime: 22 mins 
.:ccccccccccccc;KMMc;cc;xMMc:ccccccc:.   Packages: 2351 (rpm) 
,cccccccccccccc;MMM.;cc;;WW::cccccccc,   Shell: bash 5.2.26 
:cccccccccccccc;MMM.;cccccccccccccccc:   Resolution: 2560x1440 
:ccccccc;oxOOOo;MMM0OOk.;cccccccccccc:   DE: Plasma 6.0.4 
cccccc:0MMKxdd:;MMMkddc.;cccccccccccc;   WM: kwin 
ccccc:XM0';cccc;MMM.;cccccccccccccccc'   Theme: Breeze-Dark [GTK2], Breeze [GTK3] 
ccccc;MMo;ccccc;MMW.;ccccccccccccccc;    Icons: breeze-dark [GTK2/3] 
ccccc;0MNc.ccc.xMMd:ccccccccccccccc;     Terminal: konsole 
cccccc;dNMWXXXWM0::cccccccccccccc:,      CPU: AMD Ryzen 7 3700X (16) @ 3.600GHz 
cccccccc;.:odl:.;cccccccccccccc:,.       GPU: NVIDIA GeForce GTX 1070 
:cccccccccccccccccccccccccccc:'.         Memory: 3024MiB / 32005MiB 
.:cccccccccccccccccccccc:;,..
  '::cccccccccccccc::;,.                                         
                                                                 

inxi -G → N.B. this command hangs for almost a minute

Graphics:
  Device-1: NVIDIA GP104 [GeForce GTX 1070] driver: nouveau v: kernel
  Display: wayland server: X.org v: 1.20.14 with: Xwayland v: 23.2.6
    compositor: kwin_wayland driver: N/A resolution: 2560x1440
  API: EGL v: 1.5 drivers: nouveau,swrast
    platforms: gbm,wayland,x11,surfaceless,device
  API: OpenGL v: 4.5 compat-v: 4.3 vendor: mesa v: 24.0.7 renderer: NV134
  API: Vulkan v: 1.3.280 drivers: N/A surfaces: xcb,xlib,wayland

lspci -k | grep -A 3 -E "(VGA|3D)"

2b:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 3301
        Kernel driver in use: nouveau
        Kernel modules: nouveau, nvidia_drm, nvidia

dnf list installed '*nvidia*'

Installed Packages
akmod-nvidia.x86_64                                   3:550.78-1.fc40               @rpmfusion-nonfree-updates
kmod-nvidia-6.8.8-300.fc40.x86_64.x86_64              3:550.78-1.fc40               @@commandline             
kmod-nvidia-6.8.9-300.fc40.x86_64.x86_64              3:550.78-1.fc40               @@commandline             
nvidia-gpu-firmware.noarch                            20240410-1.fc40               @updates                  
nvidia-modprobe.x86_64                                3:550.78-1.fc40               @rpmfusion-nonfree-updates
nvidia-settings.x86_64                                3:550.78-1.fc40               @rpmfusion-nonfree-updates
xorg-x11-drv-nvidia.x86_64                            3:550.78-1.fc40               @rpmfusion-nonfree-updates
xorg-x11-drv-nvidia-cuda-libs.x86_64                  3:550.78-1.fc40               @rpmfusion-nonfree-updates
xorg-x11-drv-nvidia-kmodsrc.x86_64                    3:550.78-1.fc40               @rpmfusion-nonfree-updates
xorg-x11-drv-nvidia-libs.x86_64                       3:550.78-1.fc40               @rpmfusion-nonfree-updates
xorg-x11-drv-nvidia-power.x86_64                      3:550.78-1.fc40               @rpmfusion-nonfree-updates

The journal is full of these.

May 14 23:34:42 fedora kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 510
May 14 23:35:43 fedora systemd[1639]: Starting grub-boot-success.service - Mark boot as successful...
May 14 23:35:43 fedora systemd[1639]: Finished grub-boot-success.service - Mark boot as successful.
May 14 23:38:28 fedora PackageKit[2654]: uid 1000 is trying to obtain org.freedesktop.packagekit.system-sources-refresh auth (only_trusted:0)
May 14 23:38:28 fedora PackageKit[2654]: uid 1000 obtained auth for org.freedesktop.packagekit.system-sources-refresh
May 14 23:38:32 fedora PackageKit[2654]: refresh-cache transaction /358_aebccadc from uid 1000 finished with success after 4005ms
May 14 23:38:33 fedora PackageKit[2654]: get-updates transaction /359_edbccded from uid 1000 finished with success after 660ms
May 14 23:38:43 fedora systemd[1639]: Starting systemd-tmpfiles-clean.service - Cleanup of User's Temporary Files and Directories...
May 14 23:38:43 fedora systemd[1639]: Finished systemd-tmpfiles-clean.service - Cleanup of User's Temporary Files and Directories.
May 14 23:40:44 fedora plasmashell[3335]: /usr/bin/AppImageLauncher: /lib64/libcurl.so.4: no version information available (required by /usr/bin/../lib/x86>
May 14 23:40:44 fedora systemd[1639]: Started app-appimagekit_aac6279a8ee6f1f08e59f07070bd5c2d\x2dThorium_Browser-a2175bc372a94cea8b4e05de4617893a.scope - >
May 14 23:40:46 fedora kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 510
May 14 23:40:46 fedora kernel: NVRM: GPU 0000:2b:00.0 is already bound to nouveau.
May 14 23:40:46 fedora kernel: NVRM: The NVIDIA probe routine was not called for 1 device(s).
May 14 23:40:46 fedora kernel: NVRM: This can occur when another driver was loaded and 
                               NVRM: obtained ownership of the NVIDIA device(s).
May 14 23:40:46 fedora kernel: NVRM: Try unloading the conflicting kernel module (and/or
                               NVRM: reconfigure your kernel without the conflicting
                               NVRM: driver(s)), then try loading the NVIDIA kernel module
                               NVRM: again.
May 14 23:40:46 fedora kernel: NVRM: No NVIDIA devices probed.
May 14 23:40:46 fedora kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 510

cat /etc/default/grub → blacklisted 4 times??

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rd.driver.blacklist=nouveau modprobe.blacklist=nouveau rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true

cat /proc/cmdline

ro root=LABEL=fedora initrd=boot\initramfs-6.8.9-300.fc40.x86_64.img

cat /etc/kernel/cmdline → bit strange that this one uses the UUID, even though I am using labels in fstab

root=UUID=8b518b5d-50d0-4973-ad76-db8d9397d1e3 ro rd.driver.blacklist=nouveau modprobe.blacklist=nouveau rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau 

mokutil --sb-state

SecureBoot disabled

I tried to force the rebuilding of the modules, and the process basically failed. Apparently, the rpm was built, but I had to reinstall it manually. Nothing changed after a reboot.

sudo akmods --akmod nvidia --rebuild --force

Somewhat related, seems like this case was solved by signing the driver:

Thanks.

I have had similar issues when I rebooted too soon after a kernel or driver update.
For me the fix was to

  1. sudo dnf remove kmod-nvidia-$(uname -r) which removes the kmod for the currently booted kernel, and should do what you wish assuming you are booted into the newest kernel. We cannot tell that with the very limited inxi information you provided.
  2. sudo akmods --force --kernels $(uname -r) to rebuild the modules for only the running kernel.
  3. wait for step 2 to complete then wait another minute or so before rebooting.
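
The three steps above can be sketched as a small shell helper. This is only a sketch, not an official procedure: the function name `rebuild_nvidia_kmod` and the `RUN` variable are made up for illustration, and the last command uses `modinfo` to check that an nvidia module exists for the kernel instead of waiting a fixed amount of time, which is my assumption about what "finished" should mean here.

```shell
# Hypothetical helper; RUN is an assumed knob (default: sudo) so the
# commands can be previewed with RUN=echo instead of being executed.
rebuild_nvidia_kmod() {
    kver="${1:-$(uname -r)}"
    run="${RUN:-sudo}"

    # Step 1: remove the possibly broken kmod package for this kernel.
    $run dnf remove "kmod-nvidia-${kver}" || return 1

    # Step 2: rebuild and reinstall the module for this kernel only.
    $run akmods --force --kernels "${kver}" || return 1

    # Step 3: instead of guessing how long to wait, ask modinfo for the
    # version of the nvidia module installed for that kernel.
    $run modinfo -k "${kver}" -F version nvidia
}
```

Running `RUN=echo rebuild_nvidia_kmod` only prints the three commands for the running kernel; calling it without `RUN` set would actually execute them through sudo.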

It seems that sometimes the driver may become corrupted and unusable if the system is rebooted too soon after the update ends.

Note that my system using GTX 1050 gpus is still running f39 and also is using an older driver version (535.129.03) since the version of cuda with that driver is 12.2 and that gpu fails to function with any newer cuda version. The latest 550 driver provides cuda 12.4. The GPU functions graphically with the newer drivers but seems it cannot run cuda versions 12.3 or 12.4.


Hi, yes, I’m running the latest available kernel, 6.8.9. I ran this command again; I had also run it previously, though on that occasion the rebuild/reinstall failed. Now it went through without issues. Now I’ll give it some time before rebooting. It’s silly not to have a visual confirmation of the process being completed.

mkey@fedora:~$ sudo dnf remove kmod-nvidia-$(uname -r)
[sudo] password for mkey: 
Dependencies resolved.
============================================================================================================================================================
 Package                                              Architecture              Version                              Repository                        Size
============================================================================================================================================================
Removing:
 kmod-nvidia-6.8.9-300.fc40.x86_64                    x86_64                    3:550.78-1.fc40                      @@commandline                     41 M

Transaction Summary
============================================================================================================================================================
Remove  1 Package

Freed space: 41 M
Is this ok [y/N]: y
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                    1/1 
  Erasing          : kmod-nvidia-6.8.9-300.fc40.x86_64-3:550.78-1.fc40.x86_64                                                                           1/1 
  Running scriptlet: kmod-nvidia-6.8.9-300.fc40.x86_64-3:550.78-1.fc40.x86_64                                                                           1/1 

Removed:
  kmod-nvidia-6.8.9-300.fc40.x86_64-3:550.78-1.fc40.x86_64                                                                                                  

Complete!
mkey@fedora:~$ sudo akmods --force --kernels $(uname -r)
Checking kmods exist for 6.8.9-300.fc40.x86_64 [  OK  ]
Building and installing nvidia-kmod [  OK  ]

OK, this one may be my fault (I know, seems almost impossible). I use rEFInd as a helper for my booting needs, and it was booting straight from the Fedora system disk.

Now that I started from Fedora’s grub, things look different.

mkey@fedora:~$ lspci -k | grep -A 3 -E "(VGA|3D)"
2b:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 3301
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia
mkey@fedora:~$ inxi -G
Graphics:
  Device-1: NVIDIA GP104 [GeForce GTX 1070] driver: nvidia v: 550.78
  Display: wayland server: X.org v: 1.20.14 with: Xwayland v: 23.2.6
    compositor: kwin_wayland driver: N/A resolution: 2560x1440
  API: EGL v: 1.5 drivers: nvidia,swrast,zink
    platforms: gbm,wayland,x11,surfaceless,device
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 550.78
    renderer: NVIDIA GeForce GTX 1070/PCIe/SSE2
  API: Vulkan v: 1.3.280 drivers: N/A surfaces: xcb,xlib,wayland
mkey@fedora:~$ uname -r
6.8.9-300.fc40.x86_64
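
For a quick before/after comparison, the relevant line can be pulled out of the `lspci -k` output with a one-line filter. A minimal sketch, assuming the output format shown above; `driver_in_use` is a made-up name.

```shell
# Hypothetical helper: reads `lspci -k` output on stdin and prints the
# value of the first "Kernel driver in use:" line it finds.
driver_in_use() {
    awk -F': ' '/Kernel driver in use/ { print $2; exit }'
}
```

It would be used as `lspci -k | grep -A 3 -E "(VGA|3D)" | driver_in_use`.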

Also the 3 second application start lag is now gone. I do have some strange input lag in the browser, I’ll see what I can do about that. It just started after the last reboot, it was butter smooth up to that point.

Not having visual confirmation is a consequence of the contortions needed to install “closed source” drivers.

Understood. Obviously very unfortunate.

Not sure how to proceed. Shall we mark this as solved? Basically the solution is to do things properly :smiley:

Looks like you encountered “attractive nuisance” issues that needed extra care to avoid:

  • waiting for akmod-nvidia to finish configuring the module after it has been built

  • rEFInd booting from system disk

It has been years since I used rEFInd. Is this a failure to inform rEFInd of the changes made with the update?

I don’t know of a good way to help users avoid “attractive nuisances”. The term comes from things like failing to put a fence around a swimming pool, so the basic idea is to make it harder for users to just run such commands without fully understanding the potential failure modes.

Actually, the first point was not the issue, as I had allowed ample time for the process to complete.

The issue with rEFInd is likely due to it not being aware of the kernel command line options when booting the OS partition directly.
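
One way to check for this is to look for the blacklist options on the command line the running kernel was actually booted with, rather than in the config files. A rough sketch; `cmdline_has_arg` is a made-up helper, and it only matches whole `key=value` words exactly as they appear in this thread (it would miss a comma-separated blacklist value, for example).

```shell
# Hypothetical helper: succeeds if the kernel command line in $1 contains
# the exact space-separated argument given in $2.
cmdline_has_arg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example use against the running kernel (left commented out here):
# for arg in rd.driver.blacklist=nouveau modprobe.blacklist=nouveau; do
#     if cmdline_has_arg "$(cat /proc/cmdline)" "$arg"; then
#         echo "present: $arg"
#     else
#         echo "missing: $arg"
#     fi
# done
```

With the /proc/cmdline posted earlier in the thread, both options would be reported as missing, while the /etc/kernel/cmdline content contains both.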

I ended up mucking about with rEFInd because my previous EFI partition was formatted as fat32 while the Fedora installer demanded fat16. So I installed onto a temporary fat16 EFI partition and moved the bootloader manually afterwards. That left several (8?) Fedora entries in the rEFInd menu, and I managed to select the wrong one.

The cheesy script that nvidia uses (used?) to inject the edits into GRUB_CMDLINE_LINUX didn’t check for already existing blacklist args, so at some point it did it more than once. You could go into the /etc/default/grub file with an editor, remove the extra ones on the left side, and leave the ones on the right.
I suspect it’s not a fatal problem; it’s like having initializations in a C++ constructor where the same inits are done twice. It doesn’t hurt but is sloppy.
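
If editing by hand feels error-prone, the duplicated arguments can also be collapsed mechanically. A small sketch; `dedupe_args` is a made-up name, and unlike the suggestion above it keeps the first occurrence of each argument rather than the last, which gives the same set of options. It only prints the cleaned string; putting it back into /etc/default/grub and regenerating the boot configuration is not covered here.

```shell
# Hypothetical helper: prints the space-separated arguments in $1 with
# duplicates removed, keeping the first occurrence of each.
dedupe_args() {
    printf '%s\n' "$1" \
        | tr -s ' ' '\n' \
        | awk 'NF && !seen[$0]++' \
        | paste -sd ' ' -
}
```

For example, `dedupe_args "rd.driver.blacklist=nouveau modprobe.blacklist=nouveau rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau"` prints `rd.driver.blacklist=nouveau modprobe.blacklist=nouveau rhgb quiet`.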
