NVIDIA drivers installation (Fedora 38) - other tutorials failed

Dear community,

Here is another “almost the same” question about installing, activating and signing the NVIDIA driver(s). Firstly, I would say that I’ve already tried the following guides which failed at some point (or at the very end):

To sum up, I’m using Fedora 38,

  • inxi -G outputs:
$ inxi -G
Graphics:
  Device-1: Intel Alder Lake-P Integrated Graphics driver: i915 v: kernel
  Device-2: NVIDIA GA107GLM [RTX A2000 8GB Laptop GPU] driver: N/A
  Device-3: Microdia Integrated_Webcam_HD driver: uvcvideo type: USB
  Display: wayland server: X.Org v: 22.1.9 with: Xwayland v: 22.1.9
    compositor: gnome-shell v: 44.4 driver: X: loaded: modesetting
    unloaded: fbdev,vesa dri: iris gpu: i915 resolution: 1920x1200~60Hz
  API: OpenGL v: 4.6 Mesa 23.1.7 renderer: Mesa Intel Graphics (ADL GT2)
  • lspci | grep VGA:
lspci | grep VGA
0000:00:02.0 VGA compatible controller: Intel Corporation Alder Lake-P Integrated Graphics Controller (rev 0c)
  • lspci | grep 3D:
lspci | grep 3D
0000:01:00.0 3D controller: NVIDIA Corporation GA107GLM [RTX A2000 8GB Laptop GPU] (rev a1)
  • lspci | grep -i nvidia:
lspci | grep -i nvidia
0000:01:00.0 3D controller: NVIDIA Corporation GA107GLM [RTX A2000 8GB Laptop GPU] (rev a1)
  • neofetch:
neofetch
             .',;::::;,'.                dr_gon_s@fedora 
         .';:cccccccccccc:;,.            --------------- 
      .;cccccccccccccccccccccc;.         OS: Fedora Linux 38 (Workstation Edition) x86_64 
    .:cccccccccccccccccccccccccc:.       Host: Precision 5570 
  .;ccccccccccccc;.:dddl:.;ccccccc;.     Kernel: 6.4.15-200.fc38.x86_64 
 .:ccccccccccccc;OWMKOOXMWd;ccccccc:.    Uptime: 35 mins 
.:ccccccccccccc;KMMc;cc;xMMc:ccccccc:.   Packages: 2032 (rpm) 
,cccccccccccccc;MMM.;cc;;WW::cccccccc,   Shell: bash 5.2.15 
:cccccccccccccc;MMM.;cccccccccccccccc:   Resolution: 1920x1200 
:ccccccc;oxOOOo;MMM0OOk.;cccccccccccc:   DE: GNOME 44.4 
cccccc:0MMKxdd:;MMMkddc.;cccccccccccc;   WM: Mutter 
ccccc:XM0';cccc;MMM.;cccccccccccccccc'   WM Theme: Adwaita 
ccccc;MMo;ccccc;MMW.;ccccccccccccccc;    Theme: Adwaita [GTK2/3] 
ccccc;0MNc.ccc.xMMd:ccccccccccccccc;     Icons: Adwaita [GTK2/3] 
cccccc;dNMWXXXWM0::cccccccccccccc:,      Terminal: gnome-terminal 
cccccccc;.:odl:.;cccccccccccccc:,.       CPU: 12th Gen Intel i7-12800H (20) @ 4.700GHz 
:cccccccccccccccccccccccccccc:'.         GPU: NVIDIA RTX A2000 8GB Laptop GPU 
.:cccccccccccccccccccccc:;,..            GPU: Intel Alder Lake-P 
  '::cccccccccccccc::;,.                 Memory: 2892MiB / 31752MiB 

My final goal is to properly install the NVIDIA (+ CUDA) drivers and, in addition, to be able to use the commands lsmod | grep nvidia (so that it outputs something) and nvidia-smi.

Could you please help in resolving this issue since I already spent a huge amount of time with it?

Thank you in advance.

You have not said what you actually did and why you think it did not work.

Did you follow this rpmfusion nvidia driver guide? Howto/NVIDIA - RPM Fusion

If not, then I recommend that you remove any drivers that are not from rpmfusion and try the rpmfusion packaged drivers.

In addition to what Barry said above, I deal with users having nvidia problems here a lot.

Steps 1-5 on the blog you linked are correct, but step 6 is not.

To fully install the nvidia drivers from rpmfusion you need to

  1. enable the 3rd party repos in the gnome software app.
  2. from the command line install the drivers with
    sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda
  3. Allow about 5 minutes after the install completes for the drivers to be compiled then reboot.

At this point the drivers should load and be functional.

The installed nvidia packages can be seen with dnf list installed \*nvidia\*. Please post that so we can see if everything needed is installed.

The loaded modules for the GPU can be seen with lsmod | grep -iE 'nouveau|nvidia' and if it does not show lines with nvidia modules then the modules did not load properly.

The boot process loading the drivers can be seen with dmesg | grep -iE 'nvidia|secure|nouveau'. Please also post that for us.
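The lsmod check described above can be sketched as a small filter. This is only an illustration: classify_gpu_driver is a hypothetical helper name, not a Fedora tool, and it assumes the usual lsmod layout where the module name is the first column.

```shell
#!/bin/sh
# Classify which driver for the NVIDIA GPU is loaded, based on
# lsmod-style text read from standard input.
# classify_gpu_driver is a hypothetical helper name, not part of any package.
classify_gpu_driver() {
    awk '
        $1 == "nvidia"  { nvidia = 1 }
        $1 == "nouveau" { nouveau = 1 }
        END {
            if (nvidia)       print "nvidia"
            else if (nouveau) print "nouveau"
            else              print "none"
        }
    '
}

# Usage on a live system:
#   lsmod | classify_gpu_driver
```

If it prints nouveau or none, the proprietary modules did not load, and the dmesg output is the next thing to look at.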

Likely, yes, I think that I followed that approach. At least, RPM packages are on. Is there a way to check whether I have drivers from there or not?

The information you asked to post:

$ sudo dnf list installed \*nvidia\*
Installed Packages
akmod-nvidia.x86_64                                                                      3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
kmod-nvidia-6.2.9-300.fc38.x86_64.x86_64                                                 3:535.104.05-1.fc38                                                  @@commandline                   
nvidia-persistenced.x86_64                                                               3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
nvidia-settings.x86_64                                                                   3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia.x86_64                                                               3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-cuda.x86_64                                                          3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-cuda-libs.i686                                                       3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-cuda-libs.x86_64                                                     3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-kmodsrc.x86_64                                                       3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-libs.i686                                                            3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-libs.x86_64                                                          3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-power.x86_64                                                         3:535.104.05-1.fc38                                                  @rpmfusion-nonfree-nvidia-driver
lsmod | grep -iE 'nouveau|nvidia'
nouveau              3416064  0
mxm_wmi                12288  1 nouveau
drm_ttm_helper         12288  1 nouveau
i2c_algo_bit           20480  2 i915,nouveau
drm_display_helper    208896  2 i915,nouveau
video                  77824  4 dell_wmi,dell_laptop,i915,nouveau
ttm                    98304  3 drm_ttm_helper,i915,nouveau
wmi                    45056  9 dell_wmi_sysman,video,dell_wmi_ddv,dell_wmi,wmi_bmof,dell_smbios,dell_wmi_descriptor,mxm_wmi,nouveau
dmesg | grep -iE 'nvidia|secure|nouveau'
[    0.000000] Command line: BOOT_IMAGE=(hd0,gpt4)/vmlinuz-6.4.15-200.fc38.x86_64 root=UUID=bcdg36-fd42-46c3-a130-d288439bee5ab ro rootflags=subvol=root rd.luks.uuid=luks-fdsbne121-473-fb-b3ca-6bf2496deaf9 rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[    0.000000] secureboot: Secure boot enabled
[    0.000000] Kernel is locked down from EFI Secure Boot mode; see man kernel_lockdown.7
[    0.004867] secureboot: Secure boot enabled
[    0.043198] Kernel command line: BOOT_IMAGE=(hd0,gpt4)/vmlinuz-6.4.15-200.fc38.x86_64 root=UUID=bcdg36-fd42-46c3-a130-d288439bee5ab ro rootflags=subvol=root rd.luks.uuid=luks-fdsbne121-473-fb-b3ca-6bf2496deaf9 rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[    1.285754] integrity: Loaded X.509 cert 'Fedora Secure Boot CA: fde3432c2d61cvrrt5335d7b20e4cs63b42'
[    1.285915] integrity: Loaded X.509 cert 'NVIDIA Module Signing MOK sertificate: wdc5361fe1f4bfa89fcc3bd643befbcawgd4'
[   17.455171] Bluetooth: hci0: Secure boot is enabled
[   18.972947] nouveau: detected PR support, will not use DSM
[   18.973237] nouveau 0000:01:00.0: NVIDIA GA107 (b77000a1)
[   19.070691] nouveau 0000:01:00.0: bios: version 94.07.5b.00.85
[   19.071742] nouveau 0000:01:00.0: acr: firmware unavailable
[   19.071894] nouveau 0000:01:00.0: gr: firmware unavailable
[   19.071930] nouveau 0000:01:00.0: sec2: firmware unavailable
[   19.072028] nouveau 0000:01:00.0: fb: 8192 MiB GDDR6
[   19.075764] nouveau 0000:01:00.0: fb: VPR locked, but no scrubber binary!
[   19.078803] nouveau 0000:01:00.0: DRM: VRAM: 8192 MiB
[   19.078805] nouveau 0000:01:00.0: DRM: GART: 536870912 MiB
[   19.078806] nouveau 0000:01:00.0: DRM: BIT table 'A' not found
[   19.078807] nouveau 0000:01:00.0: DRM: BIT table 'L' not found
[   19.078808] nouveau 0000:01:00.0: DRM: Pointer to TMDS table not found
[   19.078809] nouveau 0000:01:00.0: DRM: DCB version 4.1
[   19.079305] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
[   19.079599] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 1
[   19.079646] nouveau 0000:01:00.0: [drm] No compatible format found
[   19.079648] nouveau 0000:01:00.0: [drm] Cannot find any crtc or sizes
[   28.727620] nouveau 0000:01:00.0: fb: VPR locked, but no scrubber binary!

An interesting part here is that when I boot the PC and press the Esc key, I can choose a kernel, and it shows me two installed kernels: kernel-6.2.9-300… and kernel-6.4.15-200…
If I choose the former (i.e., the older one), everything loads properly and nvidia-smi works and shows the NVIDIA GPU. But if I choose the latter (the newer one), I receive the message “NVIDIA kernel module missing. Falling back to nouveau” and of course nvidia-smi does not work. Is there a way to properly install the driver for only the newer kernel, leaving the older one untouched (so that I can revert to it if something goes wrong)?

Thank you!

Oh, and another strange thing: according to what I found on the internet, the command sudo unxz /lib/modules/6.4.15-200.fc38.x86_64/extra/nvidia.ko should find a file. But the only file I can find is /lib/modules/6.4.15-200.fc38.x86_64/extra/nvidia.ko.xz (notice the extension at the end).

I think I see the issue with the output of the dnf list installed '*nvidia*' command and the dmesg output. Notice from the dmesg output that it clearly shows the firmware was not available.
The nvidia-gpu-firmware package seems to be missing.

Please reinstall it with either sudo dnf reinstall linux-firmware or sudo dnf install nvidia-gpu-firmware. Once that package is installed and the firmware for the GPU is available, reboot and it should work.

This is one of the problems with blindly following instructions on the internet. Fedora uses the compressed kernel modules with the xz extensions and that command was for much older releases where some kernels still preferred the uncompressed modules. The modules should not normally be uncompressed manually since that happens as the kernel loads them.
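Because the modules are stored compressed, it is usually easier to locate and inspect them as they are instead of unpacking them. A minimal sketch, assuming GNU find is available; find_module_file is a hypothetical helper name, and the commented modinfo call assumes a kmod build with xz support, which should be the case on Fedora.

```shell
#!/bin/sh
# Print the path of a kernel module file under a modules directory,
# whether it is stored as name.ko or in a compressed form such as name.ko.xz.
# find_module_file is a hypothetical helper name.
find_module_file() {
    # $1: directory to search, $2: module name without extension
    find "$1" -type f \( -name "$2.ko" -o -name "$2.ko.*" \) | sort
}

# Usage on a live system (modinfo can read the compressed file directly):
#   f=$(find_module_file "/lib/modules/$(uname -r)/extra" nvidia | head -n 1)
#   modinfo -F signer "$f"    # empty output suggests the module is unsigned
```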

This shows the modules were never properly built for the 6.4.15 kernel.
After installing the firmware package and before the reboot please run sudo akmods --force and allow that command to return a command line prompt before rebooting.
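One way to confirm which kernels still lack a built module is to compare the installed kernels against the installed kmod-nvidia packages. The following is a rough sketch, not an official procedure: kernels_missing_kmod is a hypothetical helper name, and it assumes the locally built packages are named kmod-nvidia-<kernel release>, as in the dnf list output earlier in this thread.

```shell
#!/bin/sh
# Print each kernel release that has no matching kmod-nvidia-<release> package.
# $1: newline-separated kernel releases (as printed by uname -r)
# $2: newline-separated kmod package names
# kernels_missing_kmod is a hypothetical helper name.
kernels_missing_kmod() {
    printf '%s\n' "$1" | while IFS= read -r kver; do
        [ -n "$kver" ] || continue
        if ! printf '%s\n' "$2" | grep -qxF "kmod-nvidia-$kver"; then
            printf '%s\n' "$kver"
        fi
    done
}

# Usage on a live system:
#   kernels=$(rpm -q kernel-core --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n')
#   kmods=$(rpm -qa 'kmod-nvidia-*' --qf '%{NAME}\n')
#   kernels_missing_kmod "$kernels" "$kmods"
```

Any release it prints is a kernel for which akmods has not yet produced and installed a module package.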

Thank you for providing steps to try; I’ll do them to see if they help. But I have one question regarding this procedure to reinstall the firmware: would it leave my kernel-6.2.9-300... untouched, so that I would be able to revert to it (as I can do now) in case of some failure? Or should I specify some parameter that tells the installer to leave that one untouched?

Regarding the extensions, I asked that because I wanted to try to manually sign this module if the failure was connected to it.

Thank you.

You did not provide the full needed info to answer your question.

  1. kernel 6.2.9 was the original kernel released when fedora 38 was first released. That kernel is still in the fedora repo though none of the later kernels except the most recent are retained in the updates repo.

  2. installing the nvidia-gpu-firmware package with the commands given above will not disturb the installed kernel.

  3. once the firmware package is reinstalled and you use the akmods command it will also not affect the currently installed kernel.

  4. If you already have 3 kernel versions installed and a system update upgrades the kernel, dnf automatically removes the oldest installed kernel, except that it should never remove the kernel that is currently running. Thus, if you boot from the 6.2.9 kernel and then run sudo dnf upgrade, the 6.2.9 kernel would not be removed during an upgrade that installs a newer kernel. The command dnf list installed kernel will show which kernels are currently installed.

  5. Manually signing the kernel modules is relatively simple and I use secure boot with signed modules. Follow the instructions in the file /usr/share/doc/akmods/README.secureboot and it should automatically sign the modules as they are built.
    NOTE: if the modules have already been built for the currently installed kernel, (which you can see by looking at the output of dnf list installed kernel kmod-nvidia-* and comparing the kernel versions in both packages) then it will be necessary to remove the kmod package for the newest kernel before building the modules for the newest kernel, or alternatively you could build a signed module for the specified kernel with a slightly different akmods command.
    For example, if you have kernel package installed that appears like this

$ dnf list installed kernel
Installed Packages
...
kernel.x86_64               6.4.15-200.fc38          @updates

then the actual full package name is kernel-6.4.15-200.fc38.x86_64 and when booted to that kernel the command uname -r would show 6.4.15-200.fc38.x86_64. You can then build a kernel module for that specific kernel with the command sudo akmods --kernels 6.4.15-200.fc38.x86_64 --force and it would build the signed module for only that listed kernel version and replace whatever was already there. The info or man page for akmods would show this.
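The mapping from the dnf listing to the uname -r string can be sketched as a small filter. This is only an illustration; kernel_release_from_dnf_row is a hypothetical helper name, and it assumes the three-column row layout shown above.

```shell
#!/bin/sh
# Convert "dnf list installed kernel" rows such as
#   kernel.x86_64   6.4.15-200.fc38   @updates
# into the release string that uname -r reports for that kernel.
# kernel_release_from_dnf_row is a hypothetical helper name.
kernel_release_from_dnf_row() {
    awk '$1 ~ /^kernel\./ {
        split($1, name, ".")    # name[2] holds the architecture
        print $2 "." name[2]
    }'
}

# Usage on a live system (tail picks the last listed kernel):
#   rel=$(dnf list installed kernel | kernel_release_from_dnf_row | tail -n 1)
#   sudo akmods --kernels "$rel" --force
```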

I’m trying to provide all the information that you might need, but I just don’t know exactly what it should be.

So I reinstalled the linux-firmware packages and installed nvidia-gpu-firmware, but it still behaves strangely. I’ll try to list some outputs.

  1. I booted just now from kernel-6.4.15-200.fc38.x86_64.
  2. I saw “NVIDIA kernel module missing falling back to nouveau”.
uname -r
6.4.15-200.fc38.x86_64
dnf list installed kernel kmod-nvidia-*
Installed Packages
kernel.x86_64                                             6.4.15-200.fc38              @updates     
kmod-nvidia-6.2.9-300.fc38.x86_64.x86_64                  3:535.104.05-1.fc3        @@commandline
sudo akmods --kernels 6.4.15-200.fc38.x86_64 --force
Checking kmods exist for 6.4.15-200.fc38.x86_64            [  OK  ]
Warning: Could not determine what package owns /lib/modules/6.4.15-200.fc38.x86_64/extra/nvidia/
Checking kmods exist for 6.2.9-300.fc38.x86_64             [  OK  ]

So it looks like there is some problem with building the module for the newest kernel…
I’m a bit lost, to be honest, as I’m quite new to Linux.

If you know where to proceed next, I would appreciate it.

That seems to indicate that the directory already exists. Does it?
It should have been created and populated when the kernel module is built and the package kmod-nvidia for that kernel is installed. If it was somehow created in another fashion that warning is reminding you of the fact, and it must be removed before the kmod-nvidia-6.4.15-200.fc38.x86_64 package, which is locally built by akmods, can be cleanly installed.

If that directory already exists, then please remove it with sudo rm -r /lib/modules/6.4.15-200.fc38.x86_64/extra/nvidia and repeat the akmods command.

Yes, it looks like it’s there already.

Should I do it when booting from 6.4.15-200 kernel or from the older one? Or it doesn’t matter at all?

I would suggest booting with the new kernel. If that doesn’t work, just reboot into the older kernel. If you followed what @computersavvy posted, it should work.

I think I was a bit wrong. When I try to remove this directory, I get a No such file or directory error. So it becomes even stranger: it’s downloaded, there is no directory for this kernel, and I can’t build it…

Please, Please post the actual commands and the results.
We cannot see your screen and your interpretation of the messages may be different than ours. Use the preformatted text (</>) button on the tool bar to paste screen text and retain formatting. The blockquote button does not retain formatting.

Please show us the result (with command) of
sudo ls /lib/modules/6.4.15-200.fc38.x86_64/
and sudo ls /lib/modules/6.4.15-200.fc38.x86_64/extra/

What are you downloading? At this stage, and with what you posted, nothing should be downloaded, only the import into the BIOS.

sudo ls /lib/modules/6.4.15-200.fc38.x86_64/
build	kernel		   modules.block	      modules.builtin.bin      modules.dep.bin	modules.modesetting  modules.softdep	  source      systemtap  vmlinuz
config	modules.alias	   modules.builtin	      modules.builtin.modinfo  modules.devname	modules.networking   modules.symbols	  symvers.xz  updates	 weak-updates
extra	modules.alias.bin  modules.builtin.alias.bin  modules.dep	       modules.drm	modules.order	     modules.symbols.bin  System.map  vdso
sudo ls /lib/modules/6.4.15-200.fc38.x86_64/extra
nvidia.ko.xz

Maybe this is the file you wanted me to delete?

This is not even a proper location and not a full list of modules.

$ ls /lib/modules/6.4.15-200.fc38.x86_64/extra 
nvidia  v4l2loopback  VirtualBox

$ ls /lib/modules/6.4.15-200.fc38.x86_64/extra/nvidia
nvidia-drm.ko.xz  nvidia.ko.xz  nvidia-modeset.ko.xz  
nvidia-peermem.ko.xz  nvidia-uvm.ko.xz

Remove that file then do the install with akmods as previously suggested.
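Before deleting anything, it may help to list exactly which files are out of place. A minimal sketch, assuming GNU find; list_stray_nvidia_modules is a hypothetical helper name, and it only prints candidates without removing them.

```shell
#!/bin/sh
# List nvidia*.ko* files sitting directly in an extra/ directory instead of
# inside an extra/nvidia/ subdirectory. Nothing is deleted.
# list_stray_nvidia_modules is a hypothetical helper name.
list_stray_nvidia_modules() {
    # $1: path to the extra/ directory of one kernel
    find "$1" -maxdepth 1 -type f -name 'nvidia*.ko*' | sort
}

# Usage on a live system:
#   list_stray_nvidia_modules /lib/modules/6.4.15-200.fc38.x86_64/extra
#   rpm -qf /lib/modules/6.4.15-200.fc38.x86_64/extra/nvidia.ko.xz
# rpm -qf reports whether any installed package owns the file.
```

A file that no package owns was most likely placed there by hand, which matches the warning akmods printed.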