Fedora 40, nvidia-drivers and secure boot

Recently, I switched to the Fedora KDE spin on my home ASUS laptop. I used Fedora seven years ago, but switched to Windows for gaming and kept Fedora only on my work computer. Since gaming has become quite smooth on Linux, I decided to switch back.
I disabled Secure Boot just to be safe and followed the instructions from Fedora and the complementary ones on asus-linux.org. With the GPU mode set to AsusMuxDgpu, everything works great.
Note, however, that I installed nvidia-driver:open-dkms from the cuda-fedora39-x86_64 repo per the NVIDIA CUDA installation instructions, assuming it contains the new open-source kernel modules so that they do not need to be compiled every time they are updated. I am not sure if this is correct, though.
I did want Secure Boot enabled, so I decided to try it out. Unfortunately, the nvidia module is not loaded because its key is not accepted. I assume this is because it comes from the NVIDIA CUDA repo.
Is there a way to sign the nvidia module without switching to the rpmfusion-nonfree version? Or am I just being stupid and the rpmfusion-nonfree version is better anyway?

EDIT: I read a Red Hat article on Fedora 41 that mentions this will be a feature in that version.

EDIT2: I ran modinfo on the nvidia module:

filename:       /lib/modules/6.10.12-200.fc40.x86_64/extra/nvidia.ko.xz
import_ns:      DMA_BUF
alias:          char-major-195-*
version:        560.35.03
supported:      external
license:        Dual MIT/GPL
firmware:       nvidia/560.35.03/gsp_tu10x.bin
firmware:       nvidia/560.35.03/gsp_ga10x.bin
softdep:        pre: ecdh_generic,ecdsa_generic
srcversion:     81EA935CF563DD66E593C4A
alias:          pci:v000010DEd*sv*sd*bc06sc80i00*
alias:          pci:v000010DEd*sv*sd*bc03sc02i00*
alias:          pci:v000010DEd*sv*sd*bc03sc00i00*
depends:        
retpoline:      Y
name:           nvidia
vermagic:       6.10.12-200.fc40.x86_64 SMP preempt mod_unload 
sig_id:         PKCS#7
signer:         DKMS module signing key
sig_key:        21:09:F6:92:30:D4:70:FD:C7:B0:66:E9:AC:39:CB:E8:F7:4C:50:B5
sig_hashalgo:   sha512

I am pretty sure the module is signed, but I think the issue is that the (recently updated) UEFI dbx blacklists the key.
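
One way to test the dbx theory is to check whether the certificate that signed the module is enrolled as a Machine Owner Key at all, since a signing key that was never enrolled would also be refused. A minimal sketch: signer_in_listing is a hypothetical helper, and /var/lib/dkms/mok.pub is only the default certificate location in recent DKMS releases, not something confirmed for this install.

```shell
#!/bin/sh
# Succeeds when the certificate listing on stdin has a "Subject:" line
# that mentions the given signer name (hypothetical helper).
signer_in_listing() {
    grep -F 'Subject:' | grep -qF -- "$1"
}

# On the affected machine (assumed paths, run manually):
#   signer=$(modinfo -F signer nvidia)
#   mokutil --list-enrolled | signer_in_listing "$signer" && echo enrolled
#   mokutil --test-key /var/lib/dkms/mok.pub      # reports whether it is enrolled
#   sudo mokutil --import /var/lib/dkms/mok.pub   # queue for enrollment, then reboot
```

If the signer does not show up among the enrolled keys, importing the certificate and confirming it on the next boot would be the first thing to try before blaming the dbx.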

https://rpmfusion.org/Howto/NVIDIA

https://rpmfusion.org/Howto/Secure%20Boot

The recommended way to install the nvidia drivers is through RPM Fusion rather than any other third-party route, and RPM Fusion has CUDA support too.

I am aware of this procedure, but when testing with CUDA Samples, the RPM fusion CUDA drivers do not work. The official drivers from NVIDIA are necessary, at least in my experience. I do some ML workflows and this is how it works for me currently.

I guess you need to create a MOK key for your current drivers and sign them with it.

I haven't installed the CUDA drivers on Fedora, but on openSUSE I had the nvidia repo installed and the modules signed for Secure Boot, and after installing the CUDA drivers from nvidia everything still works.

So it is just a matter of creating a MOK to sign the nvidia drivers.
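
The manual route could look roughly like the sketch below. The file names MOK.priv and MOK.der, the certificate subject, and the helper functions are placeholders, and the sign-file location assumes the kernel-devel package for the running kernel is installed.

```shell
#!/bin/sh
# Path of the kernel's sign-file helper for a given kernel release
# (location as shipped by kernel-devel on Fedora; an assumption here).
sign_file_path() {
    printf '/usr/src/kernels/%s/scripts/sign-file\n' "$1"
}

# Sign one uncompressed module: sign_module KEY CERT MODULE
sign_module() {
    "$(sign_file_path "$(uname -r)")" sha256 "$1" "$2" "$3"
}

# Manual steps (run as root, file names are placeholders):
#   openssl req -new -x509 -newkey rsa:2048 -nodes -days 3650 \
#       -subj "/CN=Local module signing key/" \
#       -keyout MOK.priv -outform DER -out MOK.der
#   mokutil --import MOK.der    # set a one-time password, reboot, confirm enrollment
#   sign_module MOK.priv MOK.der /path/to/module.ko
```

Modules shipped compressed (such as nvidia.ko.xz) would need to be decompressed before signing and compressed again afterwards, and the signing has to be repeated for every rebuilt module.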

Hi Orfeas, slightly off-topic from my side: after installing the NVIDIA drivers (note that I used the rpmfusion ones with the mokutil steps), I had to follow the separate steps for the CUDA subpackages

https://rpmfusion.org/Howto/CUDA

As for testing… I am still playing around with nvidia… I was able to create this simple openmp program to test the gpu

user@fedora:~$ sudo dnf install -y gcc gcc-offload-nvptx

user@fedora:~$ cat <<'EOF' > main.c
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
#define N 3000
/* static storage: three N x N matrices are too large for the stack */
double A[N][N], B[N][N], C[N][N];
int main(void){
        // fill A and B with pseudo-random values, zero C
        for(int i=0;i<N;i++)
                for(int j=0;j<N;j++){
                        A[i][j]=rand(); B[i][j]=rand(); C[i][j]=0;
                }

        // offload the matrix product C = A * B to the GPU
        #pragma omp target teams distribute parallel for
        for(int i=0;i<N;i++){
                for(int j=0;j<N;j++){
                        for(int p=0;p<N;p++){
                                C[i][j] = A[i][p] * B[p][j] + C[i][j];
                        }
                }
        }
        return 0;
}
EOF

user@fedora:~$ gcc main.c -fopenmp -foffload=nvptx-none

And I can clearly see the gpu running at 100%, although the performance is not quite what I expect… I guess openmp is an intermediary generic layer that maps structures at the cost of performance… As for CUDA… I am running this simple test

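user@fedora:~$ cat <<'EOF' > cudatest.c
#include <stdio.h>
#include <cuda.h>
int main(void){
    int version = 0;
    // query the CUDA version supported by the installed driver
    CUresult result = cuDriverGetVersion(&version);
    if (result != CUDA_SUCCESS) {
        fprintf(stderr, "cuDriverGetVersion failed: %d\n", (int)result);
        return 1;
    }
    printf("CUDA version %d\n", version);
    return 0;
}
EOF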

user@fedora:~$  nvcc cudatest.c -lcuda && ./a.out
CUDA version 12050

At this point I conclude that CUDA is working, although translating the OpenMP example above will probably require me to look more closely into cuda.h
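
For reference, the integer reported by cuDriverGetVersion encodes the version as 1000 * major + 10 * minor, so 12050 stands for CUDA 12.5. A small sketch of the decoding, where cuda_version_decode is a hypothetical helper name:

```shell
#!/bin/sh
# Decode the integer from cuDriverGetVersion: 1000 * major + 10 * minor.
cuda_version_decode() {
    printf '%d.%d\n' "$(( $1 / 1000 ))" "$(( ($1 % 1000) / 10 ))"
}

cuda_version_decode 12050   # prints 12.5
```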

Users have often reported problems when installing drivers from one of the cuda-fedora repos. Those repos seem to be provided directly from nvidia.

The recommendation, as has previously been noted, is always to install the nvidia drivers from rpmfusion.

The switch is fairly simple.

  1. remove the currently installed drivers
    sudo dnf remove \*nvidia\* --exclude nvidia-gpu-firmware
  2. disable the cuda fedora repos
    sudo dnf config-manager --set-disabled 'cuda-fedora*'
  3. install the nvidia drivers and cuda from rpmfusion
    sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda
  4. wait an adequate time for the driver modules to be locally compiled and installed, then reboot.

I have seen very few (none?) reports where any additional actions were required.

Activating secure boot with the rpmfusion drivers is also relatively simple.
Install the akmods package and reboot after step 2 and before step 3. Then follow the steps in the file /usr/share/doc/akmods/README.secureboot to import the key into the firmware (BIOS). Once that is done, install the driver as in step 3; it will be signed when it is built, so Secure Boot can stay active.
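
As a rough outline of what those README steps amount to (a sketch, not a substitute for the README): the key path is the one quoted later in this thread, kmodgenca comes with the akmods package, and sb_enabled is a hypothetical helper that only parses the output of mokutil --sb-state.

```shell
#!/bin/sh
# Succeeds when `mokutil --sb-state` output on stdin reports Secure Boot as enabled.
sb_enabled() {
    grep -q '^SecureBoot enabled'
}

# Sequence sketched from the description above (run manually, as root):
#   dnf install akmods mokutil openssl
#   kmodgenca -a                                           # generate the local signing key pair
#   mokutil --import /etc/pki/akmods/certs/public_key.der  # choose a one-time password
#   systemctl reboot                                       # confirm the enrollment in the MOK manager screen
#   mokutil --sb-state | sb_enabled && echo "Secure Boot is on"
```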

As mentioned here, certain tools provided by the nvidia cuda repo are not in rpm-fusion repo, especially for AI workflows. As mentioned here, this will be remedied in Fedora 41.

Sorry, I don’t see anything in there about specific versions of CUDA tools being included in RPMFusion? The only reference to Nvidia drivers (aside from the Miracle spin) was to the ability of GNOME Software to trigger the Secure Boot MOK process steps that enable a graphical installation, but that would still be of the RPMFusion-managed Nvidia drivers?

This is not the solution to the stated problem, whatever the reporter thought.

The RPM Fusion packaged nvidia driver works properly with the nvidia packaged CUDA toolkit. It is better to use the RPM Fusion packaged driver there because of its better Fedora kernel support.

Please follow the post from Roman Gherta pointing at the correct Documentation.

As I mentioned above, the RPM Fusion CUDA package is missing a lot of functionality, which can be seen from the sheer difference in size. I will keep using the NVIDIA repo since it is necessary for ML workflows.

There is no such RPM Fusion CUDA package. We only provide the CUDA-side “driver” that is part of the NVIDIA driver, and we aim to use the CUDA toolkit unmodified from the nvidia repository. We just disable the nvidia-driver provided in that repository so that our packaged nvidia driver is used instead.

So as long as you are using the CUDA toolkit, there cannot be a feature difference (they are the same packages).

That holds especially for ML workloads.

You are also mixing things up a bit, as Fedora 41 will not remove the need for the NVIDIA CUDA repository alongside the nvidia driver.

Please only quote the official RPM Fusion CUDA documentation, as I’m not going to fix the Internet:
https://rpmfusion.org/Howto/CUDA

I am aware of the rpmfusion repo and have replaced the nvidia driver because the last update to the new kernel version had an issue. There is some weird behavior between dnf and dnf5 because dnf blacklists the nvidia packages but dnf5 does not. But this is a different issue.

Part of your original question was how to sign the Nvidia modules.

I don’t have an Nvidia card in the laptop I use with Secure Boot, but I use VirtualBox a lot and that makes me sign modules with every kernel update.

Aside from following the instructions in the README.md provided by @computersavvy, you may also want to check out a script I wrote to make the process a bit easier

If you use virtualbox installed directly from oracle then manually signing the modules may be required.

If you were to remove virtualbox that was installed from oracle and install virtualbox from the rpmfusion repo then the driver module would be rebuilt and signed by akmods with every kernel upgrade.

# dnf install VirtualBox
Last metadata expiration check: 1:26:09 ago on Tue 15 Oct 2024 11:05:43 AM CDT.
Dependencies resolved.
====================================================================================================================================
 Package                           Architecture          Version                        Repository                             Size
====================================================================================================================================
Installing:
 VirtualBox                        x86_64                7.1.2-1.fc40                   rpmfusion-free-updates                 25 M
Installing dependencies:
 VirtualBox-kmodsrc                noarch                7.1.2-1.fc40                   rpmfusion-free-updates                939 k
 VirtualBox-server                 x86_64                7.1.2-1.fc40                   rpmfusion-free-updates                 21 M
 akmod-VirtualBox                  x86_64                7.1.2-1.fc40                   rpmfusion-free-updates                 25 k
 liblzf                            x86_64                3.6-28.fc40                    fedora                                 28 k

Transaction Summary
====================================================================================================================================
Install  5 Packages

Somehow it doesn’t work that way: I have virtualbox installed from RPM Fusion and I have to manually sign the modules every time (Fedora 40, still to see if that works with 41)

If there’s something I was supposed to manually do for the modules to be automatically built and signed, like generate and store the MOKs following a somehow specific procedure, there’s no indication or documentation about it whatsoever.

Please explain what you mean here.
I use both VirtualBox and nvidia drivers from rpmfusion and when a kernel is upgraded the modules are automatically signed for secure boot.

$ lsmod | grep box
vboxnetadp             32768  0
vboxnetflt             40960  0
vboxdrv               704512  2 vboxnetadp,vboxnetflt


$ modinfo vboxdrv
filename:       /lib/modules/6.11.3-200.fc40.x86_64/extra/VirtualBox/vboxdrv.ko.xz
version:        7.1.2_rpmfusion r164945 (0x00340001)
license:        GPL
description:    Oracle VirtualBox Support Driver
author:         Oracle and/or its affiliates
srcversion:     1D640EFCE74255CFEC18A1C
depends:        
retpoline:      Y
name:           vboxdrv
vermagic:       6.11.3-200.fc40.x86_64 SMP preempt mod_unload 
sig_id:         PKCS#7
signer:         fedora-44340853
sig_key:        62:E1:18:06:82:79:7A:DA:D5:16:C0:F2:2D:D1:FF:3F:70:CF:A5:B0
sig_hashalgo:   sha256
---trimmed----

Yes, I understood that. It’s not my case: as I said, I have to sign them manually every time.

If I don’t, the virtual machine won’t boot because the kernel won’t load the unsigned module, as you can imagine.

My point here is that something is missing in my system in a very obvious way: I am not saying that what you say is wrong, or false, or whatever. If you want to make this discussion productive, please point me in the right direction as it seems obvious to me that I need to do something myself in order to enable that.

Edit: next time I get a new Kernel I will try to boot a VM without explicitly signing, just in case this is somehow new within the last few updates of rpmfusion or fedora 40, anyway.

Just to demonstrate the situation: I uninstalled all VirtualBox components and re-installed them. The vboxdrv module from RPM Fusion is not inserted:

gvisoc@vao:~$ systemctl --failed
  UNIT            LOAD   ACTIVE SUB    DESCRIPTION                    
● vboxdrv.service loaded failed failed Linux kernel module init script

Legend: LOAD   → Reflects whether the unit definition was properly loaded.
        ACTIVE → The high-level unit activation state, i.e. generalization of SUB.
        SUB    → The low-level unit activation state, values depend on unit type.

1 loaded units listed.
gvisoc@vao:~$ systemctl status vboxdrv
× vboxdrv.service - Linux kernel module init script
     Loaded: loaded (/usr/lib/systemd/system/vboxdrv.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: failed (Result: exit-code) since Wed 2024-10-16 11:01:13 AEDT; 40s ago
 Invocation: af81ae9626294a169f7e4e50e80ff052
    Process: 1280 ExecStart=/sbin/modprobe vboxdrv (code=exited, status=1/FAILURE)
   Main PID: 1280 (code=exited, status=1/FAILURE)
   Mem peak: 3.3M
        CPU: 16ms

Oct 16 11:01:13 vao systemd[1]: Starting vboxdrv.service - Linux kernel module init script...
Oct 16 11:01:13 vao modprobe[1280]: modprobe: ERROR: could not insert 'vboxdrv': Key was rejected by service
Oct 16 11:01:13 vao systemd[1]: vboxdrv.service: Main process exited, code=exited, status=1/FAILURE
Oct 16 11:01:13 vao systemd[1]: vboxdrv.service: Failed with result 'exit-code'.
Oct 16 11:01:13 vao systemd[1]: Failed to start vboxdrv.service - Linux kernel module init script.

EDIT: The most relevant line of the output is “modprobe: ERROR: could not insert ‘vboxdrv’: Key was rejected by service”. I understand that this is where something is missing: a signing key, a program relevant to the trust chain, an undeclared dependency, or something of the like.

I am happy to take this offline if you can help me find a solution, as I find absolutely no joy in signing modules myself.

Yeah, right. So apparently this all happens automatically once I manually enroll the key provided by the akmods package on my system. It is not Fedora’s already existing and enrolled key, but an additional one.

That was the missing part, and after enrolling the key the module doesn’t get rejected.

sudo mokutil --import /etc/pki/akmods/certs/public_key.der

What is puzzling is that my module doesn’t have the same signer as yours, for whatever reason:

gvisoc@vao:~$ modinfo vboxdrv
filename:       /lib/modules/6.11.3-300.fc41.x86_64/extra/VirtualBox/vboxdrv.ko.xz
version:        7.1.2_rpmfusion r164945 (0x00340001)
license:        GPL
description:    Oracle VirtualBox Support Driver
author:         Oracle and/or its affiliates
srcversion:     1D640EFCE74255CFEC18A1C
depends:        
retpoline:      Y
name:           vboxdrv
vermagic:       6.11.3-300.fc41.x86_64 SMP preempt mod_unload 
sig_id:         PKCS#7
signer:         vao-3503722010

PS – “vao” is my hostname.
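
A plausible explanation, judging only from the two signer names in this thread (fedora-44340853 and vao-3503722010), is that each akmods installation generates its own key whose name starts with the local hostname, so signers would differ between machines by design. A sketch for checking this locally; signer_host is a hypothetical helper and the name format is an assumption:

```shell
#!/bin/sh
# Strip the trailing "-<number>" from an akmods-style signer name
# (hypothetical helper; the name format is inferred from this thread).
signer_host() {
    printf '%s\n' "${1%-*}"
}

signer_host vao-3503722010    # prints vao

# On a real system, compare the module's signer with the local certificate:
#   modinfo -F signer vboxdrv
#   openssl x509 -inform DER -in /etc/pki/akmods/certs/public_key.der -noout -subject
```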
