aviadb99
(Aviad Ben)
October 18, 2021, 7:39am
Hi,
nvidia-smi gives an error:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
I’ve read that Secure Boot and the signed kernel might be the problem.
How can I check whether Secure Boot is enabled? And how do I install an unsigned kernel to work around the NVIDIA driver problem?
Here is the dmesg output, grepping for “secure”, followed by the mokutil state:
$ dmesg | grep -i secure
[ 0.000000] secureboot: Secure boot disabled
[ 0.006788] secureboot: Secure boot disabled
[ 0.706242] integrity: Loaded X.509 cert 'Fedora Secure Boot CA: fde32599c2d61db1bf5807335d7b20e4cd963b42'
$ mokutil --sb-state
SecureBoot disabled
Platform is in Setup Mode
Thanks
Are the nvidia drivers installed and loaded?
Running “lsmod | grep nvidia” should give something like this:
# lsmod | grep nvidia
nvidia_drm 69632 5
nvidia_modeset 1200128 7 nvidia_drm
nvidia_uvm 1175552 2
nvidia 35332096 512 nvidia_uvm,nvidia_modeset
drm_kms_helper 303104 1 nvidia_drm
drm 630784 9 drm_kms_helper,nvidia,nvidia_drm
This does not appear to be related to Secure Boot, since your output shows that Secure Boot is disabled. If it were enabled, the current signed kernels would refuse to load the nvidia modules, because those modules are unsigned.
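If it helps, the two checks so far (Secure Boot state and whether the nvidia module is loaded) can be wrapped in a small script. This is only a rough sketch: the function names sb_state and nvidia_loaded are made up for illustration, and it assumes the output formats shown earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helpers; they only parse text piped in on stdin.

# Classify `mokutil --sb-state` output as enabled, disabled, or unknown.
sb_state() {
    awk '
        /SecureBoot enabled/  { s = "enabled" }
        /SecureBoot disabled/ { s = "disabled" }
        END { print (s == "" ? "unknown" : s) }
    '
}

# Succeed only if the `lsmod` listing contains a module named exactly "nvidia".
nvidia_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }'
}

# Usage on a real system (needs mokutil and lsmod installed):
#   mokutil --sb-state | sb_state
#   lsmod | nvidia_loaded && echo "nvidia module is loaded"
```

If sb_state prints "disabled" and nvidia_loaded fails, the driver is most likely not installed or not built for the running kernel, rather than blocked by Secure Boot.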
This is what I see in dmesg for nvidia; you should see something similar.
# dmesg | grep -i nvidia
[ 0.000000] Command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.9-200.fc34.x86_64 root=/dev/mapper/fedora-root ro rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 rd.lvm.lv=fedora/root rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[ 0.000000] Kernel command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.9-200.fc34.x86_64 root=/dev/mapper/fedora-root ro rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 rd.lvm.lv=fedora/root rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
[ 7.373694] nvidia: loading out-of-tree module taints kernel.
[ 7.373703] nvidia: module license 'NVIDIA' taints kernel.
[ 7.388279] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 7.409944] nvidia-nvlink: Nvlink Core is being initialized, major device number 236
[ 7.410222] nvidia 0000:08:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
Note that my kernel is the stock one from the Fedora repo, and I have no problems as long as Secure Boot is disabled.
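To compare your system against the dmesg output above, something like the following might be useful. It is a sketch under assumptions: cmdline_has and nvidia_driver_msgs are hypothetical helper names, and the parameters checked are simply the ones that appear on my kernel command line, not a definitive list of what is required.

```shell
#!/bin/sh
# Hypothetical helpers; both read text from stdin.

# Succeed if the kernel command line contains the given parameter as a whole word.
cmdline_has() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Print dmesg lines that mention nvidia, skipping the echoed kernel command line.
nvidia_driver_msgs() {
    grep -i nvidia | grep -vi 'command line:'
}

# Usage on a real system:
#   cmdline_has modprobe.blacklist=nouveau < /proc/cmdline && echo "nouveau is blacklisted"
#   cmdline_has nvidia-drm.modeset=1 < /proc/cmdline && echo "modeset is on"
#   dmesg | nvidia_driver_msgs
```

If nvidia_driver_msgs prints nothing at all, the module was never loaded during boot, which matches an empty result from lsmod.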