I have been running Fedora for around 7 years, and this is the first time I have run into problems with my Nvidia graphics card.
For many years I was using the proprietary Nvidia driver packages without problems. I switched to the RPM Fusion drivers around 6 months ago which also ran without problems.
Now, after the last kernel update, my Fedora no longer booted. At first I thought the Grub config was damaged and tried to repair it, but that changed nothing. I wasn’t even able to boot into runlevel 3 anymore.
I thought I messed up something and as I have all my files backed up, I reinstalled Fedora from my live USB stick. It was running Wayland and the nouveau driver. So, I did what I also did before:
I activated the RPM Fusion repos and installed the Nvidia driver packages as described on the RPM Fusion page (Howto/NVIDIA - RPM Fusion). On reboot, my laptop got stuck on a black screen (which I had seen before when using the proprietary Nvidia driver packages, where I occasionally had to re-run the installer). So, again, I thought I had messed something up, reinstalled Fedora, and repeated the same steps to check for errors. But I still got stuck on the black screen after booting.
So I tried to install the proprietary Nvidia driver package as I’ve done before. Same problem in the end.
What I have found out now is that the Nvidia kernel modules apparently weren’t built when installing the Nvidia packages (akmod-nvidia).
Is it possible that the current kernel 5.13.9-200 doesn’t support the available Nvidia driver packages? Whenever a new kernel version was released, the Nvidia drivers were usually compiled on my system, which I suppose isn’t happening anymore.
I am running an Nvidia Quadro M2000M graphics card in my laptop.
On my system running the same kernel and using the rpmfusion packages the nvidia drivers are built properly.
You have installed the nvidia driver from nvidia which does not automatically get updated and MUST be reinstalled or manually rebuilt with each kernel update. The reinstall handles the rebuild of the driver but requires manual attention.
This is one of the advantages of using the drivers from rpmfusion. The akmod-nvidia package and its supporting dependencies take care of rebuilding the kmod-nvidia package for you with each kernel or driver update so there is no hassle with needing to do it manually.
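A rough way to see whether that automatic rebuild actually produced a module for a given kernel is sketched below. The helper name nvidia_module_built is made up for illustration, and the module directory is an assumption about where akmods places its output; adjust it if your system differs.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an nvidia kernel module file exists for the
# given kernel version.
# Assumption: the built module lives under
#   <root>/lib/modules/<kernel-version>/extra/nvidia/
nvidia_module_built() {
    kver="$1"
    root="${2:-}"   # optional alternate root directory, mainly for testing
    dir="$root/lib/modules/$kver/extra/nvidia"
    # Accept compressed or uncompressed files (nvidia.ko, nvidia.ko.xz, ...)
    for f in "$dir"/nvidia.ko*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Usage on the affected machine:
#   nvidia_module_built "$(uname -r)" && echo "module present" || echo "module missing"
# If it is missing, a rebuild can usually be forced with:
#   sudo akmods --force
#   sudo dracut --force
```

If the module is reported missing right after installing akmod-nvidia, it may simply still be building, so it is worth waiting a few minutes before rebooting.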
After waiting a few minutes, I rebooted the laptop and again I am running into a black screen.
I haven’t installed anything else on this computer. Also, I ran two hardware checks, yesterday and today and both said the graphics card is completely fine without any problems.
I have absolutely no idea what the problem is right now. Also, if the card didn’t work, then it obviously shouldn’t run with the nouveau driver that is installed during the initial Fedora setup either, should it?
Maybe it is related to exactly what got installed. Can you do a “Ctrl-Alt-F3” and get to the text screen?
If you can do that, then please post the output of the command
“sudo dnf list installed '*nvidia*'”. It can be redirected to a file, and the file can then be used to send the output.
I just checked if I can start the alternate kernel version 5.11.12-300 from the Grub boot menu and the splash screen says “Nvidia kernel module is missing. Falling back to nouveau”.
That’s one step further than booting with the 5.13.9-200 kernel.
Do you have the kernel-devel and kernel-headers packages?
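The reason this matters is that akmods needs a kernel-devel package matching the kernel the module is built for. A minimal sketch of that comparison follows; devel_matches_kernel is a hypothetical helper name, and it assumes package names of the form kernel-devel-<version>, as printed by “rpm -q kernel-devel”.

```shell
#!/bin/sh
# Hypothetical helper: succeed if one of the listed kernel-devel packages
# matches the given kernel version exactly.
devel_matches_kernel() {
    running="$1"          # e.g. the output of: uname -r
    shift
    for pkg in "$@"; do   # e.g. the output of: rpm -q kernel-devel
        [ "$pkg" = "kernel-devel-$running" ] && return 0
    done
    return 1
}

# Usage on the affected machine:
#   devel_matches_kernel "$(uname -r)" $(rpm -q kernel-devel) \
#       && echo "matching kernel-devel installed" \
#       || echo "no kernel-devel for the running kernel"
```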
Also I note you are missing several packages from my list. (I do have more than the minimum though.)
Specifically nvidia-persistenced nvidia-modprobe nvidia-xconfig xorg-x11-drv-nvidia-cuda xorg-x11-drv-nvidia-cuda-libs xorg-x11-drv-nvidia-devel
I also note that the kmod-nvidia package was actually built, but apparently the module is not getting loaded.
Try installing the other packages and see what the results are. All are available from rpmfusion so they should install with just a "dnf install … " command
One more thing to look at, when booting select the 5.13.9 kernel and press “e” for edit, then look at what is on the line beginning with “linux”. It is possible that did not get properly reconfigured and it should look something like this
linux "rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1 rd.lvm.lv=fedora/root rhgb quiet"
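The three driver-related parameters from that line can also be checked against a command line string with a small script. This is only a sketch: check_cmdline is a made-up helper name, and the parameter list is taken from the example line above.

```shell
#!/bin/sh
# Hypothetical helper: print each expected parameter that is absent from the
# given kernel command line; return non-zero if any is missing.
check_cmdline() {
    cmdline="$1"
    missing=0
    for want in rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1; do
        # Pad with spaces so only whole parameters match
        case " $cmdline " in
            *" $want "*) ;;
            *) echo "missing: $want"; missing=1 ;;
        esac
    done
    return "$missing"
}

# Usage on a booted system:
#   check_cmdline "$(cat /proc/cmdline)"
```

Running it from a kernel that boots shows what that kernel was actually started with, which can differ from what is stored in the grub configuration.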
So, I checked the Grub menu and for the kernel 5.13.9-200 I determined the following:
With the Nvidia driver settings I can’t even boot into runlevel 3. As the display drivers are not loaded in runlevel 3, I suppose there is some kernel problem in the earlier steps, when the Nvidia drivers are (pre)loaded?
But when I remove the rd.driver.blacklist=nouveau and the modprobe.blacklist=nouveau options from the Grub boot menu, I am able to boot into runlevel 3 without problems.
With what you just said, I suggest that from run level 3 after you install the extra packages I suggested above you should try a reinstall of the kernel to force everything to rebuild.
“sudo dnf reinstall kernel*5.13.9”
Hopefully that, together with the extra packages, will fix the issue of the video drivers not loading.
I booted into runlevel 3 and installed all of the packages mentioned above. I also reinstalled the kernel*5.13.9* packages, which reinstalled:
kernel
kernel-core
kernel-devel
kernel-modules
kernel-modules-extra
kernel-headers is already installed.
Also, the grub bootloader line looks like it should, blacklisting the nouveau drivers. And still I am running into the black screen when booting the kernel with the Nvidia drivers.
try, when booted to run level 3,
“cat /etc/default/grub” and post the content.
Also post the output of
“lsmod | grep -E 'nvidia|nouveau'”
and
“uname -a”
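To read the lsmod result at a glance, the output can be reduced to a single word. The sketch below is only an illustration; active_gpu_driver is a hypothetical helper name, and it looks only at the module name in the first column.

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style output on stdin and print which GPU
# kernel module is loaded: nvidia, nouveau, both, or none.
active_gpu_driver() {
    awk '
        $1 == "nvidia"  { nvidia = 1 }
        $1 == "nouveau" { nouveau = 1 }
        END {
            if (nvidia && nouveau) print "both"
            else if (nvidia)       print "nvidia"
            else if (nouveau)      print "nouveau"
            else                   print "none"
        }'
}

# Usage on the affected machine:
#   lsmod | active_gpu_driver
```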
Edit:
One thing I had not considered since you have been using nvidia in the past was “secure boot”
Please make certain that is disabled then try again.
Also, you can watch the progress of the boot and see where it ends by removing “rhgb quiet” from the end of the linux command line in grub so it displays the progress as text.
Try stepping through the latter part of that post, one thing at a time. There are comments about several different things, and one is all it takes if you get the correct one.
Also, would you please post the output of “inxi -Gxx” so we can see the exact info that it provides about the GPU(s). You may need to install inxi.
I already removed rhgb quiet from the grub config line but the output at the boot sequence is so fast, that it’s impossible to read anything. And at the end the output completely disappears.
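When the boot text scrolls by too quickly, the kernel messages of the failed boot can usually be read afterwards from a kernel that still starts, assuming the journal is kept across reboots. The following is a sketch under that assumption; gpu_boot_lines is a hypothetical helper that just filters for driver-related lines.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines on stdin that mention the nvidia
# or nouveau drivers (case-insensitive).
gpu_boot_lines() {
    grep -iE 'nvidia|nouveau'
}

# Usage after a failed boot, from a kernel that still starts
# (-k: kernel messages, -b -1: the previous boot):
#   journalctl -k -b -1 --no-pager | gpu_boot_lines
```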
since it should only have the entries in there once. Once you reduce it to the given then you will need to run “grub2-mkconfig -o /etc/grub2-efi.cfg” to recreate the grub.cfg file and eliminate grub having multiple entries of the same thing on the command line.
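Removing the repeated entries by hand is easy to get wrong, so here is a small sketch that deduplicates a parameter string while keeping the original order. dedupe_args is a made-up helper name, and it assumes no parameter value contains spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the given kernel parameters with duplicates
# removed, keeping the first occurrence of each and the original order.
# Assumption: parameters are separated by whitespace and contain none themselves.
dedupe_args() {
    printf '%s' "$1" | tr -s ' \t' '\n' | awk 'NF && !seen[$0]++' | paste -sd ' ' -
}

# Usage: paste the current value of GRUB_CMDLINE_LINUX as the argument, e.g.
#   dedupe_args "rhgb quiet rd.driver.blacklist=nouveau rhgb quiet"
# then put the result back into /etc/default/grub and regenerate grub.cfg.
```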
Inxi clearly shows that the nouveau driver is loaded, so I am sure that the lsmod command above will only return nouveau results.
Apparently your laptop only has the one GPU, unless for some reason the system is not seeing the IGP. If it is supposed to have two like most of the newer ones do then maybe post the results of “lspci -nnk” so we can see if there actually are 2 GPUs.
Also, did you by chance edit /etc/gdm/custom.conf as suggested and make sure that “#WaylandEnable=false” was changed to “WaylandEnable=false”? I doubt that is needed (mine is not) but for some it seems so.
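For reference, that edit can be scripted as sketched below. disable_wayland_in_gdm is a hypothetical helper name; it keeps a .bak copy next to the file and only touches a commented-out “WaylandEnable=false” line.

```shell
#!/bin/sh
# Hypothetical helper: uncomment "WaylandEnable=false" in a GDM config file.
# A backup is written to <file>.bak first; the original file is then
# overwritten in place so its ownership and permissions are kept.
disable_wayland_in_gdm() {
    conf="${1:-/etc/gdm/custom.conf}"
    cp "$conf" "$conf.bak" &&
        sed 's/^[[:space:]]*#[[:space:]]*WaylandEnable=false/WaylandEnable=false/' \
            "$conf.bak" > "$conf"
}

# Usage (needs root for the real file; takes effect once GDM is restarted):
#   sudo sh -c '. ./this-script.sh; disable_wayland_in_gdm'
```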
nouveau 2400256 1
drm_ttm_helper 16384 1 nouveau
ttm 77824 2 drm_ttm_helper,nouveau
i2c_algo_bit 16384 1 nouveau
drm_kms_helper 290816 1 nouveau
mxm_wmi 16384 1 nouveau
drm 630784 5 drm_kms_helper,drm_ttm_helper,ttm,nouveau
wmi 36864 7 intel_wmi_thunderbolt,dell_wmi,wmi_bmof,dell_smbios,dell_wmi_descriptor,mxm_wmi,nouveau
video 57344 3 dell_wmi,dell_laptop,nouveau
Yep, I edited the grub file and recreated the config file. Looks good now as it should.
Here’s the output of the lspci command:
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:1910] (rev 07)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: skl_uncore
00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
Kernel driver in use: pcieport
00:04.0 Signal processing controller [1180]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem [8086:1903] (rev 07)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: proc_thermal
Kernel modules: processor_thermal_device
00:14.0 USB controller [0c03]: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [8086:a12f] (rev 31)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: xhci_hcd
00:14.2 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem [8086:a131] (rev 31)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: intel_pch_thermal
Kernel modules: intel_pch_thermal
00:16.0 Communication controller [0780]: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 [8086:a13a] (rev 31)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: mei_me
Kernel modules: mei_me
00:17.0 SATA controller [0106]: Intel Corporation Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] [8086:a102] (rev 31)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: ahci
00:1c.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #2 [8086:a111] (rev f1)
Kernel driver in use: pcieport
00:1c.2 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #3 [8086:a112] (rev f1)
Kernel driver in use: pcieport
00:1c.4 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #5 [8086:a114] (rev f1)
Kernel driver in use: pcieport
00:1d.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #9 [8086:a118] (rev f1)
Kernel driver in use: pcieport
00:1f.0 ISA bridge [0601]: Intel Corporation CM236 Chipset LPC/eSPI Controller [8086:a150] (rev 31)
Subsystem: Dell Device [1028:06d9]
00:1f.2 Memory controller [0580]: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller [8086:a121] (rev 31)
Subsystem: Dell Device [1028:06d9]
00:1f.3 Audio device [0403]: Intel Corporation 100 Series/C230 Series Chipset Family HD Audio Controller [8086:a170] (rev 31)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
00:1f.4 SMBus [0c05]: Intel Corporation 100 Series/C230 Series Chipset Family SMBus [8086:a123] (rev 31)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: i801_smbus
Kernel modules: i2c_i801
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-LM [8086:15b7] (rev 31)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: e1000e
Kernel modules: e1000e
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GLM [Quadro M2000M] [10de:13b0] (rev a2)
Subsystem: Dell Device [1028:16d9]
Kernel driver in use: nouveau
Kernel modules: nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)
Subsystem: Dell Device [1028:16d9]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
02:00.0 Network controller [0280]: Intel Corporation Wireless 8260 [8086:24f3] (rev 3a)
Subsystem: Intel Corporation Device [8086:0050]
Kernel driver in use: iwlwifi
Kernel modules: iwlwifi
03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader [10ec:525a] (rev 01)
Subsystem: Dell Device [1028:06d9]
Kernel driver in use: rtsx_pci
Kernel modules: rtsx_pci
3d:00.0 Non-Volatile memory controller [0108]: Toshiba Corporation XG4 NVMe SSD Controller [1179:0115] (rev 01)
Subsystem: Toshiba Corporation Device [1179:0001]
Kernel driver in use: nvme
Kernel modules: nvme
Yes, I already activated this in the /etc/gdm/custom.conf file, but it made no difference. Also, with the older kernel versions I never had to activate this setting.
Afaik there is only the Nvidia Quadro GPU. I’ve never seen any system information indicating that there is an IGP in my system. When the Nvidia drivers refused to re-compile (which happened two or three times over the last few years), the system info showed that the graphics were running on LLVM Pipe, i.e. in software rendering mode.
It seems you have tried almost everything, so let’s approach this differently if you are willing.
Install Fedora fresh.
Do a full “dnf upgrade” and reboot to ensure it is booting properly and using the video correctly.
Enable and install the nvidia drivers using the copr link above.
Reboot.
Hopefully this sequence will enable the drivers with no further problems. If this fails then I really have no further suggestions since this sequence is intended to do the install and get the video working properly without interference from any other software installation attempts.
From the copr link, all I need to run after installation is:
sudo nvautoinstall --rpmadd
and
sudo nvautoinstall --driver
right?
My GPU seems to be working: it is recognized by the system, as seen in the output of the lspci command, and it also seems to work with the nouveau driver. Also, the hardware test showed absolutely no problems, and after all, hardware damage to the GPU would be obvious.