Nvidia boot loop with rd.driver.blacklist=nvidia

Hello,

I always thought that rd.driver.blacklist=nvidia as a kernel parameter in GRUB would prevent the nvidia kernel modules from loading. However, I just learnt that this is not the case.

I’m struggling with a ‘blank screen’ while booting into f36. Here’s an extract of journalctl --no-hostname -k -b

Sep 03 16:58:56 kernel: microcode: microcode updated early to revision 0xf0, date = 2021-11-15
Sep 03 16:58:56 kernel: Linux version 5.18.19-200.fc36.x86_64 (mockbuild@bkernel01.iad2.fedoraproject.org) (gcc (GCC) 12.1.1 20220507 (Red Hat 12.1.1-1), GNU ld version 2.37-27.fc36) #1 SMP PREEMPT_DYNAMIC Sun Aug 21 15:52:59 UTC 2022
Sep 03 16:58:56 kernel: Command line: BOOT_IMAGE=(hd1,gpt6)/vmlinuz-5.18.19-200.fc36.x86_64 root=/dev/mapper/fedora-root ro resume=/dev/mapper/fedora-00 rd.lvm.lv=fedora/root rd.luks.uuid=luks-8cfdfb51-cdfa-401a-9815-d3be9a527942 rd.lvm.lv=fedora/00 rd.driver.blacklist=nvidia
Sep 03 16:58:56 kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
...
Sep 03 14:59:23 kernel: xfs filesystem being mounted at /mnt/opt supports timestamps until 2038 (0x7fffffff)
Sep 03 14:59:26 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 510
Sep 03 14:59:26 kernel: NVRM: The NVIDIA probe routine was not called for 1 device(s).
Sep 03 14:59:26 kernel: NVRM: This can occur when a driver such as: 
                        NVRM: nouveau, rivafb, nvidiafb or rivatv 
                        NVRM: was loaded and obtained ownership of the NVIDIA device(s).
Sep 03 14:59:26 kernel: NVRM: Try unloading the conflicting kernel module (and/or
                        NVRM: reconfigure your kernel without the conflicting
                        NVRM: driver(s)), then try loading the NVIDIA kernel module
                        NVRM: again.
Sep 03 14:59:26 kernel: NVRM: No NVIDIA devices probed.

The messages beginning with kernel: nvidia-nvlink: are repeated over and over. The screen stays blank, with no login prompt or anything else.

I do not fully understand why the kernel is trying nvidia stuff in the first place. Any suggestions how to avoid the problem?

Kind regards,

aanno

I cannot tell from that log snippet what you are talking about.
Since you selected a short snippet of the log it is impossible for us to have any idea of what was or was not successfully done during boot.

  1. There appears to be an nvidia GPU, so the kernel will always try to load a driver module for it. The two candidate modules are nouveau (open source) and nvidia (proprietary). Even if the nvidia drivers are installed and the nouveau driver is blacklisted, the kernel will fall back to the nouveau module if it cannot load the nvidia drivers.
  2. The only effect rd.driver.blacklist=nvidia might have is to prevent the kernel from loading the proprietary nvidia modules, if you have them installed.
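
To see which blacklist options are really on the kernel command line, something like the following might help. It is only a sketch: list_blacklists is a made-up helper name, not an existing tool. It also rests on an assumption worth checking against the dracut documentation, namely that rd.driver.blacklist is a dracut option and so is expected to act inside the initramfs only, while modprobe.blacklist is the option usually paired with it for the rest of the boot.

```shell
#!/bin/sh
# Sketch: list driver-blacklist options found on a kernel command line.
# list_blacklists is a hypothetical helper name, not an existing tool.
list_blacklists() {
    # Use the argument if one is given, otherwise the running kernel's command line.
    cmdline=${1:-$(cat /proc/cmdline)}
    # Put each option on its own line and keep only the blacklist-style ones.
    printf '%s\n' "$cmdline" | tr ' ' '\n' \
        | grep -E '^(rd\.driver\.blacklist|modprobe\.blacklist)=' \
        || echo "(no blacklist options found)"
}
```

Called without an argument, list_blacklists inspects the currently booted system. Called with a string, for example the "Command line:" text copied from the journal, it inspects that instead.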

When you have the black screen there is usually a way to bypass it, often with Ctrl-Alt-F3 or similar to get a text console.

We need a much larger segment of the output of journalctl -b -0 and/or dmesg so we can see what the system is reporting during boot. It would also help if you posted the output of inxi -Fzxx.

Finally, it would help to know which driver module is actually loaded. The output of lsmod | grep -iE 'nouveau|nvidia' will show that.
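
The commands requested above can be gathered into one file for posting. This is a rough sketch, not a standard tool: the function name collect_gpu_diagnostics and the default file name gpu-diagnostics.txt are invented here, and tools that are not installed are skipped with a note.

```shell
#!/bin/sh
# Sketch: gather the requested boot diagnostics into one file for posting.
# collect_gpu_diagnostics and the default file name are hypothetical.
collect_gpu_diagnostics() {
    out=${1:-gpu-diagnostics.txt}
    {
        echo "== kernel command line =="
        cat /proc/cmdline 2>/dev/null || echo "(not readable)"

        echo "== journalctl --no-hostname -k -b -0 =="
        if command -v journalctl >/dev/null 2>&1; then
            journalctl --no-hostname -k -b -0 --no-pager 2>&1 || echo "(journalctl failed)"
        else
            echo "(journalctl not installed)"
        fi

        echo "== inxi -Fzxx =="
        if command -v inxi >/dev/null 2>&1; then
            inxi -Fzxx 2>&1 || echo "(inxi failed)"
        else
            echo "(inxi not installed)"
        fi

        echo "== lsmod | grep -iE 'nouveau|nvidia' =="
        if command -v lsmod >/dev/null 2>&1; then
            lsmod 2>/dev/null | grep -iE 'nouveau|nvidia' \
                || echo "(no nouveau or nvidia module loaded)"
        else
            echo "(lsmod not installed)"
        fi
    } >"$out"
}
```

Run it from the text console, for example as collect_gpu_diagnostics /tmp/gpu.txt. Note that -b -0 refers to the current boot. If the log of the failed boot is wanted after rebooting into a working system, -b -1 selects the previous boot.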

Sorry for replying late. My post is probably invalid. I encountered problems with nvidia (and cuda) on f36. I posted a solution in the thread "Bug report on nvidia driver 515.65.01 for fedora 36 (kernel 5.18.19, RTX 2060 Rev. 1)" (post #8 by aanno, Linux category, NVIDIA Developer Forums), which seems to be the right place for problems with the proprietary nvidia kernel modules.