Help me stop from moving to Windows... (F38 black screen issues)

I recently left the Mac-O-Sphere and built a PC with the intention of going to Fedora full time. I believe I am having an issue with the nvidia drivers. If this can’t get solved I will have to use Windows full time, as I need to be reliably operational for work.

The problem: about 30 seconds after a full boot lands at the Fedora desktop, the screen goes black. The only way to recover is to power off the machine. This happens on the first boot, sometimes the second, then it doesn’t happen again (until I power off, and then the cycle begins again).

It happens on both Wayland and X, and I have tried adding “nvidia-drm.modeset=1” to the kernel command line in grub; it makes no difference either way.

I installed the nvidia driver following the rpmfusion guide here.
sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda

Quite certain my hardware is fine as I have run the stress tests and also have Windows 10 Pro on a second drive that functions properly. I have re-installed F38 twice, same results. Possible the drivers don’t yet support the hardware?

I could really use some help…

Hardware:

  1. Ryzen 9 7950X3D
  2. ASRock X670E Taichi
  3. Gigabyte GeForce RTX 4090 OC 24 GB
  4. G.Skill Trident Z5 Neo RGB 64 GB (2 x 32 GB) DDR5-6000 CL32
$ sudo cat /proc/cmdline
BOOT_IMAGE=(hd1,gpt2)/vmlinuz-6.3.7-200.fc38.x86_64 root=UUID=fc95995f-ed0f-450b-8dd3-d8e88358921c ro rootflags=subvol=root rd.driver.blacklist=nouveau modprobe.blacklist=nouveau
$ lspci -nnv | grep -A2 -i VGA
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD102 [GeForce RTX 4090] [10de:2684] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: Gigabyte Technology Co., Ltd Device [1458:40bf]
        Flags: bus master, fast devsel, latency 0, IRQ 211
--
5c:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e] (rev c9) (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e]
        Flags: bus master, fast devsel, latency 0, IRQ 86
$ dnf list installed '*nvidia*'
Installed Packages
akmod-nvidia.x86_64							3:530.41.03-1.fc38		@rpmfusion-nonfree
kmod-nvidia-6.3.7-200.fc38.x86_64.x86_64	3:530.41.03-1.fc38		@@commandline
nvidia-gpu-firmware.noarch					20230515-150.fc38		@updates
nvidia-persistenced.x86_64 					3:530.41.03-1.fc38		@rpmfusion-nonfree
nvidia-settings.x86_64						3:530.41.03-1.fc38		@rpmfusion-nonfree
xorg-x11-drv-nvidia.x86_64					3:530.41.03-1.fc38		@rpmfusion-nonfree
xorg-x11-drv-nvidia-cuda.x86_64				3:530.41.03-1.fc38		@rpmfusion-nonfree
xorg-x11-drv-nvidia-cuda-libs.x86_64		3:530.41.03-1.fc38		@rpmfusion-nonfree
xorg-x11-drv-nvidia-kmodsrc.x86_64			3:530.41.03-1.fc38		@rpmfusion-nonfree
xorg-x11-drv-nvidia-libs.x86_64				3:530.41.03-1.fc38		@rpmfusion-nonfree
xorg-x11-drv-nvidia-power.x86_64			3:530.41.03-1.fc38		@rpmfusion-nonfree
$ modinfo -F version nvidia
530.41.03
$ inxi -Fzxx
System:
  Kernel: 6.3.7-200.fc38.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 2.39-9.fc38 Desktop: KDE Plasma v: 5.27.5 tk: Qt v: 5.15.9 wm: kwin_x11
    dm: SDDM Distro: Fedora release 38 (Thirty Eight)
Machine:
  Type: Desktop Mobo: ASRock model: X670E Taichi serial: <superuser required>
    UEFI: American Megatrends LLC. v: 1.24 date: 05/23/2023
CPU:
  Info: 16-core model: AMD Ryzen 9 7950X3D bits: 64 type: MT MCP arch: Zen 4
    rev: 2 cache: L1: 1024 KiB L2: 16 MiB L3: 128 MiB
  Speed (MHz): avg: 2973 high: 3550 min/max: 3000/5759 boost: enabled cores:
    1: 2998 2: 3000 3: 3000 4: 2997 5: 3000 6: 3000 7: 3000 8: 3000 9: 3000
    10: 3000 11: 2847 12: 3000 13: 2780 14: 2862 15: 3000 16: 2807 17: 2996
    18: 3000 19: 3000 20: 3000 21: 3001 22: 3000 23: 3001 24: 3000 25: 2877
    26: 3550 27: 2878 28: 2878 29: 2948 30: 3000 31: 2872 32: 2861
    bogomips: 268826
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: NVIDIA AD102 [GeForce RTX 4090] vendor: Gigabyte driver: nvidia
    v: 530.41.03 arch: Lovelace pcie: speed: 16 GT/s lanes: 16 ports:
    active: none off: DP-5 empty: DP-4,DP-6,HDMI-A-2 bus-ID: 01:00.0
    chip-ID: 10de:2684
  Device-2: AMD Raphael driver: amdgpu v: kernel arch: RDNA-2 pcie:
    speed: 16 GT/s lanes: 16 ports: active: none empty: DP-1, DP-2, DP-3,
    HDMI-A-1 bus-ID: 5c:00.0 chip-ID: 1002:164e temp: 37.0 C
  Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 22.1.9
    compositor: kwin_x11 driver: X: loaded: amdgpu,nvidia
    unloaded: fbdev,modesetting,nouveau,vesa alternate: nv dri: radeonsi
    gpu: nvidia,nvidia-nvswitch display-ID: :0 screens: 1
  Screen-1: 0 s-res: 3840x2160 s-dpi: 162
  Monitor-1: DP-5 mapped: DP-2 note: disabled model: LG (GoldStar)
    ULTRAGEAR+ res: 3840x2160 dpi: 163 diag: 690mm (27.2")
  API: OpenGL v: 4.6.0 NVIDIA 530.41.03 renderer: NVIDIA GeForce RTX
    4090/PCIe/SSE2 direct-render: Yes
Audio:
  Device-1: NVIDIA AD102 High Definition Audio vendor: Gigabyte
    driver: snd_hda_intel v: kernel pcie: speed: 16 GT/s lanes: 16
    bus-ID: 01:00.1 chip-ID: 10de:22ba
  Device-2: AMD Rembrandt Radeon High Definition Audio driver: snd_hda_intel
    v: kernel pcie: speed: 16 GT/s lanes: 16 bus-ID: 5c:00.1 chip-ID: 1002:1640
  Device-3: AMD Family 17h/19h HD Audio driver: snd_hda_intel v: kernel
    pcie: speed: 16 GT/s lanes: 16 bus-ID: 5c:00.6 chip-ID: 1022:15e3
  Device-4: Generic USB Audio driver: hid-generic,snd-usb-audio,usbhid
    type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 bus-ID: 5-8:3 chip-ID: 26ce:0a06
  API: ALSA v: k6.3.7-200.fc38.x86_64 status: kernel-api
  Server-1: PipeWire v: 0.3.71 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    4: pw-jack type: plugin
Network:
  Device-1: Intel Wi-Fi 6 AX210/AX211/AX411 160MHz vendor: Rivet Networks
    driver: iwlwifi v: kernel pcie: speed: 5 GT/s lanes: 1 bus-ID: 4c:00.0
    chip-ID: 8086:2725
  IF: wlp76s0 state: up mac: <filter>
  Device-2: Realtek Killer E3000 2.5GbE vendor: ASRock driver: r8169
    v: kernel pcie: speed: 5 GT/s lanes: 1 port: a000 bus-ID: 4d:00.0
    chip-ID: 10ec:3000
  IF: enp77s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Bluetooth:
  Device-1: Intel AX210 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0
    speed: 12 Mb/s lanes: 1 bus-ID: 5-7:4 chip-ID: 8087:0032
  Report: rfkill ID: hci0 rfk-id: 2 state: up address: see --recommends
Drives:
  Local Storage: total: 1.82 TiB used: 6.3 GiB (0.3%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 980 PRO 1TB size: 931.51 GiB
    speed: 63.2 Gb/s lanes: 4 serial: <filter> temp: 54.9 C
  ID-2: /dev/nvme1n1 vendor: Samsung model: SSD 980 PRO 1TB size: 931.51 GiB
    speed: 63.2 Gb/s lanes: 4 serial: <filter> temp: 35.9 C
Partition:
  ID-1: / size: 929.93 GiB used: 6.02 GiB (0.6%) fs: btrfs dev: /dev/nvme0n1p3
  ID-2: /boot size: 973.4 MiB used: 275.7 MiB (28.3%) fs: ext4
    dev: /dev/nvme0n1p2
  ID-3: /boot/efi size: 598.8 MiB used: 17.4 MiB (2.9%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-4: /home size: 929.93 GiB used: 6.02 GiB (0.6%) fs: btrfs
    dev: /dev/nvme0n1p3
Swap:
  ID-1: swap-1 type: zram size: 8 GiB used: 0 KiB (0.0%) priority: 100
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 44.2 C mobo: N/A
  Fan Speeds (RPM): N/A
  GPU: device: nvidia screen: :0.0 temp: 47 C fan: 0% device: amdgpu
    temp: 38.0 C
Info:
  Processes: 631 Uptime: 1m Memory: available: 61.91 GiB used: 3.23 GiB (5.2%)
  Init: systemd v: 253 target: graphical (5) default: graphical Compilers:
  gcc: 13.1.1 Packages: pm: rpm pkgs: N/A note: see --rpm Shell: Bash
  v: 5.2.15 running-in: konsole inxi: 3.3.27

Please show the results of inxi -Fzxx.
Your output does not show the data for which GPU is used as primary nor other data about the hardware which inxi should provide.

The output of cat /proc/cmdline will help as well.

OK, added “inxi -Fzxx” to the OP, and “cat /proc/cmdline” is already there (at the top). Thanks.

Also added:
It happens on either Wayland/X, and I have tried the “nvidia-drm.modeset=1” to grub and it makes no difference either way.

Try one additional option on the kernel command line:
nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init
One or both of those options may solve it. Both together solved it for me after the upgrade to the 6.3 series kernels.
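
For anyone following along, here is a minimal sketch of how to confirm after a reboot that the options really reached the kernel. The helper name `has_kernel_arg` is made up for this example. On Fedora the options are normally added with `grubby`, shown only as a comment because it needs root and changes the boot configuration.

```shell
# To add the options on Fedora (run manually, then reboot):
#   sudo grubby --update-kernel=ALL --args="nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init"
# To take them out again:
#   sudo grubby --update-kernel=ALL --remove-args="nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init"

# has_kernel_arg CMDLINE ARG (hypothetical helper):
# succeed only if ARG appears as a whole space-separated word in CMDLINE.
has_kernel_arg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report each suggested option against the running kernel's command line.
cmdline=$(cat /proc/cmdline 2>/dev/null)
for arg in nvidia-drm.modeset=1 initcall_blacklist=simpledrm_platform_driver_init; do
    if has_kernel_arg "$cmdline" "$arg"; then
        echo "present: $arg"
    else
        echo "missing: $arg"
    fi
done
```

Matching whole words avoids a false positive from a partial string such as `modeset=1` on its own.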

Thanks I will give that additional option a try and report back…

Well seemed to work for a bit… but sadly the issue persists.

I did completely disable the iGPU and while it did change the boot sequence a bit it didn’t fix the issue.

Does the Caps Lock light toggle when the screen goes black? Do you have another system that you can use to connect with ssh? This will allow you to determine if the system is hung or just the graphics device. Many of the systems I have used in the past 2 decades had only Nvidia, so the ability to troubleshoot using ssh has been very useful.

It was locked up, nothing worked…

" Many of the systems I have used in the past 2 decades" - Do you have a recent 4090/ryzen9 7950x3d?

I reinstalled and used the latest/beta drivers as a test… And while the black screen/lockup was gone, nothing was able to use the GPU. The sensors could not see it, and programs like Blender or video compression apps couldn’t use the GPU or the video compression libs, etc…

Beginning to think Fedora (or all Linux?) is just not ready for the newest nvidia GPU/ Ryzen CPU combos.

If anyone reading this has a 4090 and a Ryzen 9 and has Fedora running, I’d like to hear from you …

Your inxi -Fzxx results show active:none for both Devices:

You have an LG display on DP-5, but it is detected as “off” by the GPU and “disabled” for the Monitor-1 entry. All other ports are “empty”. Can you experiment with different cables and connections?
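
While swapping cables, the kernel's own view of each port can be read from sysfs. The sketch below assumes the standard `/sys/class/drm/<card>-<connector>/status` files are present; with the proprietary NVIDIA driver the card's connectors may only show up there when kernel modesetting is enabled. The function name `list_connectors` is made up for this example.

```shell
# list_connectors [SYSFS_DRM_DIR] (hypothetical helper):
# print "<connector> <status>" for every DRM connector the kernel exposes.
list_connectors() {
    drm_root=${1:-/sys/class/drm}
    for status_file in "$drm_root"/card*-*/status; do
        # Skip the unexpanded glob when no connectors exist.
        [ -r "$status_file" ] || continue
        connector=$(basename "$(dirname "$status_file")")
        printf '%s %s\n' "$connector" "$(cat "$status_file")"
    done
}

list_connectors
```

A port holding a working cable and monitor should report `connected`; running it before and after each cable change shows whether the GPU sees the display at all.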

Fedora runs more recent kernels than RHEL and other distros that offer long-term support. New kernels generally include security improvements as well as support for new CPUs.

For anyone interested I’ve clean reinstalled several times and still not usable. At this point it runs, and I have no black screens but things are not right. SDDM is crazy slow, the main OS has all kinds of flicker/tearing, flashing issues etc…

I have submitted bug reports at RPMFusion, and interestingly they have said NOT to use ‘nvidia-drm.modeset=1’ or ‘initcall_blacklist=simpledrm_platform_driver_init’, FYI.

Runs perfectly fine on W10Pro…

When the screen goes black, can you switch to a Virtual Terminal via Ctrl+Alt+F3?

No it is completely locked.

Linux is often not ready for the latest hardware until users find issues and document them in a way that makes it easy for developers to fix.

We still can’t determine what state the system is in when “completely locked”. Often a non-responding system is still running and only the GUI has stopped working, so you can get a console session using Ctrl+Alt+F3 or log in from another system using ssh. If both of those fail, a “wake-on-lan” magic packet may bring the system to life. If you can’t get into the “locked” system afterwards, the next thing to try is opening an ssh session from another system before the hang occurs and using that to monitor the failing system.
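
A rough sketch of that monitoring, assuming sshd is enabled on the affected machine. The host name `fedora-box` and the helper name `gpu_error_lines` are placeholders, and reading the previous boot with `-b -1` only works if the journal is stored persistently.

```shell
# gpu_error_lines (hypothetical helper): keep only log lines read from
# stdin that mention the NVIDIA kernel driver (NVRM / Xid / nvidia-*).
gpu_error_lines() {
    grep -iE 'nvrm|xid|nvidia' || true   # no match is not an error
}

# From a second machine, before the hang, follow the kernel log live:
#   ssh user@fedora-box 'journalctl -k -f'
#
# After a forced power-off, inspect the kernel log of the previous boot:
#   journalctl -k -b -1 --no-pager | gpu_error_lines
```

If the live ssh session keeps printing after the screen goes black, the kernel is still running and only the graphics stack has failed; if it freezes too, the whole machine is hung.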

I clean installed with a non-Wayland OS (Fedora 38 MATE) and it worked perfectly, no issues. So it’s something relating to that.

I’m waiting for a response from the RPMFusion devs on the bug at the moment…

It’s not that Fedora isn’t ready for the latest hardware; it has more to do with the hardware vendors and drivers keeping up with the latest kernel. When I was using Nvidia with Fedora, there were times when I’d have to stay back a kernel version because the Nvidia drivers weren’t working with the latest kernel build.
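
One quick way to spot that situation is to check whether a driver kmod package exists for the kernel that is currently booted. This sketch assumes the `kmod-nvidia-<kernel release>` naming that appears in the `dnf list installed` output in the first post; `kmod_pkg_name` is a made-up helper.

```shell
# kmod_pkg_name RELEASE (hypothetical helper): package name that the
# NVIDIA kmod is assumed to have for a given kernel release.
kmod_pkg_name() {
    printf 'kmod-nvidia-%s\n' "$1"
}

running=$(uname -r)
if command -v rpm >/dev/null 2>&1 &&
   rpm -q "$(kmod_pkg_name "$running")" >/dev/null 2>&1; then
    echo "nvidia kmod present for $running"
else
    echo "no nvidia kmod package found for $running"
fi
```

If no package is found for the running kernel, booting the previous kernel from the GRUB menu is the usual stopgap until the module has been built.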

Unfortunately this process has forced me to abandon Linux/Fedora and use Windows… As far as I can tell, at the time I was installing there were no drivers compatible with the 4090 cards… and this forced my hand. These cards have been out almost a year now, which seems ample time to get them supported to me, but I’m no dev…

I can’t afford to be non-operational for more than a couple of days… I was really excited to go Linux/Fedora full time, but it wasn’t possible… Really disappointed…

I’m going to try it again today, but even if it’s perfect now I’ve already gone all in on the Windows ecosystem, so Fedora will probably just be relegated to a toy to play with here and there… Bummed…

Within the past couple of weeks the nvidia drivers from rpmfusion have been updated from 530.41 to 535.54 (I upgraded mine 3 days ago). I am reasonably confident that one could install or upgrade the nvidia drivers on your system and the RTX 4090 card would be properly supported. Those drivers do, after all, originate from nvidia.com and are packaged by rpmfusion for use on Fedora…
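
To check which side of that update a machine is on, the loaded module version (the same `modinfo -F version nvidia` used in the first post) can be compared against 535.54. A minimal sketch, relying on GNU `sort -V`; `version_ge` is a made-up helper, and 535.54 is only the version mentioned in this thread, not a documented minimum.

```shell
# version_ge A B (hypothetical helper): succeed if dotted version A is
# greater than or equal to B. Relies on version sort from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

installed=$(modinfo -F version nvidia 2>/dev/null)
if [ -z "$installed" ]; then
    echo "nvidia module not found for the running kernel"
elif version_ge "$installed" 535.54; then
    echo "nvidia $installed is at or above 535.54"
else
    echo "nvidia $installed is older than 535.54"
fi
```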

Well fingers crossed… I just did another clean install and so far so good… Seems to be working as expected…

Not sure what you mean by “another clean install” – did you apply updates and replace nouveau with 535.54 NVIDIA driver? Is Wayland working or are you staying with Xorg? This information will be helpful to the lurkers with similar hardware who have been avoiding F38 and Wayland.
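
For readers who want to answer the Wayland-or-Xorg question about their own setup, the session type is normally exported by the login session in `XDG_SESSION_TYPE`. A small sketch; `session_type` is a made-up helper and falls back to `unknown` when the variable is not set (for example in a plain ssh login).

```shell
# session_type (hypothetical helper): print the current session type
# ("wayland", "x11", "tty", ...) or "unknown" if the variable is unset.
session_type() {
    printf '%s\n' "${XDG_SESSION_TYPE:-unknown}"
}

echo "session type: $(session_type)"
```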

“Another” meaning that I have clean installed several times, trying various things as directed by the RPMFusion devs, most recently using the beta driver version that “mostly” worked.

This time is the first clean install after the new driver release and it seems to be solid.

I installed F38, then updated completely until there were no more updates. Next I installed the driver following the RPMFusion directions exactly (using kmod, not akmod, per the note in the install directions). That resulted in the 535 driver version.

I’m using the default Wayland session at login. The only thing I notice is that the mouse is sometimes slow/choppy at the SDDM login screen, but it is back to normal once logged in. So far it has not blacked out or locked up (fingers crossed).