Intermittent Crashes After Upgrading

I upgraded to Fedora 43 a couple of weeks ago and have been having intermittent crashes ever since. Up until today, all of the crashes happened while I was away from the workstation, so I never saw one happen. Today it happened twice while I was at the workstation. It seems to be a GPU issue: the system instantly went to a black screen and then rebooted, with no graceful shutdown. Here are some curated logs:

journalctl results

Jan 07 07:04:28 fedora kernel: NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64  580.119.02
Jan 07 07:04:28 fedora kernel: nvidia: loading out-of-tree module taints kernel.
Jan 07 07:04:28 fedora kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
Jan 07 07:04:30 fedora nvidia-powerd[1511]: ERROR! Running on an unsupported system (PCI device Id: 0x2684)
Jan 07 07:04:30 fedora nvidia-powerd[1511]: Quit successfully

Jan 07 10:41:18.260113 user kernel: NVRM: GPU at PCI:0000:01:00: GPU-f6e87c7a-c83b-f917-3ac5-b744f8e02bca
Jan 07 10:41:18.276856 user kernel: NVRM: GPU Board Serial Number: 1320223061073
Jan 07 10:41:18.276882 user kernel: NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 00002888 0001009e 00000007 00000000
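For anyone who wants to pull the same kind of line out of their own journal, here is a rough sketch in Python that picks NVRM Xid events out of kernel log text. The regex is only based on the line format shown above, and the names `XidEvent` and `find_xid_events` are made up for this example, so treat it as a starting point rather than a complete parser.

```python
import re
from typing import Iterable, List, NamedTuple

# Pattern modelled on the Xid line in the log above, e.g.
# "NVRM: Xid (PCI:0000:01:00): 56, CMDre 00000000 ..."
# Assumption: other Xid lines follow the same "(PCI:...): <code>, <detail>" shape.
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>PCI:[0-9A-Fa-f:.]+)\): (?P<code>\d+)(?:,\s*(?P<detail>.*))?"
)


class XidEvent(NamedTuple):
    pci: str     # PCI address as printed by the driver
    code: int    # Xid number, e.g. 56
    detail: str  # everything after the first comma, left unparsed


def find_xid_events(lines: Iterable[str]) -> List[XidEvent]:
    """Return one XidEvent for every line that contains an NVRM Xid report."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            detail = (match["detail"] or "").strip()
            events.append(XidEvent(match["pci"], int(match["code"]), detail))
    return events
```

One way to use it would be to feed it the kernel messages from the previous boot, for example the text produced by `journalctl -k -b -1 --no-pager`, and look at which Xid codes show up just before each crash.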

This issue only started after upgrading to Fedora 43. Before the upgrade, I used X11. I’m now using Wayland because X11 doesn’t seem to be supported anymore. That’s the only change besides the upgrade itself.

         .';:cccccccccccc:;,.             -------------
      .;cccccccccccccccccccccc;.          OS: Fedora Linux 43 (Workstation Edit4
    .:cccccccccccccccccccccccccc:.        Host: X670E Taichi
  .;ccccccccccccc;.:dddl:.;ccccccc;.      Kernel: Linux 6.17.12-300.fc43.x86_64
 .:ccccccccccccc;OWMKOOXMWd;ccccccc:.     Uptime: 31 mins
.:ccccccccccccc;KMMc;cc;xMMc;ccccccc:.    Packages: 3077 (rpm), 31 (flatpak)
,cccccccccccccc;MMM.;cc;;WW:;cccccccc,    Shell: zsh 5.9
:cccccccccccccc;MMM.;cccccccccccccccc:    Display (XG2703-GS): 2560x1440 @ 120 ]
:ccccccc;oxOOOo;MMM000k.;cccccccccccc:    Display (LEN P27h-10): 1440x2560 @ 60]
cccccc;0MMKxdd:;MMMkddc.;cccccccccccc;    DE: GNOME 49.2
ccccc;XMO';cccc;MMM.;cccccccccccccccc'    WM: Mutter (Wayland)
ccccc;MMo;ccccc;MMW.;ccccccccccccccc;     WM Theme: 
ccccc;0MNc.ccc.xMMd;ccccccccccccccc;      Theme: 
cccccc;dNMWXXXWM0:;cccccccccccccc:,       Icons: 
cccccccc;.:odl:.;cccccccccccccc:,.        Font: 
ccccccccccccccccccccccccccccc:'.          Cursor: 
:ccccccccccccccccccccccc:;,..             Terminal: GNOME Terminal 3.56.3
 ':cccccccccccccccc::;,.                  Terminal Font: 
                                          CPU: AMD Ryzen 9 7950X3D (32) @ 5.76 z
                                          GPU 1: NVIDIA GeForce RTX 4090 [Discr]
                                          GPU 2: AMD Raphael [Integrated]
                                          Memory: 9.67 GiB / 61.91 GiB (16%)
                                          Swap: 0 B / 8.00 GiB (0%)
                                          Disk (/): 92.86 GiB / 1.82 TiB (5%) -s
                                          Disk (/home/user/Storage): 1.09 TiB4
                                          Local IP (enp77s0): 192.168.0.50/24
                                          Locale: en_US.UTF-8

Because it mostly happened while you were away, it sounds like a potential power-saving issue. Try disabling the nvidia-powerd daemon to see if it makes a difference.

If that doesn’t fix the problem, disable the NVIDIA GPU and use just the integrated AMD GPU to help rule out an NVIDIA driver issue.
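Disabling the daemon could look roughly like the sketch below. It assumes the systemd unit is called `nvidia-powerd.service` (guessed from the process name in the log, so check with `systemctl list-unit-files` first), and the helper name `disable_powerd` is made up for this example. By default it only returns the commands without running them; actually running them needs root.

```python
import subprocess
from typing import List

# Assumption: the unit name matches the "nvidia-powerd" process seen in the journal.
UNIT = "nvidia-powerd.service"


def disable_powerd(dry_run: bool = True, unit: str = UNIT) -> List[List[str]]:
    """Build (and optionally run) the systemctl calls that stop and disable the unit."""
    commands = [
        # Stop the daemon now and keep it from starting on the next boot.
        ["systemctl", "disable", "--now", unit],
        # Report the current state; exits non-zero when the unit is inactive.
        ["systemctl", "is-active", unit],
    ]
    if not dry_run:
        for command in commands:
            # check=False because "is-active" is expected to fail once the unit is stopped.
            subprocess.run(command, check=False)
    return commands
```

Calling `disable_powerd()` just shows what would be run; `disable_powerd(dry_run=False)` as root applies it. Running `systemctl enable --now nvidia-powerd.service` should undo the change if it turns out not to matter.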

Thanks for the suggestion. I disabled it. If my computer hasn’t crashed by morning, I’ll be optimistic.