Hello,
For the past few days, my Fedora 44 system has occasionally frozen and then gone to a black screen. Nothing responds after that, so I have to hard-reboot the computer.
Here is part of the output of inxi -Fzxx:
System:
Kernel: 7.0.4-200.fc44.x86_64 arch: x86_64 bits: 64 compiler: gcc v: 16.1.1
Desktop: GNOME v: 50.1 tk: GTK v: 3.24.52 wm: gnome-shell dm: 1: GDM
2: LightDM note: stopped Distro: Fedora Linux 44 (Workstation Edition)
Machine:
Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
Mobo: ASUSTeK model: TUF GAMING B550-PLUS WIFI II v: Rev X.0x
serial: <superuser required> part-nu: SKU Firmware: UEFI
vendor: American Megatrends v: 3405 date: 12/13/2023
CPU:
Info: 16-core model: AMD Ryzen 9 5950X bits: 64 type: MT MCP arch: Zen 3+
rev: 2 cache: L1: 1024 KiB L2: 8 MiB L3: 64 MiB
Speed (MHz): avg: 1746 min/max: 582/5086 boost: enabled cores: 1: 1746
2: 1746 3: 1746 4: 1746 5: 1746 6: 1746 7: 1746 8: 1746 9: 1746 10: 1746
11: 1746 12: 1746 13: 1746 14: 1746 15: 1746 16: 1746 17: 1746 18: 1746
19: 1746 20: 1746 21: 1746 22: 1746 23: 1746 24: 1746 25: 1746 26: 1746
27: 1746 28: 1746 29: 1746 30: 1746 31: 1746 32: 1746 bogomips: 217182
Flags-basic: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a
ssse3 svm
Graphics:
Device-1: NVIDIA AD104 [GeForce RTX 4070 SUPER] vendor: CardExpert
driver: nvidia v: 595.71.05 arch: Lovelace pcie: speed: 2.5 GT/s lanes: 16
ports: active: HDMI-A-1 empty: DP-1,DP-2,DP-3 bus-ID: 07:00.0
chip-ID: 10de:2783
Display: wayland server: X.org v: 1.21.1.22 with: Xwayland v: 24.1.11
compositor: gnome-shell driver: X: loaded: nvidia unloaded: modesetting
alternate: fbdev,nouveau,nv,vesa gpu: nv_platform,nvidia,nvidia-nvswitch
display-ID: 0
Monitor-1: HDMI-A-1 model: MBU27 res: 3840x2160 dpi: 163 diag: 685mm (27")
API: OpenGL v: 4.6.0 vendor: nvidia v: 595.71.05 glx-v: 1.4
direct-render: yes renderer: NVIDIA GeForce RTX 4070 SUPER/PCIe/SSE2
display-ID: :0.0
API: Vulkan v: 1.4.341 surfaces: N/A device: 0 type: discrete-gpu
driver: nvidia device-ID: 10de:2783 device: 1 type: cpu
driver: mesa llvmpipe device-ID: 10005:0000
API: EGL Message: EGL data requires eglinfo. Check --recommends.
Info: Tools: api: glxinfo,vulkaninfo gpu: nvidia-settings x11: xdriinfo,
xdpyinfo, xprop, xrandr
Info:
Memory: total: 64 GiB note: est. available: 62.66 GiB used: 6.31 GiB (10.1%)
Here is an excerpt from the output of journalctl -b -1 at the moment the crash occurs:
mai 10 14:13:35 fedora kernel: NVRM: GPU at PCI:0000:07:00: GPU-20a73d3d-c881-b6e0-3de3-aa3ab417fb9e
mai 10 14:13:35 fedora kernel: NVRM: Xid (PCI:0000:07:00): 62, 323f0f30 00006a80 00000000 20315e48 20314ad2 20314c30 2031338c 203139f4
mai 10 14:13:35 fedora kernel: NVRM: GPU0 _kgspRpcGspEventPmuHalted: Received signal from GSP that PMU has halted.
mai 10 14:13:35 fedora kernel: NVRM: Xid (PCI:0000:07:00): 154, GPU recovery action changed from 0x0 (None) to 0x1 (GPU Reset Required)
mai 10 14:13:39 fedora touchegg.desktop[5290]: Error connecting to Touchégg daemon: Could not connect: Connection refused
mai 10 14:13:39 fedora touchegg.desktop[5290]: Reconnecting in 5 seconds...
mai 10 14:13:44 fedora touchegg.desktop[5290]: Error connecting to Touchégg daemon: Could not connect: Connection refused
mai 10 14:13:44 fedora touchegg.desktop[5290]: Reconnecting in 5 seconds...
mai 10 14:13:48 fedora kernel: NVRM: krcWatchdog_IMPL: RC watchdog: GPU is probably locked! Notify Timeout Seconds: 7
mai 10 14:13:49 fedora touchegg.desktop[5290]: Error connecting to Touchégg daemon: Could not connect: Connection refused
mai 10 14:13:49 fedora touchegg.desktop[5290]: Reconnecting in 5 seconds...
mai 10 14:13:51 fedora at-spi2-registryd[5133]: Disabling unresponsive app with pid 5066
mai 10 14:13:51 fedora kernel: NVRM: GPU0 nvAssertFailedNoLog: Assertion failed: (status == NV_OK) || (status == NV_ERR_GPU_IN_FULLCHIP_RESET) @ rs_client.c:844
mai 10 14:13:51 fedora kernel: NVRM: nvAssertFailedNoLog: Assertion failed: (status == NV_OK) || (status == NV_ERR_GPU_IN_FULLCHIP_RESET) @ rs_server.c:259
mai 10 14:13:51 fedora kernel: NVRM: nvAssertFailedNoLog: Assertion failed: (status == NV_OK) || (status == NV_ERR_GPU_IN_FULLCHIP_RESET) @ rs_server.c:1375
...
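In case it helps anyone compare against their own logs, here is a rough sketch of how the Xid codes can be pulled out of the previous boot's kernel messages. The function name extract_xids is just something I made up for illustration, and the pattern assumes the "Xid (PCI:...): NN" format seen in the lines above.

```shell
# Hypothetical helper: print the distinct NVIDIA Xid codes found on stdin.
# Assumes lines shaped like: "NVRM: Xid (PCI:0000:07:00): 62, ..."
extract_xids() {
  grep -oE 'Xid \(PCI:[0-9a-fA-F:.]+\): [0-9]+' \
    | sed -E 's/.*: ([0-9]+)$/\1/' \
    | sort -un
}
```

It would be used as: journalctl -k -b -1 | extract_xids (the -k option limits the output to kernel messages, -b -1 selects the previous boot). On the excerpt above, the codes that appear are 62 and 154.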
Here is my /etc/default/grub file:
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="nvidia-drm.modeset=1 rhgb quiet rd.driver.blacklist=nouveau,nova_core modprobe.blacklist=nouveau,nova_core"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
How do I fix this? Thank you.