Desktop Freezes and requires hard reboot - F41 Plasma NVIDIA/AMD

I recently did a fresh install of Fedora 41 on an ASUS Zephyrus G16 (AMD) laptop, and almost daily it hits a point where the screen and everything else freezes and I have to do a hard reset.

sudo journalctl -k | fpaste --raw-url

https://paste.centos.org/view/raw/8b558b1c

inxi -Fzxx
System:
  Kernel: 6.12.11-200.fc41.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 14.2.1
  Desktop: KDE Plasma v: 6.3.0 tk: Qt v: N/A wm: kwin_wayland dm: SDDM
    Distro: Fedora Linux 41 (KDE Plasma)
Machine:
  Type: Laptop System: ASUSTeK product: ROG Zephyrus G16 GA605WI_GA605WI
    v: 1.0 serial: <superuser required>
  Mobo: ASUSTeK model: GA605WI v: 1.0 serial: <superuser required>
    UEFI: American Megatrends LLC. v: GA605WI.306 date: 08/01/2024
Battery:
  ID-1: BAT1 charge: 68.5 Wh (81.5%) condition: 84.1/89.8 Wh (93.6%)
    volts: 15.7 min: 15.9 model: ASUS A32-K55 serial: N/A status: discharging
CPU:
  Info: 12-core model: AMD Ryzen AI 9 HX 370 w/ Radeon 890M bits: 64
    type: MT MCP arch: N/A rev: 0 cache: L1: 960 KiB L2: 12 MiB L3: 24 MiB
  Speed (MHz): avg: 599 min/max: 599/4367 boost: enabled cores: 1: 599
    2: 599 3: 599 4: 599 5: 599 6: 599 7: 599 8: 599 9: 599 10: 599 11: 599
    12: 599 13: 599 14: 599 15: 599 16: 599 17: 599 18: 599 19: 599 20: 599
    21: 599 22: 599 23: 599 24: 599 bogomips: 95823
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: NVIDIA AD106M [GeForce RTX 4070 Max-Q / Mobile] vendor: ASUSTeK
    driver: nvidia v: 565.77 arch: Lovelace pcie: speed: 2.5 GT/s lanes: 8
    ports: active: none off: HDMI-A-1 empty: DP-9,eDP-2 bus-ID: 64:00.0
    chip-ID: 10de:2860
  Device-2: Advanced Micro Devices [AMD/ATI] Strix [Radeon 880M / 890M]
    vendor: ASUSTeK driver: amdgpu v: kernel pcie: speed: 16 GT/s lanes: 16
    ports: active: eDP-1 empty: DP-1, DP-2, DP-3, DP-4, DP-5, DP-6, DP-7,
    DP-8, Writeback-1 bus-ID: 65:00.0 chip-ID: 1002:150e temp: 38.0 C
  Device-3: Shinetech USB2.0 FHD UVC WebCam driver: uvcvideo type: USB
    rev: 2.0 speed: 480 Mb/s lanes: 1 bus-ID: 1-1:2 chip-ID: 3277:0051
  Display: wayland server: Xwayland v: 24.1.5 compositor: kwin_wayland
    driver: gpu: amdgpu,nvidia,nvidia-nvswitch d-rect: 4480x1600 display-ID: 0
  Monitor-1: HDMI-A-1 pos: primary,left model: LG (GoldStar) FULL HD
    res: 1920x1080 hz: 75 dpi: 102 diag: 551mm (21.7")
  Monitor-2: eDP-1 pos: right model: Samsung ATNA60DL04-0 res: 2560x1600
    hz: 240 dpi: 191 diag: 405mm (15.94")
  API: EGL v: 1.5 platforms: device: 0 drv: nvidia device: 1 drv: radeonsi
    gbm: drv: kms_swrast surfaceless: drv: nvidia wayland: drv: radeonsi x11:
    drv: radeonsi
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: amd mesa v: 24.3.4 glx-v: 1.4
    direct-render: yes renderer: AMD Radeon Graphics (radeonsi gfx1150 LLVM
    19.1.0 DRM 3.59 6.12.11-200.fc41.x86_64) device-ID: 1002:150e
    display-ID: :0.0
  API: Vulkan v: 1.4.304 surfaces: xcb,xlib,wayland device: 0
    type: integrated-gpu driver: N/A device-ID: 1002:150e device: 1
    type: discrete-gpu driver: N/A device-ID: 10de:2860 device: 2 type: cpu
    driver: N/A device-ID: 10005:0000
  Info: Tools: api: clinfo, eglinfo, glxinfo, vulkaninfo
    de: kscreen-console,kscreen-doctor gpu: nvidia-settings,nvidia-smi
    wl: wayland-info x11: xdriinfo, xdpyinfo, xprop, xrandr
Audio:
  Device-1: NVIDIA AD106M High Definition Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel pcie: speed: 16 GT/s lanes: 8
    bus-ID: 64:00.1 chip-ID: 10de:22bd
  Device-2: Advanced Micro Devices [AMD/ATI] Rembrandt Radeon High
    Definition Audio driver: snd_hda_intel v: kernel pcie: speed: 16 GT/s
    lanes: 16 bus-ID: 65:00.1 chip-ID: 1002:1640
  Device-3: Advanced Micro Devices [AMD] ACP/ACP3X/ACP6x Audio Coprocessor
    driver: snd_acp_pci v: kernel pcie: speed: 16 GT/s lanes: 16 bus-ID: 65:00.5
    chip-ID: 1022:15e2
  Device-4: Advanced Micro Devices [AMD] Family 17h/19h/1ah HD Audio
    vendor: ASUSTeK driver: snd_hda_intel v: kernel pcie: speed: 16 GT/s
    lanes: 16 bus-ID: 65:00.6 chip-ID: 1022:15e3
  Device-5: Walmart AB13X Headset Adapter
    driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 2.0 speed: 12 Mb/s
    lanes: 1 bus-ID: 3-1.2:6 chip-ID: 001f:0b21
  API: ALSA v: k6.12.11-200.fc41.x86_64 status: kernel-api
  Server-1: PipeWire v: 1.2.7 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    4: pw-jack type: plugin
Network:
  Device-1: MEDIATEK vendor: Foxconn driver: mt7925e v: kernel pcie:
    speed: 5 GT/s lanes: 1 port: N/A bus-ID: 63:00.0 chip-ID: 14c3:7925
  IF: wlp99s0 state: up mac: <filter>
Bluetooth:
  Device-1: Foxconn / Hon Hai Wireless_Device driver: btusb v: 0.8 type: USB
    rev: 2.1 speed: 480 Mb/s lanes: 1 bus-ID: 3-3:3 chip-ID: 0489:e11e
  Report: btmgmt ID: hci0 rfk-id: 0 state: up address: <filter> bt-v: 5.4
    lmp-v: 13
Drives:
  Local Storage: total: 1.86 TiB used: 36.07 GiB (1.9%)
  ID-1: /dev/nvme0n1 vendor: Micron model: MTFDKBA2T0QFM-1BD1AABGB
    size: 1.86 TiB speed: 63.2 Gb/s lanes: 4 serial: <filter> temp: 25.9 C
Partition:
  ID-1: / size: 1.85 TiB used: 35.46 GiB (1.9%) fs: btrfs dev: /dev/nvme0n1p3
  ID-2: /boot size: 7.78 GiB used: 604.7 MiB (7.6%) fs: ext4
    dev: /dev/nvme0n1p2
  ID-3: /boot/efi size: 2 GiB used: 19.3 MiB (0.9%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-4: /home size: 1.85 TiB used: 35.46 GiB (1.9%) fs: btrfs
    dev: /dev/nvme0n1p3
Swap:
  ID-1: swap-1 type: zram size: 8 GiB used: 0 KiB (0.0%) priority: 100
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 46.9 C mobo: N/A
  Fan Speeds (rpm): cpu: 2300
Info:
  Memory: total: 32 GiB note: est. available: 30.46 GiB used: 6.05 GiB (19.9%)
  Processes: 1210 Power: uptime: 12m wakeups: 0 Init: systemd v: 256
    target: graphical (5) default: graphical
  Packages: pm: rpm pkgs: N/A note: see --rpm pm: flatpak pkgs: 24
    Compilers: gcc: 14.2.1 Shell: Bash v: 5.2.32 running-in: konsole
    inxi: 3.3.37

tail end of dmesg entries

[   28.127226] amdgpu 0000:65:00.0: [drm] REG_WAIT timeout 1us * 10 tries - optc3_lock line:128
[   28.243636] systemd-journald[897]: File /var/log/journal/b97d1342c60445eb935c0d09fbf9eb58/user-1000.journal corrupted or uncleanly shut down, renaming and replacing.
[   69.517540] amdgpu 0000:65:00.0: [drm] REG_WAIT timeout 1us * 10 tries - optc3_lock line:128
[  153.011620] amdgpu 0000:65:00.0: [drm] REG_WAIT timeout 1us * 10 tries - optc3_lock line:128
[  162.013245] amdgpu 0000:65:00.0: [drm] REG_WAIT timeout 1us * 10 tries - optc3_lock line:128
[  176.514605] amdgpu 0000:65:00.0: [drm] REG_WAIT timeout 1us * 10 tries - optc3_lock line:128
[  338.458935] amdgpu 0000:65:00.0: [drm] REG_WAIT timeout 1us * 10 tries - optc3_lock line:128
[  577.877193] amdgpu 0000:65:00.0: [drm] REG_WAIT timeout 1us * 10 tries - optc3_lock line:128
[  613.977539] show_signal_msg: 112 callbacks suppressed
[  613.977546] eglinfo[6892]: segfault at 8 ip 00007f2732628fcf sp 00007fff62f01c40 error 4 in libgallium-24.3.4.so[28fcf,7f2732601000+1a49000] likely on CPU 22 (core 14, socket 0)
[  613.977561] Code: 48 89 e5 41 56 41 55 41 54 41 89 f4 53 4c 8b 6f 58 48 89 fb 0f b6 05 01 9a a3 02 84 c0 0f 84 98 00 00 00 0f b6 05 f1 99 a3 02 <41> 83 7d 08 03 4c 8d 35 05 df 90 02 88 83 68 01 00 00 7e 0e 49 83
[  676.590932] apple-mfi-fastcharge 7-1: USB disconnect, device number 2
[  676.616261] ipheth 7-1:4.2: Apple iPhone USB Ethernet now disconnected
[  740.145322] eglinfo[7488]: segfault at 8 ip 00007ff151428fcf sp 00007ffeb1c36960 error 4 in libgallium-24.3.4.so[28fcf,7ff151401000+1a49000] likely on CPU 17 (core 9, socket 0)
[  740.145348] Code: 48 89 e5 41 56 41 55 41 54 41 89 f4 53 4c 8b 6f 58 48 89 fb 0f b6 05 01 9a a3 02 84 c0 0f 84 98 00 00 00 0f b6 05 f1 99 a3 02 <41> 83 7d 08 03 4c 8d 35 05 df 90 02 88 83 68 01 00 00 7e 0e 49 83

I have a very similarly spec’d MSI laptop (Ryzen 9 with an nVidia 4060) and it’s definitely the nvidia driver crashing that is causing it for me. Similarly, it’s an almost daily reboot.

It’s interesting since it wasn’t happening before on the Asus ROG kernel. That said I was on f40 then. It’s a real pain.

Hi All,

I’m 99.99% sure that this is because of the xorg-x11-server-Xwayland 24.1.5 update.
After updating, wine applications broke for me with memory errors and my desktop froze.
Tried multiple reboots, got the same results, thus I did a downgrade:

sudo dnf downgrade xorg-x11-server-Xwayland

After downgrading and rebooting, everything works fine now.
The package is now at version 24.1.3 here, with no issues at all.
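
For anyone who wants to confirm which Xwayland build is actually installed after the downgrade, a minimal sketch follows. The function name xwayland_version is made up for illustration; the only real command used is rpm -q with a query format.

```shell
#!/usr/bin/env bash
# Sketch: print the installed version-release of the Xwayland package.
# xwayland_version is a hypothetical helper name, not part of any tool.
xwayland_version() {
    if command -v rpm >/dev/null 2>&1; then
        # Prints e.g. "<version>-<release>", or rpm's "not installed" message.
        rpm -q --queryformat '%{VERSION}-%{RELEASE}\n' xorg-x11-server-Xwayland
    else
        echo "rpm not available on this system"
    fi
}
```

Calling xwayland_version before and after the downgrade should make it obvious whether the package really changed.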

Take care!

Spec: Fedora41+XFCE, Kernel 6.12.13, GF1050Ti drv 565.77

I’ve been having similar issues on my desktop with an RTX 4070, running F41 KDE with driver 565.77. At first I only noticed that the computer would refuse to turn the monitor back on from idle (the display had just turned off; the machine was not asleep). I would have to hard reboot.

Then I started noticing it during just general use/browsing. It would start to stutter, my mouse cursor would hang and music would skip. Then it would get worse and worse until it was completely frozen and unresponsive and require a hard reboot again.

Nothing I’ve done has seemed to make any difference yet. I will try downgrading xorg-x11-server-Xwayland and see if that helps though.

A few weeks back, before the Xwayland package update, it was the xfwm4 process that ate up all of my memory. This was a one-time thing; I never experienced it before or after. I killed it, rebooted and had no issues until the 24.1.5 update. Now it uses less than 1% of the memory. As you are on KDE, I don’t think you will see this one, but there is a bug report / investigation for this for XFCE.

It might be related. It definitely FEELS like a memory leak; the one or two times I’ve encountered a misbehaving application leaking memory like crazy, the behavior was very similar.

I did open System Monitor and watched while the bug happened, and I saw RAM usage tick over from 10.7 GB to 11 GB (I was running a game to try to ‘stress’ test it) when it happened, but that’s far from maxing out my 32 GB of RAM.

I did also come across another post on this forum saying it was related to the xwayland package and recommended downgrading, but I did this and it didn’t fix it.

I wonder if it’s running out of VRAM?
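
One way to check the VRAM theory might be to log GPU memory use until the next freeze. This is only a sketch, assuming the proprietary driver's nvidia-smi is installed; vram_snapshot is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: print one timestamped line of used/total VRAM for each NVIDIA GPU.
# vram_snapshot is a hypothetical helper name.
vram_snapshot() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is the proprietary NVIDIA driver installed?"
        return 0
    fi
    nvidia-smi --query-gpu=timestamp,memory.used,memory.total \
               --format=csv,noheader
}
```

Running something like `while sleep 5; do vram_snapshot; done >> vram.log` in a terminal and reading the last lines of vram.log after the reboot would show whether memory.used was close to memory.total when things locked up. The last few samples may not reach the disk before a hard reset, so treat the result as a hint rather than proof.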

I updated xorg-x11-server-Xwayland to 24.1.5 again and watched the system for several days and everything seems to be normal now. Hm, my issue may have been a random anomaly and the package was not the culprit here.

The problem is I’m not seeing any really interesting journal entries, core dumps etc so how do we start isolating the issue?
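
A possible starting point, as a rough sketch: collect the kernel messages from the boot that froze plus any recorded core dumps into one file. This assumes the journal is persistent (the /var/log/journal path in the dmesg output above suggests it is) and that systemd-coredump is in use. The function name and the default output file name are made up.

```shell
#!/usr/bin/env bash
# Sketch: gather logs from the previous boot into a single report file.
# collect_freeze_logs and freeze-report.txt are hypothetical names.
collect_freeze_logs() {
    local out="${1:-freeze-report.txt}"
    {
        echo "== kernel messages, previous boot, priority warning and above =="
        # -b -1 selects the previous boot; -k limits output to kernel messages.
        journalctl -k -b -1 -p warning --no-pager 2>&1 || true
        echo "== recorded core dumps =="
        coredumpctl list --no-pager 2>&1 || true
    } > "$out"
    echo "$out"
}
```

Running `collect_freeze_logs` right after a forced reboot gives a file that can be pasted with fpaste. If it comes back nearly empty every time, that in itself may point away from the graphics stack and toward hardware.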

Mine turned out not to be software related at all. It was a bad NVMe drive. Ironically, it was not the one my system was on; in fact, it’s one I don’t normally even mount. Just having it physically installed would cause the system to stutter and freeze. I wasn’t able to narrow it down to a hardware issue until I tried a live USB and still got the problem. That’s why I wasn’t really finding anything relevant in dmesg or journalctl either.

I sent mine out and it just came back with a “clean” bill of health from whoever services the laptops for Asus. I’m going to have to re-install now but I’ll give the live USB method a try.

@gbouras is it working now? I have a similar NB and got the same freezing with Fedora 41 (but not from the beginning). Thanks!

So now things seem…ok. I had a huge issue with the latest kwin update where I was getting a black screen on boot and lots of crashes. After that I had this issue where, if I full-screened a video on YouTube on my built-in laptop display and moved my mouse to the external display and then back, it would freeze and I would have to hard reboot. Interestingly enough, this did not happen if I did it on the external display. I had AMD error messages afterwards. So…I am running all my graphics on NVIDIA and so far there are no issues, other than that I can’t use sway etc.

When using Wayland, can you try creating the folder ~/.config/environment.d/ and, inside it, the file 90-nvidia.conf
with the following content:

__NV_PRIME_RENDER_OFFLOAD=1
__GLX_VENDOR_LIBRARY_NAME=nvidia
__VK_LAYER_NV_optimus=NVIDIA_only
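
If it is easier, the file above could be created from a terminal along these lines. This is only a sketch: it assumes the default config location (or $XDG_CONFIG_HOME if set), the variable names conf_dir and conf_file are just for illustration, and it overwrites any existing 90-nvidia.conf.

```shell
#!/usr/bin/env bash
# Sketch: create the environment.d drop-in with the three variables above.
# conf_dir and conf_file are illustrative variable names.
conf_dir="${XDG_CONFIG_HOME:-$HOME/.config}/environment.d"
conf_file="$conf_dir/90-nvidia.conf"

mkdir -p "$conf_dir"

# Quoted heredoc delimiter so nothing inside is expanded by the shell.
cat > "$conf_file" <<'EOF'
__NV_PRIME_RENDER_OFFLOAD=1
__GLX_VENDOR_LIBRARY_NAME=nvidia
__VK_LAYER_NV_optimus=NVIDIA_only
EOF

echo "wrote $conf_file"
```

As far as I know, environment.d files are read when the user session starts, so log out and back in (or reboot) before testing.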
