GPU crashed after resume?

Hello.

I’ve been having suspend/resume issues on KDE, so I moved to Cinnamon to steer clear of Qt-based DEs. Until now, I hadn’t run into an issue with suspending my machine.

Today it suspended without any issues, but after resume I had no video signal and couldn’t switch to a TTY. Unfortunately, I had to hard reset.

I looked at the journalctl logs and it seems that my GPU crashed on resume? I’m not sure, but there are a lot of amdgpu-related errors.

Logs: openSUSE Paste
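
For anyone checking a similar hang, here is a minimal sketch of pulling GPU-related kernel messages out of the boot that ended in the hard reset. It assumes the journal is persistent, so the previous boot (`-b -1`) is still available. The helper name `filter_gpu_lines` is made up for illustration, and the match pattern is only a rough guess at what is relevant.

```shell
# Hypothetical helper: keep only lines that mention amdgpu or drm
# (case-insensitive). Reads from stdin, writes matches to stdout.
filter_gpu_lines() {
    grep -iE 'amdgpu|drm'
}

# Usage (run on the affected machine, after rebooting from the hang):
#   journalctl -k -b -1 --no-pager | filter_gpu_lines
# -k limits output to kernel messages, -b -1 selects the previous boot.
```

If the previous boot is missing from `journalctl --list-boots`, the journal may not have been flushed to disk before the reset.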

I suggest you raise a bug against the kernel in the Fedora bug tracker.
See How to file a bug :: Fedora Docs

Since you’re here: do you think I should also post the KDE suspend issue there too? I’ve posted on KDE’s bug tracker, but there hasn’t been any response for well over a week.

In any case, I’ve sent the bug report.

What kernel are you using? Is this a laptop?

Can you paste the output of inxi -Fxzz here in </> preformatted text?

  • Linux Kernel 6.8.8-300.fc40.x86_64

  • This is a Desktop computer. It’s a tower and a monitor. When I suspend, the tower’s status light starts blinking.

  • Output of inxi -Fxzz:

roguefort@fedora:~$ inxi -Fxzz

System:
  Kernel: 6.8.8-300.fc40.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 2.41-34.fc40
  Desktop: Cinnamon v: 6.0.4 Distro: Fedora Linux 40 (Cinnamon)
Machine:
  Type: Desktop Mobo: Micro-Star model: MAG B550 TOMAHAWK MAX WIFI (MS-7C91)
    v: 1.0 serial: <superuser required> UEFI: American Megatrends LLC. v: 2.60
    date: 10/10/2023
CPU:
  Info: 8-core model: AMD Ryzen 7 5700X bits: 64 type: MT MCP arch: Zen 3+
    rev: 2 cache: L1: 512 KiB L2: 4 MiB L3: 32 MiB
  Speed (MHz): avg: 2617 high: 3598 min/max: 2200/4662 boost: enabled cores:
    1: 2200 2: 3598 3: 2200 4: 2200 5: 2872 6: 2200 7: 2879 8: 2875 9: 2200
    10: 2200 11: 2200 12: 2880 13: 2200 14: 2879 15: 2880 16: 3414
    bogomips: 108798
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: AMD Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT]
    vendor: Sapphire driver: amdgpu v: kernel arch: RDNA-2 bus-ID: 2d:00.0
  Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 23.2.6 driver: X:
    loaded: amdgpu unloaded: fbdev,modesetting,radeon,vesa dri: radeonsi
    gpu: amdgpu resolution: 1920x1080
  API: OpenGL v: 4.6 vendor: amd mesa v: 24.0.6 glx-v: 1.4
    direct-render: yes renderer: AMD Radeon RX 6700 XT (radeonsi navi22 LLVM
    18.1.1 DRM 3.57 6.8.8-300.fc40.x86_64)
  API: Vulkan v: 1.3.280 drivers: N/A surfaces: xcb,xlib devices: 2
  API: EGL Message: EGL data requires eglinfo. Check --recommends.
Audio:
  Device-1: AMD Navi 21/23 HDMI/DP Audio driver: snd_hda_intel v: kernel
    bus-ID: 2d:00.1
  Device-2: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI
    driver: snd_hda_intel v: kernel bus-ID: 2f:00.4
  API: ALSA v: k6.8.8-300.fc40.x86_64 status: kernel-api
  Server-1: JACK v: 1.9.22 status: off
  Server-2: PipeWire v: 1.0.5 status: active
Network:
  Device-1: MEDIATEK MT7921K Wi-Fi 6E 80MHz driver: mt7921e v: kernel
    bus-ID: 29:00.0
  IF: wlo1 state: down mac: <filter>
  Device-2: Realtek RTL8125 2.5GbE vendor: Micro-Star MSI driver: r8169
    v: kernel port: f000 bus-ID: 2a:00.0
  IF: enp42s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Bluetooth:
  Device-1: MediaTek Wireless_Device driver: btusb v: 0.8 type: USB
    bus-ID: 1-9:6
  Report: btmgmt ID: hci0 rfk-id: 0 state: down bt-service: enabled,running
    rfk-block: hardware: no software: yes address: <filter> bt-v: 5.2 lmp-v: 11
Drives:
  Local Storage: total: 5.46 TiB used: 852.35 GiB (15.3%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 980 PRO 1TB size: 931.51 GiB
    temp: 39.9 C
  ID-2: /dev/nvme1n1 vendor: Western Digital model: WD Blue SN570 1TB
    size: 931.51 GiB temp: 33.9 C
  ID-3: /dev/sda vendor: Seagate model: ST2000DM008-2UB102 size: 1.82 TiB
  ID-4: /dev/sdb vendor: Crucial model: CT2000MX500SSD1 size: 1.82 TiB
Partition:
  ID-1: / size: 929.93 GiB used: 86.51 GiB (9.3%) fs: btrfs
    dev: /dev/nvme0n1p3
  ID-2: /boot size: 973.4 MiB used: 410.3 MiB (42.1%) fs: ext4
    dev: /dev/nvme0n1p2
  ID-3: /boot/efi size: 598.8 MiB used: 19 MiB (3.2%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-4: /home size: 929.93 GiB used: 86.51 GiB (9.3%) fs: btrfs
    dev: /dev/nvme0n1p3
Swap:
  ID-1: swap-1 type: zram size: 7.67 GiB used: 445.8 MiB (5.7%)
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 43.1 C mobo: 38.0 C gpu: amdgpu temp: 44.0 C
  Fan Speeds (rpm): N/A gpu: amdgpu fan: 0
Info: 
  Memory: total: 8 GiB available: 7.67 GiB used: 3.47 GiB (45.3%)
  Processes: 441 Uptime: 1h 29m Init: systemd target: graphical (5)
  Packages: 27 Compilers: gcc: 14.0.1 Shell: Bash v: 5.2.26 inxi: 3.3.34

Can you try kernel 6.8.7-300.fc40 :thinking:

I’ve attempted to suspend/resume using the current kernel and kernel 6.8.7-300.fc40 with a variety of applications open:

  • LibreWolf with 3-4 tabs
  • Steam
  • A paused YouTube video
  • A music player (AIMP on Wine) with a song paused
  • Discord

I’ve encountered no issues with either kernel. It must be hard to reproduce, though I’ve only suspended/resumed once on each kernel.

The bug happened when I left it suspended for around 30 minutes to an hour. Not sure if this helps, but anything can help figure this out.

Journalctl logs (for some reason it didn’t record the logs for kernel 6.8.7-300.fc40 :expressionless:): openSUSE Paste
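
Since the hang showed up after the machine had been suspended for 30 minutes to an hour, one way to test that case without waiting around might be a timed suspend. This is only a sketch: it assumes `rtcwake` (from util-linux) is installed and that the motherboard's RTC wake alarm works. The function name `timed_suspend` and the `DRY_RUN` switch are made up for illustration.

```shell
# Hypothetical helper: suspend to RAM and wake automatically after N seconds.
# Defaults to 1800 seconds (30 minutes). With DRY_RUN=1 it only prints the
# command it would run, which is useful for checking before actually suspending.
timed_suspend() {
    seconds="${1:-1800}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "rtcwake -m mem -s $seconds"
    else
        sudo rtcwake -m mem -s "$seconds"
    fi
}

# Usage:
#   DRY_RUN=1 timed_suspend 3600   # show the command for a one-hour suspend
#   timed_suspend 3600             # actually suspend for one hour
```

Repeating this a few times on each kernel would give more data points than a single short suspend/resume cycle.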