Random freezes on Fedora 36 with AMD GPU


Happened again. I tried booting from a different kernel (specifically, kernel-5.18.7-200.fc36.x86_64 vs kernel-5.18.11-200.fc36.x86_64), and samey samey. Froze right up. I do have a journalctl output from an earlier boot (using the 5.18.11-200 kernel).

Switching to a different TTY did not work - Ctrl + Alt + F3 (or indeed, any F-key) was not responsive.

I have been dealing with a similar problem on F36 with AMD integrated graphics. It has also led to crashes… throwing me out to the login screen. I am lucky not to have lost any work :sweat_smile:

The freezes have mostly affected the display: sound continues to play, and it seems input is still being taken.
I have not tried to ssh in, as I don’t have anything to ssh from, but at least to my eyes it looks like a display problem.
Then there are crashes where I am thrown out to the login screen. I think the shell had crashed, but I do not know how to check that.
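
One way to check whether the shell crashed, sketched below: look through the previous boot's journal for lines that mention gnome-shell alongside a crash keyword, and ask systemd-coredump whether it recorded a dump. The function name `shell_crash_lines` and the keyword list are assumptions made for illustration, not something from this thread; `journalctl -b -1` and `coredumpctl list` are standard systemd tools.

```shell
# Print only the lines from stdin that mention gnome-shell together with
# a crash-related keyword. The keyword list is an illustrative guess.
shell_crash_lines() {
  grep -F 'gnome-shell' | grep -Ei 'segfault|dumped core|crash|sigsegv|sigabrt' || true
}

# On a real system, feed it the journal of the previous boot:
#   journalctl -b -1 --no-pager | shell_crash_lines
# and list any core dumps recorded for the shell:
#   coredumpctl list gnome-shell
```

If both come back empty, the shell probably did not crash in a way systemd noticed, and the kernel log (`journalctl -k -b -1`) is the next place to look.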

In any case here is the output of inxi -Fzx

System:
  Kernel: 5.18.11-200.fc36.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 2.37-27.fc36 Desktop: GNOME v: 42.3.1
    Distro: Fedora release 36 (Thirty Six)
Machine:
  Type: Laptop System: Acer product: Aspire A515-43 v: V1.05
    serial: <superuser required>
  Mobo: PK model: Grumpy_PK v: V1.05 serial: <superuser required>
    UEFI: Insyde v: 1.05 date: 06/26/2019
Battery:
  ID-1: BAT1 charge: 20.5 Wh (100.0%) condition: 20.5/47.9 Wh (42.9%)
    volts: 11.6 min: 11.4 model: Murata 0x41,0x50,0x31,0x38,0x43,0x34,0x0001
    status: full
  Device-1: hidpp_battery_0 model: Logitech M570 charge: 30%
    status: discharging
CPU:
  Info: quad core model: AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
    bits: 64 type: MT MCP arch: Zen/Zen+ note: check rev: 1 cache: L1: 384 KiB
    L2: 2 MiB L3: 4 MiB
  Speed (MHz): avg: 1488 high: 2727 min/max: 1400/2100 boost: enabled
    cores: 1: 1231 2: 1231 3: 1281 4: 1256 5: 1523 6: 2727 7: 1388 8: 1274
    bogomips: 33536
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: AMD Picasso/Raven 2 [Radeon Vega Series / Radeon Mobile Series]
    vendor: Acer Incorporated ALI driver: amdgpu v: kernel arch: GCN 5
    bus-ID: 05:00.0
  Device-2: Quanta HD Webcam type: USB driver: uvcvideo bus-ID: 1-1:2
  Display: wayland server: X.Org v: 1.22.1.3 with: Xwayland v: 22.1.3
    compositor: gnome-shell driver: X: loaded: amdgpu
    unloaded: fbdev,modesetting,vesa gpu: amdgpu resolution: 1920x1080~60Hz
  OpenGL: renderer: AMD Radeon Vega 8 Graphics (raven LLVM 14.0.0 DRM 3.46
  5.18.11-200.fc36.x86_64)
    v: 4.6 Mesa 22.1.3 direct render: Yes
Audio:
  Device-1: AMD Raven/Raven2/Fenghuang HDMI/DP Audio
    vendor: Acer Incorporated ALI driver: snd_hda_intel v: kernel
    bus-ID: 05:00.1
  Device-2: AMD ACP/ACP3X/ACP6x Audio Coprocessor
    vendor: Acer Incorporated ALI driver: snd_pci_acp3x v: kernel
    bus-ID: 05:00.5
  Device-3: AMD Family 17h/19h HD Audio vendor: Acer Incorporated ALI
    driver: snd_hda_intel v: kernel bus-ID: 05:00.6
  Sound Server-1: ALSA v: k5.18.11-200.fc36.x86_64 running: yes
  Sound Server-2: PulseAudio v: 15.0 running: no
  Sound Server-3: PipeWire v: 0.3.55 running: yes
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: Acer Incorporated ALI driver: r8169 v: kernel port: 2000
    bus-ID: 03:00.0
  IF: enp3s0 state: down mac: <filter>
  Device-2: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
    vendor: Lite-On driver: ath10k_pci v: kernel bus-ID: 04:00.0
  IF: wlp4s0 state: up mac: <filter>
  IF-ID-1: gpd0 state: down mac: N/A
  IF-ID-2: virbr0 state: down mac: <filter>
Bluetooth:
  Device-1: Lite-On type: USB driver: btusb v: 0.8 bus-ID: 1-4:4
  Report: rfkill ID: hci0 rfk-id: 2 state: up address: see --recommends
Drives:
  Local Storage: total: 238.47 GiB used: 125.43 GiB (52.6%)
  ID-1: /dev/nvme0n1 vendor: SK Hynix model: HFM256GDJTNG-8310A
    size: 238.47 GiB temp: 43.9 C
Partition:
  ID-1: / size: 236.89 GiB used: 125.14 GiB (52.8%) fs: btrfs
    dev: /dev/nvme0n1p3
  ID-2: /boot size: 973.4 MiB used: 284.4 MiB (29.2%) fs: ext4
    dev: /dev/nvme0n1p2
  ID-3: /boot/efi size: 598.8 MiB used: 14 MiB (2.3%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-4: /home size: 236.89 GiB used: 125.14 GiB (52.8%) fs: btrfs
    dev: /dev/nvme0n1p3
Swap:
  ID-1: swap-1 type: zram size: 5.62 GiB used: 2.86 GiB (50.9%)
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: N/A mobo: N/A gpu: amdgpu temp: 75.0 C
  Fan Speeds (RPM): N/A
Info:
  Processes: 351 Uptime: 22m Memory: 5.62 GiB used: 4.6 GiB (81.9%)
  Init: systemd target: graphical (5) Compilers: gcc: 12.1.1 Packages: 5474
  Shell: Zsh v: 5.8.1 inxi: 3.3.19

I was hoping that a system update would make it go away, but sadly not. In any case, I hope this is helpful, and if you need more I can provide it.

Have you tried logging in with Xorg instead of Wayland to see if that gives a different response? If it is truly just graphics (mouse + keyboard) that are not responding, then it may be nothing more than a Wayland vs Xorg issue.

That can be done by selecting ‘GNOME on Xorg’ from the gear icon in the lower right corner when entering your password.
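
To confirm which session type you actually ended up in after logging in, the `XDG_SESSION_TYPE` environment variable can be read from a terminal. A small sketch (the helper name `describe_session` is made up for illustration):

```shell
# Turn the value of XDG_SESSION_TYPE into a readable label.
describe_session() {
  case "$1" in
    wayland) echo "Wayland session" ;;
    x11)     echo "Xorg session" ;;
    *)       echo "unknown session type: ${1:-unset}" ;;
  esac
}

# Run inside the graphical session you want to check.
describe_session "${XDG_SESSION_TYPE:-}"
```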

From a cursory look, it still persists on Xorg. The freezes are actually worse there, though on Xorg my mouse is free to move around (while the rest of the shell and apps are unresponsive). In this instance Wayland is actually better, as the freezes are usually shorter.

I can do some more testing if need be (though there may be a gap in my response time)

I have not tried that, I will try that next time - but if it was just Wayland tanking the system, wouldn’t I be able to get into a terminal TTY via Ctrl + Alt + F3?

I think this might be the same issue as reported here:

https://discussion.fedoraproject.org/t/where-to-report-amd-gpu-lockups-crasher-hangs-bugs-related-to-seemingly-triggered-by-firefox-va-api-after-resuming-from-suspend/69378

Seems like it might be the bugs mentioned in that thread.

Yes and No.

Yes if the keyboard is functional, but No if the keyboard is not responding.

That DOES look like the issue I’m having, except that one of his conditions is explicitly “The system must have been suspended (put to sleep) once, then resumed”, and I have DEFINITELY gotten severe system lockups anywhere from one to ten minutes after a complete reboot (press and hold the power button until the system shuts off, wait five to ten seconds, press the power button again to turn the system back on).

However, the error messages in his logs are incontrovertibly similar to the ones I see in mine…

You might try switching to the amdgpu driver. Follow the post linked below, but use these kernel parameters for your Southern Islands (SI) GPU (GCN 1.0): radeon.si_support=0 amdgpu.si_support=1
https://discussion.fedoraproject.org/t/how-to-install-amd-graphics-driver-on-fedora-36/24715/5
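
For reference, a hedged sketch of how such parameters are commonly added on Fedora with grubby. It assumes the linked post does something along these lines (a later reply in this thread mentions a grubby command); the helper name is hypothetical, and it only prints the command so it can be reviewed before being run as root.

```shell
# Kernel parameters from the post above (Southern Islands / GCN 1.0 cards).
si_args="radeon.si_support=0 amdgpu.si_support=1"

# Build (but do not run) the grubby invocation that would add the
# arguments to every installed kernel entry.
grubby_add_args_cmd() {
  printf 'sudo grubby --update-kernel=ALL --args="%s"\n' "$1"
}

grubby_add_args_cmd "$si_args"
# After running the printed command and rebooting, `cat /proc/cmdline`
# should list both parameters.
```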

As of an update today which put my kernel version on

Linux 5.18.13-200.fc36.x86_64
(uname -s -r) 

It seems to have gone away.
For posterity, here is the output of inxi -Fzx:

System:
  Kernel: 5.18.13-200.fc36.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 2.37-27.fc36 Desktop: GNOME v: 42.3.1
    Distro: Fedora release 36 (Thirty Six)
Machine:
  Type: Laptop System: Acer product: Aspire A515-43 v: V1.05
    serial: <superuser required>
  Mobo: PK model: Grumpy_PK v: V1.05 serial: <superuser required>
    UEFI: Insyde v: 1.05 date: 06/26/2019
Battery:
  ID-1: BAT1 charge: 21.2 Wh (100.0%) condition: 21.2/47.9 Wh (44.4%)
    volts: 11.6 min: 11.4 model: Murata 0x41,0x50,0x31,0x38,0x43,0x34,0x0001
    status: full
  Device-1: hidpp_battery_0 model: Logitech M570 charge: 30%
    status: discharging
CPU:
  Info: quad core model: AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
    bits: 64 type: MT MCP arch: Zen/Zen+ note: check rev: 1 cache: L1: 384 KiB
    L2: 2 MiB L3: 4 MiB
  Speed (MHz): avg: 1565 high: 2767 min/max: 1400/2100 boost: enabled
    cores: 1: 1481 2: 1308 3: 1993 4: 2767 5: 1231 6: 1231 7: 1231 8: 1284
    bogomips: 33533
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: AMD Picasso/Raven 2 [Radeon Vega Series / Radeon Mobile Series]
    vendor: Acer Incorporated ALI driver: amdgpu v: kernel arch: GCN 5
    bus-ID: 05:00.0
  Device-2: Quanta HD Webcam type: USB driver: uvcvideo bus-ID: 1-1:2
  Display: wayland server: X.Org v: 1.22.1.3 with: Xwayland v: 22.1.3
    compositor: gnome-shell driver: X: loaded: amdgpu
    unloaded: fbdev,modesetting,vesa gpu: amdgpu resolution: 1920x1080~60Hz
  OpenGL: renderer: AMD Radeon Vega 8 Graphics (raven LLVM 14.0.0 DRM 3.46
  5.18.13-200.fc36.x86_64)
    v: 4.6 Mesa 22.1.4 direct render: Yes
Audio:
  Device-1: AMD Raven/Raven2/Fenghuang HDMI/DP Audio
    vendor: Acer Incorporated ALI driver: snd_hda_intel v: kernel
    bus-ID: 05:00.1
  Device-2: AMD ACP/ACP3X/ACP6x Audio Coprocessor
    vendor: Acer Incorporated ALI driver: snd_pci_acp3x v: kernel
    bus-ID: 05:00.5
  Device-3: AMD Family 17h/19h HD Audio vendor: Acer Incorporated ALI
    driver: snd_hda_intel v: kernel bus-ID: 05:00.6
  Sound Server-1: ALSA v: k5.18.13-200.fc36.x86_64 running: yes
  Sound Server-2: PulseAudio v: 15.0 running: no
  Sound Server-3: PipeWire v: 0.3.56 running: yes
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: Acer Incorporated ALI driver: r8169 v: kernel port: 2000
    bus-ID: 03:00.0
  IF: enp3s0 state: down mac: <filter>
  Device-2: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter
    vendor: Lite-On driver: ath10k_pci v: kernel bus-ID: 04:00.0
  IF: wlp4s0 state: up mac: <filter>
  IF-ID-1: gpd0 state: down mac: N/A
  IF-ID-2: virbr0 state: down mac: <filter>
Bluetooth:
  Device-1: Lite-On type: USB driver: btusb v: 0.8 bus-ID: 1-4:4
  Report: rfkill ID: hci0 rfk-id: 2 state: up address: see --recommends
Drives:
  Local Storage: total: 238.47 GiB used: 125.69 GiB (52.7%)
  ID-1: /dev/nvme0n1 vendor: SK Hynix model: HFM256GDJTNG-8310A
    size: 238.47 GiB temp: 40.9 C
Partition:
  ID-1: / size: 236.89 GiB used: 125.4 GiB (52.9%) fs: btrfs
    dev: /dev/nvme0n1p3
  ID-2: /boot size: 973.4 MiB used: 285.9 MiB (29.4%) fs: ext4
    dev: /dev/nvme0n1p2
  ID-3: /boot/efi size: 598.8 MiB used: 14 MiB (2.3%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-4: /home size: 236.89 GiB used: 125.4 GiB (52.9%) fs: btrfs
    dev: /dev/nvme0n1p3
Swap:
  ID-1: swap-1 type: zram size: 5.62 GiB used: 706.8 MiB (12.3%)
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: N/A mobo: N/A gpu: amdgpu temp: 66.0 C
  Fan Speeds (RPM): N/A
Info:
  Processes: 380 Uptime: 15m Memory: 5.62 GiB used: 4.24 GiB (75.4%)
  Init: systemd target: graphical (5) Compilers: gcc: 12.1.1 Packages: 5474
  Shell: Zsh v: 5.8.1 inxi: 3.3.19

Thanks for your help. If you need more info, let me know.

I have faced a very similar issue with a Ryzen 5 + integrated Radeon Vega 8. It is generally worse on Xorg than on Wayland. I found a workaround that prevents the random lock-ups: install CoreCtrl and set the CPU and GPU governors to powersave.
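
If you want to verify what ended up set on the CPU side, the active cpufreq governor can be read from sysfs. A minimal sketch, assuming the standard /sys/devices/system/cpu layout; the function name and the optional root argument (there only so the helper can be pointed at another directory) are assumptions, and this does not cover the GPU setting.

```shell
# Print "cpuN: governor" for every CPU that exposes a cpufreq governor.
# $1: sysfs CPU directory (defaults to the usual location).
list_cpu_governors() {
  root=${1:-/sys/devices/system/cpu}
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
    [ -r "$f" ] || continue
    cpu=${f%/cpufreq/scaling_governor}
    printf '%s: %s\n' "${cpu##*/}" "$(cat "$f")"
  done
}

list_cpu_governors
```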

I thought that for myself, but it still happens. Seems like it happens much, much more rarely now, though… I’m at six days now! O_O

That’s definitely better than anything I got to earlier…

I have not experienced anything in this regard, and when I have, it’s usually been from something else. I am just thankful that it seems to have gone away and the risk of me losing work has gone down drastically.

Regards,

Jeetaditya Chatterjee
Sent using my text editor

I actually had plenty of freezes between my last comment and this one, even after multiple kernel updates - so I decided to give this a try after one of my journalctl -k -b -1 commands FINALLY revealed that the GPU was seizing up, so I tried that command… but it does not appear to have taken. I just rebooted and it shows the radeon driver is still loaded, even after doing that little grubby command. :frowning:

If glxinfo | grep DRM shows 2.50 after the DRM part, you’re using the radeon driver; if it shows 3.47, you’re using amdgpu.
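
The same check as a small helper, in case it is useful. It only looks at the major DRM version (2.x for radeon, 3.x for amdgpu, which matches the inxi outputs posted in this thread); the function name is made up for illustration.

```shell
# Guess the kernel driver from a renderer string such as the one printed
# by `glxinfo | grep DRM`. Only the major DRM version is inspected.
drm_driver_from_renderer() {
  ver=$(printf '%s\n' "$1" \
        | sed -n 's/.*DRM \([0-9][0-9]*\)\.[0-9][0-9]*.*/\1/p' \
        | head -n 1)
  case "$ver" in
    2) echo radeon ;;
    3) echo amdgpu ;;
    *) echo unknown ;;
  esac
}

# Renderer string copied from the inxi output earlier in this thread:
drm_driver_from_renderer "AMD Radeon Vega 8 Graphics (raven LLVM 14.0.0 DRM 3.46 5.18.11-200.fc36.x86_64)"
# prints: amdgpu
# On a live system: drm_driver_from_renderer "$(glxinfo | grep DRM)"
```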

I’m using the radeon driver. :confused:

Both per that command, as well as inxi -Fzx. I tried running the grubby command again, I’ll try another reboot and I’ll check my grub kernel options before I pick one to see if I see those options in the line. :stuck_out_tongue:

cat /proc/cmdline

BOOT_IMAGE=(hd7,gpt4)/vmlinuz-5.19.8-200.fc36.x86_64 root=/dev/mapper/fedora_localhost--live-root ro resume=/dev/mapper/fedora_localhost--live-swap rd.lvm.lv=fedora_localhost-live/root rd.luks.uuid=luks-a9ef7a03-5019-4d6a-beba-d98d832e5d4c rd.lvm.lv=fedora_localhost-live/swap rhgb quiet radeon.cik_support=0 amdgpu.cik_support=1

I also double-checked this on my last boot, and yet… inxi -Fzx:

Graphics:
  Device-1: AMD Oland XT [Radeon HD 8670 / R5 340X OEM R7 250/350/350X OEM]
    vendor: Dell driver: radeon v: kernel arch: GCN 1 bus-ID: 01:00.0
  Device-2: Logitech C920 PRO HD Webcam type: USB
    driver: snd-usb-audio,uvcvideo bus-ID: 3-5:4
  Display: wayland server: X.Org v: 1.22.1.3 with: Xwayland v: 22.1.3
    compositor: gnome-shell driver: gpu: radeon resolution: 1: 1920x1080~60Hz
    2: 1920x1080~60Hz
  OpenGL: renderer: AMD OLAND (LLVM 14.0.0 DRM 2.50 5.19.8-200.fc36.x86_64)
    v: 4.5 Mesa 22.1.7 direct render: Yes

not sure why it’s not loading with the amdgpu driver. :confused:
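
One thing that can be checked mechanically here is whether the parameters on the kernel command line are the ones suggested earlier in the thread for GCN 1 (Southern Islands) cards, radeon.si_support=0 amdgpu.si_support=1. The command line quoted above carries the cik_ variants instead. A minimal sketch; the helper names are hypothetical.

```shell
# Succeed if the exact token $2 appears on the kernel command line $1.
has_param() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report, for each SI parameter named earlier in the thread, whether it is set.
check_si_params() {
  for p in radeon.si_support=0 amdgpu.si_support=1; do
    if has_param "$1" "$p"; then
      echo "present: $p"
    else
      echo "missing: $p"
    fi
  done
}

# Tail of the command line quoted above:
check_si_params "rhgb quiet radeon.cik_support=0 amdgpu.cik_support=1"
# On a live system: check_si_params "$(cat /proc/cmdline)"
```

With the command line from this post, both parameters are reported as missing.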

Reread that:

Alrighty! My bad. :stuck_out_tongue:

I just gave that a shot now that I’m back from vacay, hopefully it works!