System freezes after screen blanking (power saving) with AMDGPU

Hello, when screen blanking kicks in after 5 minutes of inactivity, my system becomes completely unresponsive.
Using Wayland.
CPU: AMD Ryzen 7 2700X
GPU: AMD Radeon RX 6600 XT
rpm-ostree deployment:

● fedora:fedora/35/x86_64/silverblue
                   Version: 35.20220110.0 (2022-01-10T00:41:57Z)
                BaseCommit: 34a381a2aec26bcb309ae97607d484447248f0ab1d0b41d9d4f8a7fd9aeabddb
              GPGSignature: Valid signature by 787EA6AE1147EEE56C40B30CDB4639719867C58F
           LayeredPackages: ffmpeg openssl vim
             LocalPackages: rpmfusion-free-release-35-1.noarch

Logs:

[drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3
[drm:amdgpu_dm_update_freesync_caps [amdgpu]] *ERROR* EDID CEA parser failed
[drm:dcn20_wait_for_blank_complete [amdgpu]] *ERROR* DC: failed to blank crtc!
amdgpu 0000:28:00.0: amdgpu: Failed to disable gfxoff!
watchdog: BUG: soft lockup - CPU#6 stuck for 52s! [gnome-shell:1908]

Thanks for your help!
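
For anyone gathering the same evidence for a bug report: after a hard freeze the relevant messages are in the previous boot's kernel log, which `journalctl -k -b -1` should show if the journal is persistent. Below is a minimal Python sketch for pulling out the interesting lines. The marker list is simply taken from the excerpt above and the function name `gpu_error_lines` is made up for illustration, so treat it as a starting point rather than a complete filter.

```python
# Substrings taken from the log excerpt in this post (an assumption,
# not an exhaustive list of amdgpu failure messages).
MARKERS = ("*ERROR*", "amdgpu", "soft lockup")


def gpu_error_lines(kernel_log):
    """Return the lines of a kernel log text that contain any marker."""
    return [
        line
        for line in kernel_log.splitlines()
        if any(marker in line for marker in MARKERS)
    ]
```

One way to use it is to save the output of `journalctl -k -b -1` to a file, read that file as text, and pass the text to `gpu_error_lines`.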

I believe I had a slightly different problem, but my system also sometimes froze while the screen was blanked (with a Radeon RX 580). After I disabled power management on the RX 580's audio device, my original problem went away and the system freezes seem to have disappeared as well. Worth giving it a shot. Run powertop, go to Tunables, and try toggling the options related to your GPU (in my case, it was “Runtime PM for PCI Device Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]”, set to Bad). The setting only persists until reboot, but powertop also prints the equivalent command in case you want to run it manually on each boot.
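
For reference, the powertop toggle described above corresponds to the kernel's runtime PM control file in sysfs, so the same change can be scripted. The following is a minimal Python sketch, not a tested fix. The helper name `set_runtime_pm` is made up for illustration, and the PCI address in the comment is an assumption (check `lspci` for the HDMI audio function of your own card). Writing `on` keeps the device powered (powertop's “Bad”), while `auto` re-enables runtime PM (“Good”).

```python
from pathlib import Path

# Default location of PCI devices in sysfs on Linux.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def set_runtime_pm(pci_address, mode, sysfs_root=SYSFS_PCI):
    """Write 'on' (runtime PM disabled) or 'auto' (enabled) for one device.

    Needs root privileges when pointed at the real sysfs tree.
    """
    if mode not in ("on", "auto"):
        raise ValueError("mode must be 'on' or 'auto'")
    control = Path(sysfs_root) / pci_address / "power" / "control"
    control.write_text(mode + "\n")
    return control


# Example call (hypothetical address, replace with your audio function):
# set_runtime_pm("0000:28:00.1", "on")
```

Like the powertop toggle, this does not survive a reboot, so it would have to run on every boot.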

You should probably also file a bug against the amdgpu kernel driver, most likely here:

Seems like someone already filed a bug report: RX 6600 XT fails to resume from suspend in 5.15.y series (#1819) · Issues · drm / amd · GitLab

Thank you, I just used this on my Legion 7 AMD edition. For anyone searching the internet: in my case the following had to be deactivated (set from “Good” to “Bad”) in powertop:

Runtime PM for PCI Device Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt Radeon High Definition Audio Controller

Additionally, one must set the default graphics to discrete, which keeps the dedicated GPU running continuously instead of the lower-power one, which for some reason fails during suspend.

My system settings from inxi -G:

Graphics:
  Device-1: AMD Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M]
    driver: amdgpu v: kernel
  Device-2: AMD Rembrandt [Radeon 680M] driver: amdgpu v: kernel
  Device-3: Luxvisions Innotech Integrated RGB Camera type: USB
    driver: uvcvideo
  Display: wayland server: X.Org v: 1.22.1.7 with: Xwayland v: 22.1.7
    compositor: gnome-shell v: 42.6 driver: X: loaded: modesetting,nvidia
    unloaded: fbdev,nouveau,vesa dri: radeonsi gpu: amdgpu
    resolution: 2560x1600~60Hz
  API: OpenGL v: 4.6 Mesa 22.1.7 renderer: AMD Radeon Graphics (yellow_carp
    LLVM 14.0.0 DRM 3.49 6.1.5-100.fc36.x86_64)

Please tell us exactly how you set the graphics to use the dGPU continuously. There are many who would like that info in detail.

I note that you have 2 separate Radeon devices and are running wayland.

That was very simple, but unfortunately it only applies to the Lenovo Legion 7 Gen 7 AMD.
I have an option in the BIOS that forces the dGPU (RX 6700M) to run constantly.

There are two options (I’m typing from memory, so they might be named slightly differently):

  • Dynamic Switching
  • Discrete Graphics only

I’m actually trying to achieve the exact opposite to save power: I’d like the dGPU to be deactivated.
So far none of my approaches have worked. Someone suggested that I could reserve the dGPU for KVM, which would keep the kernel from using it as a GPU, as described here:

There is an easy trick: Reserve your discrete GPU for KVM passthrough without ever actually passing it to a VM:

    find your PCI vendor and device ID (e.g. 1002:67ef)
    add vfio-pci.ids=1002:67ef to your kernel command line via GRUB (of course using the correct IDs)

This way, on boot Linux will not use the device as a GPU but will assign it the vfio driver, which neuters it as long as no passthrough is set up.
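
The two steps of the quoted trick can be sketched in Python as follows. This is only an illustration under assumptions: it expects lines in the format printed by `lspci -nn`, where the vendor and device ID appear as a bracketed pair such as `[1002:67ef]`, and the function name `vfio_ids_karg` and the sample line are made up. Pass it only the lines for the dGPU, otherwise every listed device would be reserved.

```python
import re

# Matches the [vendor:device] pair that `lspci -nn` prints, e.g. [1002:67ef].
PCI_ID = re.compile(r"\[([0-9a-f]{4}:[0-9a-f]{4})\]", re.IGNORECASE)


def vfio_ids_karg(lspci_lines):
    """Build a vfio-pci.ids=... kernel argument from `lspci -nn` lines."""
    ids = []
    for line in lspci_lines:
        match = PCI_ID.search(line)
        if match and match.group(1).lower() not in ids:
            ids.append(match.group(1).lower())
    if not ids:
        raise ValueError("no [vendor:device] ID found")
    return "vfio-pci.ids=" + ",".join(ids)


# Sample line shaped like `lspci -nn` output; the device text is illustrative.
sample = ["03:00.0 VGA compatible controller [0300]: Example dGPU [1002:67ef]"]
print(vfio_ids_karg(sample))  # vfio-pci.ids=1002:67ef
```

The resulting string still has to be added to the kernel command line. On Fedora this can usually be done with `grubby --update-kernel=ALL --args="..."`, or with `rpm-ostree kargs --append=...` on Silverblue.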

This did not work for me. In addition, I’m currently getting a white screen with only the mouse pointer visible when coming out of sleep or when using an external screen under Wayland with kernel 6.1; kernel 6.0.15 works without issues. I’m trying to find out how to debug Wayland so that I can file an appropriate bug report. I’m not sure whether it’s a kernel issue or a Wayland issue that was triggered by the changes in 6.1. (I’ll probably make a new post, as this is slowly going off topic.)

It would appear that to have the dGPU work in hybrid mode you would need to set the BIOS to ‘dynamic switching’. Booting should then use the iGPU, and the dGPU would only be used when you select it in the OS.

To follow up on this post, this was actually related to an issue in the AMDGPU driver documented here:

I’ve applied the patch and all is working now.