Suspending to RAM fails occasionally on AMD Ryzen

Hi all,

I am struggling a bit with Fedora 33 on a ThinkPad with a Ryzen CPU/GPU [1].

The problem is that suspending the machine to mem/S3 works only occasionally. The first suspend after a fresh boot usually works, but after waking up, any follow-up attempt to suspend tends to fail. In such cases, the display is switched off, but the system fan keeps running.
The machine also cannot be woken up from this state and has to be shut down by holding the power button for some time.

My suspicion is that the integrated GPU might be causing the problem.

When I send the machine to sleep manually with
echo mem > /sys/power/state
and wake it up again (in the cases where that works), I find a number of messages like
amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
in dmesg [2].
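
Before each manual attempt it may be worth noting which suspend variant the kernel actually uses for "mem". A small sketch, assuming the usual /sys/power/mem_sleep interface where the active mode is shown in brackets (e.g. "s2idle [deep]"); the helper name active_sleep_mode is made up for illustration:

```shell
#!/bin/sh
# Print the bracketed (active) entry of a mem_sleep-style string,
# e.g. "s2idle [deep]" -> "deep". Prints nothing if no brackets are found.
active_sleep_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Only touch sysfs when the file is actually there.
if [ -r /sys/power/mem_sleep ]; then
    echo "mem currently maps to: $(active_sleep_mode "$(cat /sys/power/mem_sleep)")"
fi
```

If this reports s2idle rather than deep, the machine is not entering S3 at all, which would be a different problem from a failed S3 transition.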

Another suspect might be the NVMe drive. Since I had issues with the disk sometimes freezing and the file system failing, I tried to work around that by tuning its power-state (APST) settings:
grubby --update-kernel=ALL --args=nvme_core.default_ps_max_latency_us=5500
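
To check whether that limit really reaches the kernel after a reboot, something along these lines could be used. It is only a sketch: apst_latency_from_cmdline is a hypothetical helper, and the sysfs path assumes nvme_core exposes its parameters under /sys/module as usual.

```shell
#!/bin/sh
# Extract the value of nvme_core.default_ps_max_latency_us from a kernel
# command line string; prints nothing if the argument is absent.
apst_latency_from_cmdline() {
    for arg in $1; do
        case "$arg" in
            nvme_core.default_ps_max_latency_us=*) echo "${arg#*=}" ;;
        esac
    done
}

# What the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    echo "cmdline:   $(apst_latency_from_cmdline "$(cat /proc/cmdline)")"
fi

# What the nvme_core module is actually using.
param=/sys/module/nvme_core/parameters/default_ps_max_latency_us
if [ -r "$param" ]; then
    echo "nvme_core: $(cat "$param")"
fi
```

If both lines show 5500, the setting is active and the remaining suspend failures are probably not down to the argument being ignored.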

However, I have not managed to get my laptop to sleep reliably and am out of ideas, so I am hoping for suggestions.

Cheers and thanks for any idea,
Thomas

[1] System Summary

inxi -Fxz
System: Kernel: 5.11.17-200.fc33.x86_64 x86_64 bits: 64 compiler: gcc v: 2.35-18.fc33 Desktop: KDE Plasma 5.20.5
Distro: Fedora release 33 (Thirty Three)
Machine: Type: Laptop System: LENOVO product: 20UGS00800 v: ThinkPad X13 Gen 1 serial:
Mobo: LENOVO model: 20UGS00800 serial: UEFI: LENOVO v: R1CET56W(1.25 ) date: 09/15/2020
Battery: ID-1: BAT0 charge: 21.7 Wh (44.2%) condition: 49.1/48.0 Wh (102.3%) volts: 11.5 min: 11.5 model: SMP 5B10W139
status: Unknown
CPU: Info: 8-Core model: AMD Ryzen 7 PRO 4750U with Radeon Graphics bits: 64 type: MT MCP arch: Zen 2 rev: 1 cache:
L2: 4 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 54293
Speed: 1397 MHz min/max: 1400/1700 MHz boost: enabled Core speeds (MHz): 1: 1397 2: 1397 3: 1397 4: 1397 5: 1397
6: 1397 7: 1397 8: 1397 9: 1397 10: 1397 11: 1397 12: 1397 13: 1397 14: 1397 15: 1397 16: 1397
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Renoir vendor: Lenovo driver: amdgpu v: kernel bus-ID: 06:00.0
Device-2: IMC Networks Integrated Camera type: USB driver: uvcvideo bus-ID: 2-2:2
Display: x11 server: Fedora Project X.org 1.20.11 driver: loaded: amdgpu,ati unloaded: fbdev,modesetting,vesa
resolution: 1: 1920x1080~60Hz 2: 2560x1440~60Hz
OpenGL: renderer: AMD RENOIR (DRM 3.40.0 5.11.17-200.fc33.x86_64 LLVM 11.0.0) v: 4.6 Mesa 20.3.5 direct render: Yes
Audio: Device-1: Advanced Micro Devices [AMD/ATI] vendor: Lenovo driver: snd_hda_intel v: kernel bus-ID: 06:00.1
Device-2: Advanced Micro Devices [AMD] Raven/Raven2/FireFlight/Renoir Audio Processor vendor: Lenovo
driver: snd_rn_pci_acp3x v: kernel bus-ID: 06:00.5
Device-3: Advanced Micro Devices [AMD] Family 17h HD Audio vendor: Lenovo driver: snd_hda_intel v: kernel
bus-ID: 06:00.6
Sound Server-1: ALSA v: k5.11.17-200.fc33.x86_64 running: yes
Sound Server-2: JACK v: 1.9.14 running: no
Sound Server-3: PulseAudio v: 14.0-rebootstrapped running: yes
Sound Server-4: PipeWire v: 0.3.26 running: yes
Network: Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Lenovo driver: r8169 v: kernel port: 2400
bus-ID: 02:00.0
IF: enp2s0f0 state: down mac:
Device-2: Intel Wi-Fi 6 AX200 driver: iwlwifi v: kernel port: 2000 bus-ID: 03:00.0
IF: wlp3s0 state: up mac:
IF-ID-1: virbr0 state: down mac:
IF-ID-2: virbr0-nic state: down mac:
Bluetooth: Device-1: Intel AX200 Bluetooth type: USB driver: btusb v: 0.8 bus-ID: 6-4:5
Report: ID: hci0 state: up address: bt-v: 3.0 lmp-v: 5.2
Drives: Local Storage: total: 1.38 TiB used: 684.48 GiB (48.6%)
ID-1: /dev/mmcblk0 model: SC512 size: 476.71 GiB
ID-2: /dev/nvme0n1 vendor: Western Digital model: WDS100T2B0C-00PXH0 size: 931.51 GiB temp: 42.9 C
Partition: ID-1: / size: 97.91 GiB used: 16.82 GiB (17.2%) fs: ext4 dev: /dev/dm-1 mapped: lvm--os-lvm--os--root
ID-2: /boot size: 1.91 GiB used: 283.4 MiB (14.5%) fs: ext4 dev: /dev/nvme0n1p1
ID-3: /boot/efi size: 2.99 GiB used: 20.4 MiB (0.7%) fs: vfat dev: /dev/nvme0n1p2
ID-4: /home size: 300 GiB used: 45.42 GiB (15.1%) fs: btrfs dev: /dev/dm-4 mapped: lvm--os-lvm--btrfs--home
Swap: ID-1: swap-1 type: partition size: 20 GiB used: 0 KiB (0.0%) dev: /dev/dm-2 mapped: lvm--os-lvm--os--swap
ID-2: swap-2 type: zram size: 4 GiB used: 95.8 MiB (2.3%) dev: /dev/zram0
Sensors: System Temperatures: cpu: 59.0 C mobo: 0.0 C gpu: amdgpu temp: 59.0 C
Fan Speeds (RPM): cpu: 4600
Info: Processes: 534 Uptime: 8h 04m Memory: 14.92 GiB used: 11.21 GiB (75.1%) Init: systemd runlevel: 5 Compilers:
gcc: 10.3.1 Packages: 5225 Shell: Bash v: 5.0.17 inxi: 3.3.03

[2] suspicious lines in dmesg after a successful resume

[28380.222521] [drm] Fence fallback timer expired on ring sdma0
[28381.782651] amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.0.0 (-110).
[28382.806536] amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.0 (-110).
[28383.830646] amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.0 (-110).
[28384.854633] amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.0 (-110).
[28385.878542] amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.0.1 (-110).
[28386.902644] amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110).
[28387.926661] amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110).
[28388.950637] amdgpu 0000:06:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110).
[28388.960377] [drm:process_one_work] *ERROR* ib ring test failed (-110).
[28389.019845] PM: resume devices took 9.508 seconds
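
To put a number on this, the timestamps of the failed ring tests can be compared (-110 is ETIMEDOUT on Linux, so each failure is a timeout). A rough sketch, assuming the default dmesg timestamp format shown above; the function name ib_failure_span is made up:

```shell
#!/bin/sh
# Read kernel log lines on stdin and report how many "IB test failed"
# messages there are and how much time lies between the first and last.
ib_failure_span() {
    awk '/IB test failed/ {
             ts = $0
             sub(/^\[ */, "", ts)   # drop the leading "[" and any padding
             sub(/\].*/, "", ts)    # drop everything after the timestamp
             if (n == 0) first = ts
             last = ts
             n++
         }
         END { if (n > 0) printf "%d failures over %.3f s\n", n, last - first }'
}

# Typical use (dmesg may need root):
#   dmesg | ib_failure_span
```

On the excerpt above this gives 8 failures over 7.168 s, i.e. roughly one second per compute ring, which would seem to account for most of the 9.5 seconds that resuming the devices took.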

Hi. In the Lenovo forums there’s this thread English Community-Lenovo Community