Issues with AMD GPU

I installed Fedora 41 Workstation today because I got an AMD Radeon RX 7800 XT after using an NVIDIA GPU for the past 2 years.

I noticed 2 issues:

The first is a graphical glitch. It also appears in screenshots, so I am ruling out a hardware defect.
[image]

The second issue is that when I try to wake the computer from sleep, the screens stay black and report no signal, although the keyboard and mouse light up and the fans and the water-cooling pump spin up. I tried setting amdgpu.runpm=0 with no luck. Windows 11 24H2 wakes up fine.

Help appreciated.

System Details Report


Report details

  • Date generated: 2024-11-02 20:35:45

Hardware Information:

  • Hardware Model: Micro-Star International Co., Ltd. MS-7D70
  • Memory: 64.0 GiB
  • Processor: AMD Ryzen™ 9 7950X × 32
  • Graphics: AMD Radeon™ RX 7800 XT
  • Disk Capacity: 5.0 TB

Software Information:

  • Firmware Version: 1.M1
  • OS Name: Fedora Linux 41 (Workstation Edition)
  • OS Build: (null)
  • OS Type: 64-bit
  • GNOME Version: 47
  • Windowing System: Wayland
  • Kernel Version: Linux 6.11.5-300.fc41.x86_64

I had the same wake problem. After many searches, this is what I found: I have an MSI motherboard, and in the BIOS there is a wake setup section (wake on LAN, or USB, something like that). The last option was to enable OS. That did it! No more problems resuming from sleep!

I tried that too but it didn’t have any effect

Power management requires coordination between vendor firmware and Linux, so new kernels may require updated vendor firmware. The ACPI open standard tells Linux how to discover and configure the hardware. Some vendors support acpi_osi=linux on the kernel command line; others may need system-dependent entries.

I don’t think it’s motherboard related since it worked fine under F40 + Nvidia a few weeks ago

The glitch is possibly a result of a driver bug relating to power management:

If that is the cause, you could disable this PM feature by adding this kernel boot parameter:

amdgpu.dcdebugmask=0x10

Please note that disabling this feature could increase power usage slightly, but since you are not on a laptop that probably is not a problem.
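In case it helps, here is a rough sketch of how a boot parameter like this can be added on Fedora with grubby, plus a small check that it is on the running kernel's command line after a reboot. The function names are made up for illustration, and the grubby calls need root.

```shell
# Hypothetical helper names; grubby is the tool Fedora ships for editing boot entries.

# Append a parameter to the boot entry of every installed kernel (needs root).
add_kernel_param() {
    sudo grubby --update-kernel=ALL --args="$1"
}

# Remove the parameter again if it turns out not to help.
remove_kernel_param() {
    sudo grubby --update-kernel=ALL --remove-args="$1"
}

# Succeed if the parameter is present on a kernel command line.
# Reads /proc/cmdline unless a command-line string is passed as the second argument.
has_kernel_param() {
    wanted=$1
    cmdline=${2:-$(cat /proc/cmdline)}
    for word in $cmdline; do
        if [ "$word" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Example:
#   add_kernel_param amdgpu.dcdebugmask=0x10
#   (reboot)
#   has_kernel_param amdgpu.dcdebugmask=0x10 && echo "parameter is active"
```

If the check fails after a reboot, the parameter never reached the kernel, so any "it didn't help" result would not say much about the driver.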


I tried that and the glitches don’t go away. I should note that they only appear on the title bar of the Terminal when the window is unfocused (I just found that out while trying).

I have the same problem. The solution doesn’t work.

I’ve tried acpi_osi=linux as well, and that one didn’t work either.

This is my configuration:

System:
  Kernel: 6.11.5-300.fc41.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 2.43.1-2.fc41
  Desktop: GNOME v: 47.1 tk: GTK v: 3.24.43 wm: gnome-shell dm: GDM
    Distro: Fedora Linux 41 (Workstation Edition)
Machine:
  Type: Laptop System: LENOVO product: 82MS v: Yoga Slim 7 Pro 14ACH5
    serial: <superuser required> Chassis: type: 10 v: Yoga Slim 7 Pro 14ACH5
    serial: <superuser required>
  Mobo: LENOVO model: LNVNB161216 v: SDK0T76461WIN
    serial: <superuser required> part-nu: LENOVO_MT_82MS_BU_idea_FM_Yoga Slim 7
    Pro 14ACH5 UEFI: LENOVO v: GZCN23WW date: 10/11/2021
Battery:
  ID-1: BAT0 charge: 54.1 Wh (99.3%) condition: 54.5/61.0 Wh (89.4%)
    volts: 17.4 min: 15.4 model: Sunwoda L19D4PH3 serial: <filter>
    status: not charging
CPU:
  Info: 8-core model: AMD Ryzen 9 5900HX with Radeon Graphics bits: 64
    type: MT MCP arch: Zen 3 rev: 0 cache: L1: 512 KiB L2: 4 MiB L3: 16 MiB
  Speed (MHz): avg: 400 min/max: 400/4680 boost: enabled cores: 1: 400
    2: 400 3: 400 4: 400 5: 400 6: 400 7: 400 8: 400 9: 400 10: 400 11: 400
    12: 400 13: 400 14: 400 15: 400 16: 400 bogomips: 105400
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: Advanced Micro Devices [AMD/ATI] Cezanne [Radeon Vega Series /
    Radeon Mobile Series] vendor: Lenovo driver: amdgpu v: kernel arch: GCN-5
    pcie: speed: 8 GT/s lanes: 16 ports: active: DP-3,eDP-1
    empty: DP-1, DP-2, DP-4, DP-5 bus-ID: 03:00.0 chip-ID: 1002:1638
    temp: 45.0 C
  Device-2: IMC Networks Integrated Camera driver: uvcvideo type: USB
    rev: 2.0 speed: 480 Mb/s lanes: 1 bus-ID: 1-3:3 chip-ID: 13d3:5419
  Display: wayland server: X.org v: 1.21.1.14 with: Xwayland v: 24.1.4
    compositor: gnome-shell driver: X: loaded: modesetting alternate: fbdev,vesa
    dri: radeonsi gpu: amdgpu display-ID: 0
  Monitor-1: DP-3 model: Samsung U28E570 res: 3840x2160 dpi: 160
    diag: 699mm (27.5")
  Monitor-2: eDP-1 model: BOE Display 0x0931 res: 2240x1400 dpi: 188
    diag: 356mm (14")
  API: OpenGL v: 4.6 vendor: amd mesa v: 24.2.5 glx-v: 1.4 es-v: 3.2
    direct-render: yes renderer: AMD Radeon Graphics (radeonsi renoir LLVM
    19.1.0 DRM 3.59 6.11.5-300.fc41.x86_64) device-ID: 1002:1638
    display-ID: :0.0
  API: EGL Message: EGL data requires eglinfo. Check --recommends.
Audio:
  Device-1: Advanced Micro Devices [AMD/ATI] Renoir Radeon High Definition
    Audio vendor: Lenovo driver: snd_hda_intel v: kernel pcie: speed: 8 GT/s
    lanes: 16 bus-ID: 03:00.1 chip-ID: 1002:1637
  Device-2: Advanced Micro Devices [AMD] ACP/ACP3X/ACP6x Audio Coprocessor
    vendor: Lenovo driver: N/A pcie: speed: 8 GT/s lanes: 16 bus-ID: 03:00.5
    chip-ID: 1022:15e2
  Device-3: Advanced Micro Devices [AMD] Family 17h/19h HD Audio
    vendor: Lenovo driver: snd_hda_intel v: kernel pcie: speed: 8 GT/s lanes: 16
    bus-ID: 03:00.6 chip-ID: 1022:15e3
  API: ALSA v: k6.11.5-300.fc41.x86_64 status: kernel-api
  Server-1: JACK v: 1.9.22 status: off
  Server-2: PipeWire v: 1.2.6 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
Network:
  Device-1: MEDIATEK MT7921 802.11ax PCI Express Wireless Network Adapter
    vendor: Lenovo driver: mt7921e v: kernel pcie: speed: 5 GT/s lanes: 1
    bus-ID: 01:00.0 chip-ID: 14c3:7961
  IF: wlp1s0 state: up mac: <filter>
  Device-2: ASIX AX88179 Gigabit Ethernet driver: cdc_ncm type: USB rev: 3.2
    speed: 5 Gb/s lanes: 1 bus-ID: 4-1.3.2:5 chip-ID: 0b95:1790
  IF: eth0 state: down mac: <filter>
  IF-ID-1: br-2b8b6a842b25 state: down mac: <filter>
  IF-ID-2: br-c227916573d8 state: down mac: <filter>
  IF-ID-3: docker0 state: down mac: <filter>
Bluetooth:
  Device-1: Foxconn / Hon Hai MediaTek Bluetooth Adapter driver: btusb v: 0.8
    type: USB rev: 2.1 speed: 480 Mb/s lanes: 1 bus-ID: 3-4:3 chip-ID: 0489:e0cd
  Report: btmgmt ID: hci0 rfk-id: 2 state: up address: <filter> bt-v: 5.2
    lmp-v: 11
Drives:
  Local Storage: total: 968.19 GiB used: 597.87 GiB (61.8%)
  ID-1: /dev/nvme0n1 vendor: SK Hynix model: HFS001TDE9X084N
    size: 953.87 GiB speed: 31.6 Gb/s lanes: 4 serial: <filter> temp: 41.9 C
  ID-2: /dev/sda vendor: SanDisk model: Ultra size: 14.32 GiB type: USB
    rev: 3.0 spd: 5 Gb/s lanes: 1 serial: <filter>
Partition:
  ID-1: / size: 952.28 GiB used: 597.45 GiB (62.7%) fs: btrfs
    dev: /dev/nvme0n1p3
  ID-2: /boot size: 973.4 MiB used: 412.2 MiB (42.3%) fs: ext4
    dev: /dev/nvme0n1p2
  ID-3: /boot/efi size: 598.8 MiB used: 19.3 MiB (3.2%) fs: vfat
    dev: /dev/nvme0n1p1
  ID-4: /home size: 952.28 GiB used: 597.45 GiB (62.7%) fs: btrfs
    dev: /dev/nvme0n1p3
Swap:
  ID-1: swap-1 type: file size: 16 GiB used: 0 KiB (0.0%) priority: -2
    file: /swap
  ID-2: swap-2 type: zram size: 13.5 GiB used: 0 KiB (0.0%) priority: 100
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 51.8 C mobo: 40.0 C gpu: amdgpu temp: 45.0 C
  Fan Speeds (rpm): N/A
Info:
  Memory: total: 16 GiB note: est. available: 13.5 GiB used: 5.48 GiB (40.6%)
  Processes: 490 Power: uptime: 4m wakeups: 0 Init: systemd v: 256
    target: graphical (5) default: graphical
  Packages: pm: rpm pkgs: N/A note: see --rpm pm: flatpak pkgs: 36
    Compilers: gcc: 14.2.1 Shell: Zsh v: 5.9 running-in: tilix inxi: 3.3.36

I think it might be related to the kernel and amdgpu

Something bizarre just happened: while I was recording with GNOME’s screenshot utility, the amdgpu driver reset the GPU, which resulted in GNOME and XWayland crashing. I think the amdgpu driver is faulty.

To test that, you could try running your PC without that driver loaded and see if any of the problems continue to happen. Your video might be limited in many ways (resolution, refresh rate, bit depth, etc.), but if XWayland still crashes, then you would know that it is not because of the amdgpu driver. To run your PC without the amdgpu driver loaded, add rd.driver.blacklist=amdgpu and modprobe.blacklist=amdgpu to your list of kernel parameters.

You can confirm that the driver isn’t loaded by looking at the output from lsmod and/or lspci -k. (The lspci -k output should not show Kernel driver in use: amdgpu.)
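A small sketch of that check, in case anyone wants to script it. Both helpers read command output on standard input, so they can also be tried on saved output; the function names are only illustrative.

```shell
# Succeed if `lspci -k` output on stdin shows amdgpu bound to a device.
amdgpu_in_use() {
    grep -q 'Kernel driver in use: amdgpu'
}

# Succeed if `lsmod` output on stdin lists the amdgpu module.
amdgpu_module_loaded() {
    awk '$1 == "amdgpu" { found = 1 } END { exit !found }'
}

# Example on a live system:
#   lspci -k | amdgpu_in_use && echo "amdgpu is still driving the GPU"
#   lsmod | amdgpu_module_loaded || echo "amdgpu module is not loaded"
```

Note that `lspci -k` may still list amdgpu under "Kernel modules:" when the blacklist works; only the "Kernel driver in use:" line matters here.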


Thanks, will try that when I get home

So I blacklisted the driver and the artifact disappeared. The sleep issue still persisted. Kernel 6.11.6 came out and it still didn’t fix the issues.


So the problem with the screen not coming back up is probably due to something going wrong deeper in the kernel. I see an old report where someone said the kernel parameter acpi_sleep=s3_bios resolved that sort of problem. [1] Does that parameter work around the resume-from-suspend problem in your case?

Reading further, that workaround is unlikely to work on UEFI systems. I’ll keep looking.


[1] https://bugs.freedesktop.org/show_bug.cgi?id=42960#c20

The odd thing is that there are no journald logs.
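For anyone checking the same thing: after a failed resume and a hard reset, the relevant messages would be in the previous boot's journal, not the current one. A minimal sketch, assuming the journal is persistent (so earlier boots are kept); the function names and the search pattern are just illustrative.

```shell
# Print kernel messages from the previous boot (the one that failed to resume).
previous_boot_kernel_log() {
    journalctl -b -1 -k --no-pager
}

# Keep only lines on stdin that mention suspend, resume, or amdgpu.
filter_suspend_lines() {
    grep -iE 'PM: suspend|resume|amdgpu'
}

# Example:
#   previous_boot_kernel_log | filter_suspend_lines
```

If this prints a suspend entry but nothing after it, the kernel probably never got far enough on resume to write anything to disk.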

I could try installing another Linux distro to see if it works there

There are some reports of issues that look similar to yours here: Issues · drm / amd · GitLab

It’s on the AMD issue tracker, but some of them are reporting that the problems appear to be happening in the kernel power management code. I didn’t dig through them thoroughly, but you might find a workaround in one of those bug reports.

Those issues are related to S4 sleep; I’m having issues with S3 sleep. But yeah, it could indicate that something is off with the kernel’s power management.
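To double-check which suspend variant the kernel actually uses, /sys/power/mem_sleep lists the available modes with the active one in brackets ("deep" is S3, "s2idle" is suspend-to-idle). A small sketch with a made-up function name:

```shell
# Print the active suspend mode, i.e. the bracketed entry in /sys/power/mem_sleep.
# Reads the file unless a mode string is passed as the first argument.
active_sleep_mode() {
    modes=${1:-$(cat /sys/power/mem_sleep)}
    printf '%s\n' "$modes" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Example:
#   active_sleep_mode
#   active_sleep_mode 's2idle [deep]'
```

If this reports s2idle and not deep, the S3-tagged bug reports may not match what the machine is doing.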

There is a tag for that too: https://gitlab.freedesktop.org/drm/amd/-/issues?label_name=S3

Edit: Also, it’s more to dig through, but FWIW, here is a link to all open power management bugs reported at kernel.org: https://bugzilla.kernel.org/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&order=changeddate%20DESC%2Cbug_status%2Cpriority%2Cassigned_to%2Cbug_id&product=Power%20Management&query_format=advanced&short_desc=.*&short_desc_type=regexp


Power management needs support from vendor firmware, so it is not unusual to find vendor firmware bugs causing kernel issues.