AMD dGPU RX 6400 screen keeps disconnecting

So this is the strangest behaviour I’ve been unable to track down to date, and that’s saying something when I have a Haswell + Nvidia in another rig.

Before I added the RX 6400 I could run 4K@30Hz with no problems; the only issue I had was with VA-API on 6.10+ kernels.

But now, after adding the RX 6400, man, what a ride.

The display didn’t want to work at all at first. I had to connect the HDMI cable to the motherboard’s HDMI port and then add amdgpu.runpm=0 to make the TV and the RX 6400 play nice.

Plymouth totally broke (the TV showed no input connected), so I had to force a resolution with video=1920x1080@30 as a kernel flag (this wasn’t needed with the Kaveri).

Now the TV randomly disconnects (it shows no input detected). It is only momentary and seems totally unrelated to anything I do or don’t do. I also can’t find a single trace of the event in any logs: I had journalctl -p 4 -f --grep "kwin|plasmashell|amdgpu|drm" running on screen while waiting for it to drop again, and got literally nothing.
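One thing worth noting about that command: -p 4 only shows messages of warning priority and above, so any debug-level DRM output is filtered out before the grep ever sees it. As a complement to the journal, here is a minimal sketch of a sysfs-based watcher. It assumes the connectors show up under /sys/class/drm as cardN-NAME directories containing a status file; the function names and the one-second poll interval are made up for illustration.

```shell
# Print "<connector> <status>" for every DRM connector under a sysfs root.
# The root defaults to /sys/class/drm; passing another directory is only
# useful for testing.
connector_status() {
    local root="${1:-/sys/class/drm}" f
    for f in "$root"/card*-*/status; do
        [ -e "$f" ] || continue            # glob matched nothing
        printf '%s %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

# Poll once per second and print a timestamped snapshot whenever anything
# changes. (Hypothetical helper; stop it with Ctrl-C.)
watch_connectors() {
    local prev="" cur
    while true; do
        cur="$(connector_status "$@")"
        if [ "$cur" != "$prev" ]; then
            printf '%s\n%s\n' "$(date '+%F %T')" "$cur"
            prev="$cur"
        fi
        sleep 1
    done
}

connector_status    # one-shot snapshot; run watch_connectors to keep logging
```

A one-second poll can miss a very short dropout, so running udevadm monitor --kernel --subsystem-match=drm alongside it may catch hotplug events the poll does not. If neither shows anything during a dropout, that would suggest the link is being lost without the driver seeing a hotplug at all. For more verbose kernel output, echo 0x04 | sudo tee /sys/module/drm/parameters/debug should enable KMS debug messages, which then appear in journalctl -k -f only if the -p 4 filter is dropped.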

The only things I’m seeing, none of which correlate with the random disconnections, are:

  • kwin moaning about Atomic Mode Setting failed
  • kwin warnings for unsupported GL_BACK_LEFT
  • amdgpu 0000:00:01.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd (-110)
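
The PCI address in that last message is worth a second look: going by the system info further down, 00:01.0 is the Kaveri iGPU and 03:00.0 is the RX 6400, so the UVD failure is logged against the iGPU. A rough sketch for splitting kernel messages per GPU follows, assuming the usual "amdgpu <domain:bus:dev.fn>:" prefix; count_by_gpu is a name invented here.

```shell
# Count amdgpu kernel messages per PCI device, reading log lines on stdin.
count_by_gpu() {
    grep -oE 'amdgpu [0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' \
        | sort | uniq -c
}

# Example with the line quoted above; on the real machine, pipe in
# `journalctl -k -b --no-pager` instead.
printf '%s\n' \
  'amdgpu 0000:00:01.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd (-110)' \
  | count_by_gpu
```

If the 03:00.0 device turns out to log nothing at all around a dropout, that narrows the search toward the link or the display side rather than the driver's error paths.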

Any clues on where to go hunting?
And, hopefully, where I should be logging bugs.

System info:

System:
  Kernel: 6.11.3-200.fc40.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 2.41-37.fc40
  Desktop: KDE Plasma v: 6.2.0 Distro: Fedora Linux 40 (KDE Plasma)
Machine:
  Type: Desktop System: MSI product: MS-7969 v: 1.0
    serial: <superuser required>
  Mobo: MSI model: A68HI AC (MS-7969) v: 1.0 serial: <superuser required>
    UEFI: American Megatrends v: 1.3 date: 04/13/2016
Battery:
  Device-1: hidpp_battery_0 model: Logitech Wireless Touch Keyboard K400 Plus
    charge: 100% (should be ignored) status: discharging
CPU:
  Info: quad core model: AMD A10-7800 Radeon R7 12 Compute Cores 4C+8G
    bits: 64 type: MCP arch: Steamroller rev: 1 cache: L1: 256 KiB L2: 4 MiB
  Speed (MHz): avg: 1400 min/max: 1400/3500 boost: enabled cores: 1: 1400
    2: 1400 3: 1400 4: 1400 bogomips: 27946
  Flags: avx ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: Advanced Micro Devices [AMD/ATI] Kaveri [Radeon R7 Graphics]
    vendor: Micro-Star MSI driver: amdgpu v: kernel arch: GCN-2 bus-ID: 00:01.0
  Device-2: Advanced Micro Devices [AMD/ATI] Navi 24 [Radeon RX 6400/6500
    XT/6500M] vendor: XFX driver: amdgpu v: kernel arch: RDNA-2
    bus-ID: 03:00.0
  Display: wayland server: Xwayland v: 24.1.3 compositor: kwin_wayland
    driver: N/A resolution: 1365x720
  API: EGL v: 1.5 drivers: radeonsi,swrast platforms:
    active: gbm,wayland,x11,surfaceless,device inactive: N/A
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 24.1.7 glx-v: 1.4
    direct-render: yes renderer: AMD Radeon RX 6400 (radeonsi navi24 LLVM
    18.1.6 DRM 3.59 6.11.3-200.fc40.x86_64)
  API: Vulkan v: 1.3.290 drivers: N/A surfaces: xcb,xlib,wayland devices: 3
Audio:
  Device-1: Advanced Micro Devices [AMD/ATI] Kaveri HDMI/DP Audio
    vendor: Micro-Star MSI driver: snd_hda_intel v: kernel bus-ID: 00:01.1
  Device-2: Advanced Micro Devices [AMD] FCH Azalia vendor: Micro-Star MSI
    driver: snd_hda_intel v: kernel bus-ID: 00:14.2
  Device-3: Advanced Micro Devices [AMD/ATI] Navi 21/23 HDMI/DP Audio
    driver: snd_hda_intel v: kernel bus-ID: 03:00.1
  Device-4: Meridian Explorer USB DAC driver: snd-usb-audio type: USB
    bus-ID: 6-2:3
  API: ALSA v: k6.11.3-200.fc40.x86_64 status: kernel-api
  Server-1: PipeWire v: 1.0.8 status: active
Network:
  Device-1: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet
    vendor: Micro-Star MSI driver: r8169 v: kernel port: d000 bus-ID: 04:00.0
  IF: enp4s0 state: up speed: 100 Mbps duplex: full mac: <filter>
  IF-ID-1: waylanbr0 state: down mac: <filter>
Drives:
  Local Storage: total: 931.51 GiB used: 55.26 GiB (5.9%)
  ID-1: /dev/sda vendor: Samsung model: SSD 860 EVO 1TB size: 931.51 GiB
Partition:
  ID-1: / size: 50 GiB used: 10.39 GiB (20.8%) fs: btrfs dev: /dev/sda7
  ID-2: /boot size: 1.9 GiB used: 413.8 MiB (21.3%) fs: ext4 dev: /dev/sda2
  ID-3: /boot/efi size: 511 MiB used: 19 MiB (3.7%) fs: vfat dev: /dev/sda1
  ID-4: /home size: 681.01 GiB used: 35.86 GiB (5.3%) fs: btrfs
    dev: /dev/sda6
  ID-5: /tmp size: 100 GiB used: 8.59 GiB (8.6%) fs: btrfs dev: /dev/sda5
  ID-6: /var/log size: 100 GiB used: 8.59 GiB (8.6%) fs: btrfs
    dev: /dev/sda5
  ID-7: /var/tmp size: 100 GiB used: 8.59 GiB (8.6%) fs: btrfs
    dev: /dev/sda5
Swap:
  ID-1: swap-1 type: zram size: 6.72 GiB used: 0 KiB (0.0%) dev: /dev/zram0
  ID-2: swap-2 type: partition size: 48 GiB used: 0 KiB (0.0%)
    dev: /dev/sda3
Sensors:
  System Temperatures: cpu: 58.0 C mobo: 40.0 C
  Fan Speeds (rpm): cpu: 0 fan-2: 1718 fan-3: 0
  GPU: device: amdgpu temp: 56.0 C fan: 1032 device: amdgpu temp: 42.0 C
  Power: 12v: N/A 5v: N/A 3.3v: 3.33 vbat: 3.25
Info:
  Memory: total: 8 GiB note: est. available: 6.72 GiB used: 1.89 GiB (28.1%)
  Processes: 298 Uptime: 2m Init: systemd target: graphical (5)
  Packages: 26 note: see --rpm Compilers: gcc: 14.2.1 Shell: Bash v: 5.2.26
    inxi: 3.3.36

Note: the RX 6400 is a PCIe 4.0 card and this motherboard is PCIe 3.0, and I do have amdgpu enabled for the Kaveri.
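
Since the card and the slot are different PCIe generations, it can be useful to see what link was actually negotiated. The sketch below reads the standard PCI sysfs attributes; pcie_link is a hypothetical helper, the default address is the RX 6400's bus ID from the inxi output above, and the second argument only exists so the function can be pointed at a test directory.

```shell
# Print negotiated vs. maximum PCIe link speed/width for a device.
pcie_link() {
    local dev="${1:-0000:03:00.0}" root="${2:-/sys/bus/pci/devices}"
    local d="$root/$dev" attr
    for attr in current_link_speed max_link_speed \
                current_link_width max_link_width; do
        if [ -r "$d/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$d/$attr")"
        fi
    done
}

pcie_link
```

On Navi cards the GPU function often sits behind the card's own PCIe bridge, so the slot link may be the one reported for the upstream bridge instead (possibly 0000:01:00.0 here, but that is a guess; lspci -t shows the tree). A width or speed that changes between boots, or after reseating the card, would point at a marginal physical connection.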

Oh, and this is with kernels 6.9.12, 6.10.12 and 6.11.3.

Random other kernel flags I tried:

  • amdgpu_aspm=0: made disconnects more frequent
  • amdgpu.dpm=0: no display past GRUB
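
When trying flags somewhat blindly, it helps to confirm that each one actually reached the driver. A small sketch, assuming module parameters are exposed under /sys/module/<module>/parameters/ as usual; param_value is a hypothetical helper, and its optional second argument only exists so it can be pointed at a test directory. Note that amdgpu options are normally written with a dot (amdgpu.runpm, amdgpu.dpm), so it may be worth checking whether the ASPM one was typed with an underscore on the real command line too.

```shell
# Print the current value of a module parameter given as "module.param",
# e.g. "amdgpu.runpm". Prints "unknown" if the module does not expose it,
# which usually means a typo or an unloaded module.
param_value() {
    local spec="$1" root="${2:-/sys/module}"
    local module="${spec%%.*}" param="${spec#*.}"
    local f="$root/$module/parameters/$param"
    if [ -r "$f" ]; then cat "$f"; else echo unknown; fi
}

for p in amdgpu.runpm amdgpu.dpm amdgpu.aspm; do
    printf '%s=%s\n' "$p" "$(param_value "$p")"
done
```

On Fedora, sudo grubby --update-kernel=ALL --args="amdgpu.runpm=0" adds a flag persistently and --remove-args drops it again; cat /proc/cmdline shows what the running kernel was actually booted with.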

Also, I realise my tone is very moany, but that’s not directed at anyone here.

Disabling the iGPU in the BIOS also makes no difference, which makes me think the iGPU has nothing to do with this.

The problems you describe sound like low-level kernel/driver problems to me. It looks like Fedora Linux 40 was released with kernel 6.8.5. Can you boot your system from a Fedora Linux 40 live image with the original 6.8.5 kernel and test if your hardware works with that kernel? Other than that, I might be tempted to try disabling even more power management (completely in the BIOS or maybe just for all PCIE devices).

Any hints on which flags I should be looking at? I was kind of blind firing.

I’d try pcie_aspm=off[1] but that is just a shot in the dark.


  1. https://www.kernel.org/doc/html/v6.11/admin-guide/kernel-parameters.html
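
To see whether an ASPM setting actually changed anything, the active policy can be read back from sysfs. This is a minimal sketch: aspm_policy is a made-up name, and it assumes the policy file lists all policies with the active one in square brackets.

```shell
# Print the active ASPM policy: the kernel marks it with [brackets].
aspm_policy() {
    local f="${1:-/sys/module/pcie_aspm/parameters/policy}"
    [ -r "$f" ] || { echo unknown; return; }
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$f"
}

aspm_policy
```

For the per-device view, sudo lspci -vv -s 03:00.0 | grep -i aspm shows what the link capabilities and link control registers report for the card.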

Already did… Didn’t help, but I’ll try adding more and more instead of individually testing each.

Wondering if it’s appropriate to log this on the AMD drm gitlab?

Yeah, I’ve seen a few reports of problems with amdgpu on recent kernels, e.g. “Getting Occasional Graphical Bugs / Artifacting After Switching to AMD” (post #2 by glb).

I wonder if AMD’s unstable HDMI goes beyond Polaris? For a 4K@60Hz display driven by several RX 580s in different computers, I had to use reduced blanking for that resolution over HDMI on my shortest cable, whereas a 1060 and a 3060 handled it as-is over a 15 ft cable.

Here’s a boot option:

video='HDMI-A-1:1920x1080MR@30'

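IIRC, M has the kernel calculate CVT timings for the requested resolution and refresh rate rather than look them up (so the mode doesn’t need to be in the display’s real EDID), and R applies reduced blanking (to lower the bandwidth and potentially improve stability).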
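
For reference, here is a sketch of how the pieces of that option fit together, based on a reading of the kernel's modedb documentation. video_arg is just an illustrative helper, not an existing tool.

```shell
# Build a video= kernel argument: <connector>:<xres>x<yres>[M][R]@<refresh>
#   M = calculate timings with CVT instead of looking the mode up in a table
#   R = use reduced blanking
video_arg() {
    local conn="$1" xres="$2" yres="$3" refresh="$4" flags="${5:-MR}"
    printf 'video=%s:%sx%s%s@%s\n' "$conn" "$xres" "$yres" "$flags" "$refresh"
}

video_arg HDMI-A-1 1920 1080 30    # prints video=HDMI-A-1:1920x1080MR@30
```

The connector name is the part after cardN- in /sys/class/drm, and cat /sys/class/drm/card1-HDMI-A-1/modes lists the modes that connector currently offers (the card index is a guess and may differ). With both the iGPU and the dGPU active there may be more than one HDMI-A connector, so it is worth checking which one the TV is really on.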

Will give this a shot! (and report back, thank you)

Also, I did end up opening a ticket after narrowing down the behaviour. I got rid of the ring buffer and UVD initialisation errors just by tightening and loosening different bolts until the card sat more stably in its slot.