[AMD] RX 9060XT Persistent Ring Timeout / GPU Reset on Fedora 43

Hi guys!
I recently switched from a 3060 Ti to an RX 9060 XT. It was pretty smooth until Monday.
I started work and opened gemini-cli, and it turns out I’m getting some nasty errors that freeze my entire screen (I could still hear my team talking on Google Meet, etc., but the screen was frozen). I asked an AI for help; some fixes made it better, but now it just takes a few seconds longer before everything bugs out.
This is the error log:

amdgpu: ring gfx_0.0.0 timeout, signaled seq=XXXXX
amdgpu: Process plasmashell pid XXXX thread plasmashell:cs0
amdgpu: Starting gfx_0.0.0 ring reset
amdgpu: Ring gfx_0.0.0 reset succeeded
[drm] device wedged, but recovered through reset
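
For anyone comparing logs, here is a minimal sketch that pulls the ring name and the offending process out of lines shaped like the ones above. The regular expressions are assumptions based only on the excerpt in this post (other kernel versions may word these messages differently), and the function name summarize_ring_timeouts is made up for illustration.

```python
import re

# Patterns modeled on the log excerpt quoted in this post (an assumption:
# other kernel versions may format these messages differently).
TIMEOUT_RE = re.compile(r"amdgpu: ring (\S+) timeout")
PROCESS_RE = re.compile(r"amdgpu: Process (\S+) pid")


def summarize_ring_timeouts(log_text):
    """Return one (ring, process) pair per ring timeout found in log_text.

    process is None when no "Process ..." line follows the timeout line.
    """
    events = []
    pending_ring = None
    for line in log_text.splitlines():
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            # A new timeout started before the previous one got a process line.
            if pending_ring is not None:
                events.append((pending_ring, None))
            pending_ring = timeout.group(1)
            continue
        process = PROCESS_RE.search(line)
        if process and pending_ring is not None:
            events.append((pending_ring, process.group(1)))
            pending_ring = None
    if pending_ring is not None:
        events.append((pending_ring, None))
    return events
```

Feeding it the kernel log of the current boot (for example the text printed by journalctl -k -b) should show whether it is always the same ring and the same process (plasmashell here) that triggers the reset.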

These are my specs:

  • GPU: AMD Radeon RX 9060XT (gfx1200).
  • OS: Fedora 43.
  • Kernel: 6.18.6-200.fc43.x86_64
  • Desktop Environment: KDE Plasma (Wayland).
  • Display Server: Wayland.
  • Mesa: Mesa 25.2.7

What I tried:

  1. Added amdgpu.sg_display=0 to kernel arguments (helped a bit).
  2. Tried amdgpu.dpm=1 and amdgpu.aspm=0
  3. Increased amdgpu.lockup_timeout to 10000.
  4. Removed amdgpu.mes=0 (disabling MES caused immediate timeouts).
  5. Tested with Foot (Wayland native) and Newelle (GTK4), but the issue persists when dragging windows or during heavy text rendering.
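
When juggling several kernel arguments like this, it is easy to lose track of which ones the running kernel actually booted with. A small sketch, assuming the options were passed on the kernel command line (readable from /proc/cmdline); the helper names are made up for illustration:

```python
from pathlib import Path


def amdgpu_params(cmdline):
    """Extract amdgpu.<name>=<value> options from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        if token.startswith("amdgpu.") and "=" in token:
            key, value = token.split("=", 1)
            # Drop the "amdgpu." prefix so keys read like module parameter names.
            params[key[len("amdgpu."):]] = value
    return params


def read_active_amdgpu_params(path="/proc/cmdline"):
    """Read the command line of the running kernel and return its amdgpu options."""
    return amdgpu_params(Path(path).read_text())
```

Calling read_active_amdgpu_params() after a reboot shows whether an option such as sg_display or lockup_timeout really made it onto the command line. The values the driver ended up with can usually also be read from the files under /sys/module/amdgpu/parameters/ while the module is loaded.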

Is there any fix for this? Are other people with newer graphics cards seeing this bug? Should I reinstall Fedora after switching GPUs?
Btw, on Windows it works normally. I tested a few minutes ago and there aren’t any hangups.

I have the same graphics card, and have also experienced frequent hard freezes on F43. When you pull logs, do you see the following error?

amdgpu 0000:0f:00.0: [drm] *ERROR* [CRTC:86:crtc-0] flip_done timed out

If so, see my related posts, which include a potential fix:

Also see related post below:

Hey Benjamin, thanks for the reply! I’ll give it a try tomorrow and also provide a more complete log. I find it curious that this crash is happening on Fedora 43. Since I’ve been working a lot with AI in the terminal, I might have to try Fedora 42 (I hope not!).

Well, it seems this error has been fixed. I have no idea how, because I’ve installed some Fedora updates and gemini-cli has updated about 3 times since this post. I’ll test with local LLMs, because those were also crashing my entire PC.

Are you saying the issue has been resolved without implementing any fixes / troubleshooting? How long has your system been running stable?

Yes. Before, asking gemini-cli this question would freeze my entire system. This simple prompt would crash everything:


And now I’m running a local LLM via llama.cpp with no problems (previously, starting the server in the terminal froze my system).

I didn’t apply any of the fixes (although I’ve read them all). It was a busy week and today I got some spare time to debug this. Turns out everything is working (at least, it seems to be working).

It was a funny bug: even running an external agent in my IDE would freeze my entire system.
I’ve had a good week coding without any AI :sweat_smile:

Got some issues installing ollama…

Feb 04 09:55:18 fedora kernel: amdgpu 0000:09:00.0: amdgpu: failed to suspend display audio
Feb 04 09:55:18 fedora kernel: amdgpu 0000:09:00.0: amdgpu: Dumping IP State
Feb 04 09:55:18 fedora kernel: amdgpu 0000:09:00.0: amdgpu: Dumping IP State Completed
Feb 04 09:55:18 fedora kernel: amdgpu 0000:09:00.0: amdgpu: MODE1 reset
Feb 04 09:55:18 fedora kernel: amdgpu 0000:09:00.0: amdgpu: GPU mode1 reset
Feb 04 09:55:18 fedora kernel: amdgpu 0000:09:00.0: amdgpu: GPU smu mode1 reset
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: GPU reset succeeded, trying to resume
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: PCIE GART of 512M enabled (table at 0x0000008000000000).
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: [drm] Check your /sys/class/drm/card1/device/devcoredump/data
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: VRAM is lost due to GPU reset!
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: PSP is resuming...
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: RAS: optional ras ta ucode is not available
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: RAP: optional rap ta ucode is not available
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: SECUREDISPLAY: optional securedisplay ta ucode is not available
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: SMU is resuming...
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: smu driver if version = 0x0000002e, smu fw if version = 0x00000032, smu fw program = 0, smu fw version = 0x00664500 (102.69.0)
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: SMU driver if version not matched
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: SMU is resumed successfully!
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: program CP_MES_CNTL : 0x4000000
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: program CP_MES_CNTL : 0xc000000
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: [drm] DMUB hardware initialized: version=0x0A000700
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring sdma0 uses VM inv eng 9 on hub 0
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring sdma1 uses VM inv eng 10 on hub 0
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: amdgpu: GPU reset(1) succeeded!
Feb 04 09:55:19 fedora kernel: amdgpu 0000:09:00.0: [drm] device wedged, but recovered through reset
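
A dump like this is easier to compare across crashes when boiled down to a few facts. The sketch below checks only for strings that appear verbatim in the log above and compares the two SMU interface versions. It does not judge whether the version mismatch is actually related to the reset, and the function names are made up for illustration.

```python
import re

# Matches the "smu driver if version = ..., smu fw if version = ..." line above.
SMU_RE = re.compile(
    r"smu driver if version = (0x[0-9a-fA-F]+), "
    r"smu fw if version = (0x[0-9a-fA-F]+)"
)


def smu_interface_versions(log_text):
    """Return (driver_version, firmware_version) as ints, or None if not logged."""
    match = SMU_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def reset_summary(log_text):
    """Summarize a GPU reset using only messages seen in the log above."""
    versions = smu_interface_versions(log_text)
    return {
        "mode1_reset": "MODE1 reset" in log_text,
        "vram_lost": "VRAM is lost due to GPU reset" in log_text,
        "recovered": "recovered through reset" in log_text,
        "smu_if_mismatch": versions is not None and versions[0] != versions[1],
    }
```

Unlike the per-ring reset in the first post, this one is a full MODE1 reset with VRAM lost, so a summary like this makes it quick to tell the two kinds of event apart when attaching logs to a bug report.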

My guess is that ollama uses ROCm. I think I’m drifting off topic, though, since this wasn’t the main problem.