Hello everyone! I've run into a problem where the AMD integrated graphics (iGPU) seems to stop working properly, or at least that is my guess. My hardware is an AMD RX 6750 XT and a Ryzen 7 7800X3D, running Fedora 43.
Sometimes VS Code suddenly freezes while I'm using it. One of its threads turns into a zombie and cannot be killed. At that point btop shows the graphics card working normally, but the integrated GPU's memory usage stays at about 200 MB even after VS Code is closed. Normally, opening and closing VS Code changes the integrated GPU's memory usage. VS Code also cannot be reopened in this state, although it does start when launched with --disable-gpu.
More importantly, once this happens I cannot restart the desktop session, reboot, or shut down. If I try any of these, the machine stays powered on but the display shows no signal.
With an AI's help I found these log entries:
- Device: 13:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Raphael [1002:164e] (rev cb)
- Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:7d76]
- Kernel driver: amdgpu
GPU reset failed:
12月 09 21:17:57 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000017 SMN_C2PMSG_82:0x00000000
12月 09 21:17:57 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: Failed to disable gfxoff!
12月 09 21:18:01 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000017 SMN_C2PMSG_82:0x00000000
12月 09 21:18:01 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: Failed to disable smu features.
12月 09 21:18:02 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: MODE2 reset
12月 09 21:18:06 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000017 SMN_C2PMSG_82:0x00000000
12月 09 21:18:06 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: Failed to mode reset!
12月 09 21:18:06 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: Mode2 reset failed!
12月 09 21:18:06 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: ASIC reset failed with error, -62 for drm dev, 0000:13:00.0
12月 09 21:18:06 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: GPU reset end with ret = -62
12月 09 21:18:06 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: GPU Recovery Failed: -62
GPU coredump:
12月 09 21:18:16 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: Dumping IP State
12月 09 21:18:16 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: Dumping IP State Completed
12月 09 21:18:16 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
12月 09 21:18:16 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
12月 09 21:18:16 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=47934, emitted seq=47934
12月 09 21:18:16 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: Starting gfx_0.0.0 ring reset
12月 09 21:18:16 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: Ring gfx_0.0.0 reset failed
12月 09 21:18:16 192.168.1.5 kernel: amdgpu 0000:13:00.0: amdgpu: GPU reset begin!
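For anyone who wants to look for the same messages on their own machine, here is a rough sketch of how entries like the ones above can be filtered out of the kernel log. It assumes a systemd journal (as on Fedora); `filter_amdgpu_errors` is just a helper name made up for this post, and the keyword list is a guess based on the messages seen here, so it may miss other relevant lines.

```shell
# Hypothetical helper: reads log lines on stdin and keeps only amdgpu
# lines that mention a failure, a timeout, or the SMU "not done" message.
filter_amdgpu_errors() {
  grep -i 'amdgpu' | grep -iE 'fail|timeout|not done'
}

# Usage (assumes the systemd journal; -k = kernel messages, -b = current boot):
#   journalctl -k -b --no-pager | filter_amdgpu_errors
```

Using `-b -1` instead of `-b` reads the previous boot, which helps when the machine had to be powered off by hand after the hang.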
I’m not sure whether these logs are enough. Could someone tell me how to solve this?