Where to report AMD GPU lockups/hangs seemingly triggered by Firefox / VA-API after resuming from suspend?

Since June 24th (or maybe a few days earlier) I’ve been experiencing debilitating AMD radeonsi GPU lockups (my card is a “Pitcairn” R9 270) on my main Fedora 35 workstation, running GNOME on Xorg with the default open source AMD drivers and the default Firefox package provided by Fedora, fully up to date. They typically happen when opening a page in a new tab in Firefox, particularly (or always?) when the page contains a video (a YouTube tab, for example).

The system then locks up solidly, with this typical dreaded error in journalctl:

jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10082msec
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000001761 last fence id 0x0000000000001762 on ring 5)
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 10120msec
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000002855a last fence id 0x0000000000028565 on ring 0)
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 10080msec
jun 30 13:22:45 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000007dc7 last fence id 0x0000000000007dcd on ring 3)
jun 30 13:22:46 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10585msec
jun 30 13:22:46 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000001761 last fence id 0x0000000000001762 on ring 5)

…etc.
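
To see at a glance which rings are stalling, the “ring N stalled” lines can be tallied. This is a minimal sketch, not an official tool: the ring names are an assumption taken from the debugfs file names the radeon driver prints further down in this thread (gfx, cp1, cp2, dma1, dma2, uvd, vce1, vce2), read in ring-index order, and `summarize_stalls` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed mapping from ring index to name, based on the order of the
# radeon_ring_* debugfs names in the resume log. Treat it as a hint only.
RING_NAMES = {0: "gfx", 1: "cp1", 2: "cp2", 3: "dma1",
              4: "dma2", 5: "uvd", 6: "vce1", 7: "vce2"}

STALL_RE = re.compile(r"radeon \S+ ring (\d+) stalled for more than (\d+)msec")


def summarize_stalls(lines):
    """Return {ring_name: (stall_count, longest_stall_msec)} for journal lines."""
    counts = Counter()
    longest = {}
    for line in lines:
        match = STALL_RE.search(line)
        if match is None:
            continue  # not a "ring N stalled" line
        ring, msec = int(match.group(1)), int(match.group(2))
        name = RING_NAMES.get(ring, f"ring{ring}")
        counts[name] += 1
        longest[name] = max(longest.get(name, 0), msec)
    return {name: (counts[name], longest[name]) for name in counts}
```

The function takes any iterable of log lines, for example the saved output of `journalctl -k -b -1` split into lines. If the assumed mapping holds, ring 5 would be the UVD (video decode) ring, which would fit the video-playback trigger, but that is an inference rather than a confirmed diagnosis.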

Only a full reboot via SSH works (and even then, it takes forever, because you have to wait for systemd to “give up” waiting for Firefox and for the filesystems to unmount at the end).

Those “ring stalled” GPU lockup errors are immediately preceded by the following messages, so I’m not sure if the lockup is actually caused by the VA-API implementation in Firefox, or if it’s just triggered by it and the bug is in Mesa, the kernel, etc.:

jun 30 13:22:32 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:32 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:32 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:33 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:33 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:35 workstation gnome-shell[5654]: [2022-06-30T17:22:35Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
jun 30 13:22:35 workstation rtkit-daemon[1456]: Recovering from system lockup, not allowing further RT threads.
jun 30 13:22:35 workstation gnome-shell[4682]: libva info: VA-API version 1.13.0
jun 30 13:22:35 workstation gnome-shell[4682]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
jun 30 13:22:35 workstation gnome-shell[4682]: libva info: Found init function __vaDriverInit_1_13
jun 30 13:22:35 workstation gnome-shell[4682]: ATTENTION: default value of option mesa_glthread overridden by environment.
jun 30 13:22:35 workstation gnome-shell[4682]: libva info: va_openDriver() returns 0
jun 30 13:22:36 workstation gnome-shell[5654]: [2022-06-30T17:22:36Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
jun 30 13:22:36 workstation gnome-shell[5654]: [2022-06-30T17:22:36Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
jun 30 13:22:36 workstation gnome-shell[5654]: [2022-06-30T17:22:36Z ERROR mp4parse] Found 2 nul bytes in "\0\0"

At first I thought, maybe Mesa 21.3.9 fixes this, since it reportedly fixes “a crash in radeonsi driver”, but nope, it still occurs with that version.

Now, after days of headbanging, trying different kernels, trying the amdgpu.dpm=0 kernel boot option, etc., I think I have narrowed down the bug trigger to these conditions, which make it nearly 100% reproducible for me:

  • The system must have been suspended (put to sleep) once, then resumed
  • The system must be running in the Xorg version of GNOME; much to my surprise, the hang doesn’t seem to occur when running under the Wayland version of GNOME (Update: it does happen with Wayland too.)
  • The issue is then triggered by trying to load a YouTube video tab (or play a video in an existing tab)
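
The conditions above can be checked against a saved journal excerpt. This is a minimal sketch under stated assumptions: the marker strings are copied from the logs in this thread, `classify_lockup` is a hypothetical helper name, and the check only looks at ordering, so it shows correlation, not cause.

```python
def classify_lockup(lines):
    """Report which suspected trigger conditions precede the first GPU lockup.

    Markers (taken from the journal excerpts in this thread):
      - "PM: suspend exit"                      -> system resumed from suspend
      - "libva info: va_openDriver() returns 0" -> a VA-API driver was opened
      - "GPU lockup"                            -> radeon reported a lockup
    """
    resumed = False
    vaapi_opened = False
    for line in lines:
        if "PM: suspend exit" in line:
            resumed = True
        elif "libva info: va_openDriver() returns 0" in line:
            vaapi_opened = True
        elif "GPU lockup" in line:
            # Stop at the first lockup: later lines are consequences.
            return {"lockup": True,
                    "after_resume": resumed,
                    "after_vaapi_open": vaapi_opened}
    return {"lockup": False,
            "after_resume": resumed,
            "after_vaapi_open": vaapi_opened}
```

Running this over the journal of several boots (some with a suspend/resume cycle, some without) would show whether lockups ever occur with `after_resume` false, which is the quickest way to confirm or refute the first condition.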

My question to you now is: where do I file a bug about this?

As you can see, the main issue is that whenever I encounter GPU lockups, I’m never sure who the culprit is: upstream, downstream, Firefox, Mesa, Mutter/GNOME Shell, Xorg vs. Wayland, the Linux kernel, etc., so I’m at a loss as to where the bug report should go. Fedora’s “How to file a bug” guide (if that’s the right place to look) doesn’t have a section explaining how to work out which part of this complex middleware+userland mix is to blame, or how to triage/troubleshoot these types of mandelbugs.

If I didn’t miss something obvious here, and unless the ask.fedora forums are the main place to do the initial troubleshooting, then maybe this is an opportunity for the Fedora community to improve its guidance on how to report these types of bugs? :thinking:


Update: I spoke too soon / was too optimistic. Although I thought the issue had stopped occurring when running under Wayland, it turns out it still happens, maybe just less frequently? After letting the computer auto-suspend and waking it up, I tried resuming playback of a YouTube video and it immediately froze the system; the screen went black after a few seconds. The machine was still accessible over SSH as usual; below is what I saw in the logs.

When the computer goes to sleep by itself:

Jun 30 18:14:30 workstation kernel: Freezing user space processes ... (elapsed 0.003 seconds) done.
Jun 30 18:14:30 workstation kernel: OOM killer disabled.
Jun 30 18:14:30 workstation kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Jun 30 18:14:30 workstation kernel: printk: Suspending console(s) (use no_console_suspend to debug)
Jun 30 18:14:30 workstation kernel: serial 00:03: disabled
Jun 30 18:14:30 workstation kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Jun 30 18:14:30 workstation kernel: parport_pc 00:02: disabled
Jun 30 18:14:30 workstation kernel: sd 0:0:0:0: [sda] Stopping disk
Jun 30 18:14:30 workstation kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
Jun 30 18:14:30 workstation kernel: sd 3:0:0:0: [sdc] Synchronizing SCSI cache
Jun 30 18:14:30 workstation kernel: sd 1:0:0:0: [sdb] Stopping disk
Jun 30 18:14:30 workstation kernel: sd 3:0:0:0: [sdc] Stopping disk
Jun 30 18:14:30 workstation kernel: PM: suspend devices took 7.421 seconds
Jun 30 18:14:30 workstation kernel: ACPI: PM: Preparing to enter system sleep state S3
Jun 30 18:14:30 workstation kernel: ACPI: PM: Saving platform NVS memory
Jun 30 18:14:30 workstation kernel: Disabling non-boot CPUs ...
Jun 30 18:14:30 workstation kernel: smpboot: CPU 1 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 2 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 3 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 4 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 5 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 6 is now offline
Jun 30 18:14:30 workstation kernel: smpboot: CPU 7 is now offline

When waking the system from suspend:

Jun 30 18:14:30 workstation kernel: ACPI: PM: Low-level resume complete
Jun 30 18:14:30 workstation kernel: ACPI: PM: Restoring platform NVS memory
Jun 30 18:14:30 workstation kernel: Enabling non-boot CPUs ...
Jun 30 18:14:30 workstation kernel: x86: Booting SMP configuration:
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 1 APIC 0x2
Jun 30 18:14:30 workstation kernel: CPU1 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 2 APIC 0x4
Jun 30 18:14:30 workstation kernel: CPU2 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 3 APIC 0x6
Jun 30 18:14:30 workstation kernel: CPU3 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 4 APIC 0x1
Jun 30 18:14:30 workstation kernel: CPU4 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 5 APIC 0x3
Jun 30 18:14:30 workstation kernel: CPU5 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 6 APIC 0x5
Jun 30 18:14:30 workstation kernel: CPU6 is up
Jun 30 18:14:30 workstation kernel: smpboot: Booting Node 0 Processor 7 APIC 0x7
Jun 30 18:14:30 workstation kernel: CPU7 is up
Jun 30 18:14:30 workstation kernel: ACPI: PM: Waking up from system sleep state S3
Jun 30 18:14:30 workstation kernel: usb usb3: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb4: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb5: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb6: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb7: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: usb usb8: root hub lost power or was reset
Jun 30 18:14:30 workstation kernel: sd 0:0:0:0: [sda] Starting disk
Jun 30 18:14:30 workstation kernel: sd 1:0:0:0: [sdb] Starting disk
Jun 30 18:14:30 workstation kernel: sd 3:0:0:0: [sdc] Starting disk
Jun 30 18:14:30 workstation kernel: tg3 0000:05:00.0 enp5s0: Link is down
Jun 30 18:14:30 workstation kernel: parport_pc 00:02: activated
Jun 30 18:14:30 workstation kernel: [drm] PCIE gen 2 link speeds already enabled
Jun 30 18:14:30 workstation kernel: [drm] PCIE GART of 2048M enabled (table at 0x00000000001D6000).
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: WB enabled
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
Jun 30 18:14:30 workstation kernel: serial 00:03: activated
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 6 use gpu addr 0x0000000080000c18
Jun 30 18:14:30 workstation kernel: radeon 0000:02:00.0: fence driver on ring 7 use gpu addr 0x0000000080000c1c
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_gfx' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_cp1' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_cp2' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_dma1' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_dma2' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: [drm] ring test on 0 succeeded in 3 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 1 succeeded in 1 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 2 succeeded in 1 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 3 succeeded in 6 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 4 succeeded in 5 usecs
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_uvd' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: usb 8-1: reset full-speed USB device number 2 using uhci_hcd
Jun 30 18:14:30 workstation kernel: usb 3-2: reset full-speed USB device number 3 using uhci_hcd
Jun 30 18:14:30 workstation kernel: ata3: SATA link down (SStatus 0 SControl 300)
Jun 30 18:14:30 workstation kernel: ata6: SATA link down (SStatus 0 SControl 300)
Jun 30 18:14:30 workstation kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 30 18:14:30 workstation kernel: ata5: SATA link down (SStatus 0 SControl 300)
Jun 30 18:14:30 workstation kernel: ata1.00: configured for UDMA/133
Jun 30 18:14:30 workstation kernel: [drm] ring test on 5 succeeded in 2 usecs
Jun 30 18:14:30 workstation kernel: [drm] UVD initialized successfully.
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_vce1' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: debugfs: File 'radeon_ring_vce2' in directory '0' already present!
Jun 30 18:14:30 workstation kernel: [drm] ring test on 6 succeeded in 14 usecs
Jun 30 18:14:30 workstation kernel: [drm] ring test on 7 succeeded in 4 usecs
Jun 30 18:14:30 workstation kernel: [drm] VCE initialized successfully.
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 1 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 2 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 4 succeeded in 0 usecs
Jun 30 18:14:30 workstation kernel: usb 3-1: reset low-speed USB device number 2 using uhci_hcd
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 5 succeeded
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 6 succeeded
Jun 30 18:14:30 workstation kernel: [drm] ib test on ring 7 succeeded
Jun 30 18:14:30 workstation kernel: tg3 0000:05:00.0 enp5s0: Link is up at 1000 Mbps, full duplex
Jun 30 18:14:30 workstation kernel: tg3 0000:05:00.0 enp5s0: Flow control is on for TX and on for RX
Jun 30 18:14:30 workstation kernel: [drm:si_dpm_set_power_state [radeon]] *ERROR* si_set_sw_state failed
Jun 30 18:14:30 workstation kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 30 18:14:30 workstation kernel: ata2.00: configured for UDMA/133
Jun 30 18:14:30 workstation kernel: ata4: link is slow to respond, please be patient (ready=0)
Jun 30 18:14:30 workstation kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Jun 30 18:14:30 workstation kernel: ata4.00: configured for UDMA/133
Jun 30 18:14:30 workstation kernel: PM: resume devices took 7.609 seconds
Jun 30 18:14:30 workstation kernel: OOM killer enabled.
Jun 30 18:14:30 workstation kernel: Restarting tasks ... done.
Jun 30 18:14:30 workstation kernel: PM: suspend exit
Jun 30 18:14:30 workstation kernel: rfkill: input handler enabled

Jun 30 18:14:30 workstation rtkit-daemon[1007]: The canary thread is apparently starving. Taking action.
Jun 30 18:14:30 workstation systemd-resolved[970]: Clock change detected. Flushing caches.
Jun 30 18:14:30 workstation rtkit-daemon[1007]: Demoting known real-time threads.
Jun 30 18:14:30 workstation rtkit-daemon[1007]: Successfully demoted thread 24438 of process 24267 (/usr/lib64/firefox/firefox).
Jun 30 18:14:30 workstation systemd-sleep[27081]: System returned from sleep state.

Jun 30 18:14:30 workstation rtkit-daemon[1007]: Successfully demoted thread 16380 of process 14136 (/usr/lib64/firefox/firefox).
Jun 30 18:14:30 workstation systemd[1]: systemd-suspend.service: Deactivated successfully.
Jun 30 18:14:30 workstation rtkit-daemon[1007]: Successfully demoted thread 13894 of process 13727 (/usr/lib64/firefox/firefox).
Jun 30 18:14:30 workstation systemd[1]: Finished System Suspend.
Jun 30 18:14:30 workstation rtkit-daemon[1007]: Demoted 3 threads.
Jun 30 18:14:30 workstation systemd[1]: Stopped target Sleep.
Jun 30 18:14:30 workstation gdm[1101]: GLib: Source ID 91 was not found when attempting to remove it
Jun 30 18:14:30 workstation systemd[1]: Reached target Suspend.
Jun 30 18:14:30 workstation systemd[1]: Stopped target Suspend.
Jun 30 18:14:30 workstation systemd-logind[1010]: Operation 'sleep' finished.
Jun 30 18:14:30 workstation kernel: rfkill: input handler disabled

Jun 30 18:14:30 workstation upowerd[1288]: treating change event as add on /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-2
Jun 30 18:14:30 workstation upowerd[1288]: treating change event as add on /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1
Jun 30 18:14:30 workstation upowerd[1288]: treating change event as add on /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-2
Jun 30 18:14:30 workstation upowerd[1288]: treating change event as add on /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1
Jun 30 18:14:30 workstation systemd-resolved[970]: enp5s0: Bus client set DNS server list to: fdd0:edd8:b735::1

Jun 30 18:14:30 workstation chronyd[1025]: Forward time jump detected!

Jun 30 18:14:31 workstation gnome-shell[1889]: Object .MetaInputDeviceNative (0x7f9e7c14e0f0), has been already disposed — impossible to get any property from it. This might be caused by the object having been destroyed from C code using something such as destroy(), dispose(), or remove() vfuncs.
Jun 30 18:14:31 workstation gnome-shell[1889]: == Stack trace for context 0x559715bb31f0 ==
Jun 30 18:14:31 workstation gnome-shell[1889]: #0   559715e9ce18 i   resource:///org/gnome/shell/ui/keyboard.js:1175 (21877dcc9470 @ 3)
Jun 30 18:14:31 workstation gnome-shell[27301]: The XKEYBOARD keymap compiler (xkbcomp) reports:
Jun 30 18:14:31 workstation gnome-shell[27301]: > Warning:          Unsupported maximum keycode 708, clipping.
Jun 30 18:14:31 workstation gnome-shell[27301]: >                   X11 cannot support keycodes above 255.
Jun 30 18:14:31 workstation gnome-shell[27301]: Errors from xkbcomp are not fatal to the X server

Jun 30 18:14:33 workstation wireplumber[2058]: <WpSiAudioAdapter:0x565336853250> Object activation aborted: proxy destroyed
Jun 30 18:14:33 workstation wireplumber[2058]: <WpSiAudioAdapter:0x565336853250> failed to activate item: Object activation aborted: proxy destroyed

Jun 30 18:14:34 workstation audit: BPF prog-id=73 op=LOAD
Jun 30 18:14:34 workstation systemd[1]: Starting Fingerprint Authentication Daemon...

Jun 30 18:14:35 workstation gnome-shell[1889]: Timelines with detached actors are not supported

And when the bug happens, upon attempting to play a YouTube video:

Jun 30 18:14:42 workstation gnome-shell[15182]: libva info: VA-API version 1.13.0
Jun 30 18:14:42 workstation gnome-shell[15182]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Jun 30 18:14:42 workstation gnome-shell[15182]: libva info: Found init function __vaDriverInit_1_13
Jun 30 18:14:42 workstation gnome-shell[15182]: libva info: va_openDriver() returns 0
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10080msec
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000000f3cb last fence id 0x000000000000f3cd on ring 5)
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 10081msec
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000007120a last fence id 0x0000000000071238 on ring 3)
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 10424msec
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000192f86 last fence id 0x0000000000192f97 on ring 0)
Jun 30 18:14:53 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 10584msec
[this blah blah gets repeated hundreds of times]

Jun 30 18:15:01 workstation systemd[1]: session-20.scope: Deactivated successfully.
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 18648msec
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000007120a last fence id 0x0000000000071238 on ring 3)
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 18992msec
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000192f86 last fence id 0x0000000000192f97 on ring 0)
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 19152msec
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000000f3cb last fence id 0x000000000000f3cd on ring 5)
Jun 30 18:15:02 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 19153msec
[this blah blah gets repeated a dozen more times]

Jun 30 18:15:04 workstation systemd[1]: fprintd.service: Deactivated successfully.
Jun 30 18:15:04 workstation audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jun 30 18:15:04 workstation audit: BPF prog-id=0 op=UNLOAD

Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 20664msec
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000007120a last fence id 0x0000000000071238 on ring 3)
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: ring 0 stalled for more than 21009msec
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000192f86 last fence id 0x0000000000192f97 on ring 0)
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: ring 5 stalled for more than 21168msec
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000000f3cb last fence id 0x000000000000f3cd on ring 5)
Jun 30 18:15:04 workstation kernel: radeon 0000:02:00.0: ring 3 stalled for more than 21168msec
[this blah blah gets repeated two dozen more times]
Jun 30 18:15:08 workstation kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000192f86 last fence id 0x0000000000192f97 on ring 0)
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0: Saved 1217 dwords of commands on ring 0.
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0: GPU softreset: 0x0000034C
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS               = 0xA0003028
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS_SE1           = 0x00000006
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   SRBM_STATUS               = 0x200A0FC0
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   SRBM_STATUS2              = 0x00000000
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000802
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x800000E3
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44CFC046
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jun 30 18:15:09 workstation kernel: radeon 0000:02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0: Wait for MC idle timedout !
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0: GRBM_SOFT_RESET=0x0000DDFF
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0: SRBM_SOFT_RESET=0x00120500
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS               = 0x00003028
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS_SE0           = 0x00000006
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   GRBM_STATUS_SE1           = 0x00000006
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   SRBM_STATUS               = 0x200806C0
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   SRBM_STATUS2              = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_008680_CP_STAT          = 0x00000000
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jun 30 18:15:10 workstation kernel: radeon 0000:02:00.0: GPU reset succeeded, trying to resume

Jun 30 18:15:15 workstation kernel: [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
Jun 30 18:15:15 workstation kernel: [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BFBA (len 254, WS 0, PS 4) @ 0xBFE4
Jun 30 18:15:15 workstation kernel: [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing B68E (len 94, WS 12, PS 8) @ 0xB6D7
Jun 30 18:15:15 workstation kernel: [drm] PCIE gen 2 link speeds already enabled
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: Wait for MC idle timedout !
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: Wait for MC idle timedout !
Jun 30 18:15:15 workstation kernel: [drm] PCIE GART of 2048M enabled (table at 0x00000000001D6000).
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: WB enabled
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
Jun 30 18:15:15 workstation kernel: radeon 0000:02:00.0: failed VCE resume (-22).
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_gfx' in directory '0' already present!
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_cp1' in directory '0' already present!
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_cp2' in directory '0' already present!
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_dma1' in directory '0' already present!
Jun 30 18:15:15 workstation kernel: debugfs: File 'radeon_ring_dma2' in directory '0' already present!
Jun 30 18:15:16 workstation gsd-power[2215]: Error setting property 'PowerSaveMode' on interface org.gnome.Mutter.DisplayConfig: Timeout was reached (g-io-error-quark, 24)
Jun 30 18:15:16 workstation kernel: [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
Jun 30 18:15:16 workstation kernel: [drm:si_resume [radeon]] *ERROR* si startup failed on resume
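
Because the interesting lines are buried in the noise, it can help to pull out only the `[drm:... [radeon]] *ERROR*` messages as a compact timeline for a bug report. This is a minimal sketch: the line format is assumed from the excerpts above (syslog-style timestamp, then the kernel message), and `extract_drm_errors` is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   Jun 30 18:14:30 workstation kernel: [drm:si_dpm_set_power_state [radeon]] *ERROR* si_set_sw_state failed
DRM_ERROR_RE = re.compile(
    r"^(?P<when>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) .*?"
    r"\[drm:(?P<func>\w+) \[radeon\]\] \*ERROR\* (?P<msg>.*)$"
)


def extract_drm_errors(lines):
    """Return a list of (timestamp, kernel_function, message) tuples."""
    events = []
    for line in lines:
        match = DRM_ERROR_RE.match(line.strip())
        if match:
            events.append((match.group("when"),
                           match.group("func"),
                           match.group("msg")))
    return events
```

Applied to the logs above, the first entry it would return is the `si_set_sw_state failed` error logged at 18:14:30 during resume, before any lockup is reported. Whether that error is related to the later hang is not established here, but it seems worth including in a report.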

I’ve experienced some very similar issues, though my use case is slightly different. I have the proprietary AMD OpenCL stack running alongside Mesa (for rendering in Blender using HIP), but I’ve seen terrible lockups that take about 120 seconds to clear up, after which my screen (under the X Window System) is a terrible mishmash of graphic pain.

For me, it happens every time I am utilizing my 6700 XT for rendering in Blender.

I’m seeing something similar. I’m using Fedora 36 on a desktop with an AMD GPU (FirePro W2100) and two monitors. I’ve been using GNOME with Xorg. If I suspend the machine, when I wake it up, everything works fine for at least a few seconds. Then the monitors go blank, go into power save for a second, then wake back up. The display is all black and white noise with some kind of square sprite that moves along with the mouse. No response to keyboard. I haven’t tried SSH-ing in. I’m always using Firefox, and it seemed that this morning when I saw this happen, the actual lockup occurred when I clicked on a Firefox window.

I see these errors using journalctl:

Jul 01 10:57:13 atreyu gnome-shell[8661]: libva info: VA-API version 1.14.0
Jul 01 10:57:13 atreyu gnome-shell[8661]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Jul 01 10:57:13 atreyu gnome-shell[8661]: libva info: Found init function __vaDriverInit_1_14
Jul 01 10:57:13 atreyu gnome-shell[8661]: ATTENTION: default value of option mesa_glthread overridden by environment.
Jul 01 10:57:13 atreyu gnome-shell[8661]: libva info: va_openDriver() returns 0
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: VA-API version 1.14.0
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: Found init function __vaDriverInit_1_14
Jul 01 10:57:14 atreyu gnome-shell[8661]: ATTENTION: default value of option mesa_glthread overridden by environment.
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: va_openDriver() returns 0
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: VA-API version 1.14.0
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: Found init function __vaDriverInit_1_14
Jul 01 10:57:14 atreyu gnome-shell[8661]: ATTENTION: default value of option mesa_glthread overridden by environment.
Jul 01 10:57:14 atreyu gnome-shell[8661]: libva info: va_openDriver() returns 0
Jul 01 10:57:15 atreyu gnome-shell[24971]: [25010:25010:0701/105715.165461:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 1 times!
Jul 01 10:57:15 atreyu gnome-shell[24971]: [25010:25010:0701/105715.171032:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 2 times!
Jul 01 10:57:15 atreyu gnome-shell[24971]: [25010:25010:0701/105715.187573:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 3 times!

followed by

Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: ring 5 stalled for more than 10079msec
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000003b3d5 last fence id 0x000000000003b3d7 on ring 5)
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: Saved 7617 dwords of commands on ring 0.
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: GPU softreset: 0x0000034D
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0xA7482028
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x69000004
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200A0FC0
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010800
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00008802
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x800302E3
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44CFC046
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00120500
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x20080EC0
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Jul 01 10:57:25 atreyu kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Jul 01 10:57:27 atreyu /usr/libexec/gdm-x-session[2592]: radeon: Failed to deallocate virtual address for buffer:
Jul 01 10:57:27 atreyu /usr/libexec/gdm-x-session[2592]: radeon:    size      : 4096 bytes
Jul 01 10:57:27 atreyu /usr/libexec/gdm-x-session[2592]: radeon:    va        : 0x11a6fd000
Jul 01 10:57:30 atreyu kernel: [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
Jul 01 10:57:30 atreyu kernel: [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BBC8 (len 237, WS 0, PS 4) @ 0xBBD6
Jul 01 10:57:30 atreyu kernel: [drm] PCIE gen 3 link speeds already enabled
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: Wait for MC idle timedout !
Jul 01 10:57:30 atreyu kernel: [drm] PCIE GART of 2048M enabled (table at 0x0000000000165000).
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: WB enabled
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10
Jul 01 10:57:30 atreyu kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_gfx' in directory '0' already present!
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_cp1' in directory '0' already present!
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_cp2' in directory '0' already present!
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_dma1' in directory '0' already present!
Jul 01 10:57:30 atreyu kernel: debugfs: File 'radeon_ring_dma2' in directory '0' already present!
Jul 01 10:57:31 atreyu kernel: [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x850C)=0xCAFEDEAD)
Jul 01 10:57:31 atreyu kernel: [drm:si_resume [radeon]] *ERROR* si startup failed on resume

and a long list of other error messages involving gdm-x-session, leading up to a kernel Oops.
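For anyone trying to compare their own logs against the ones in this thread, it can help to pull just the "GPU lockup" lines out of a long journal dump. Below is a minimal sketch in Python, assuming the kernel messages have the same format as the lines pasted above. The function name `find_lockups` and the `fence_gap` field are my own illustrative names, not anything from the radeon driver, and treating the difference between the two fence IDs as a rough measure of outstanding work is an assumption on my part.

```python
import re

# Matches kernel lines in the format seen in this thread, e.g.:
#   radeon 0000:01:00.0: GPU lockup (current fence id 0x... last fence id 0x... on ring 5)
LOCKUP_RE = re.compile(
    r"radeon (?P<pci>[0-9a-f:.]+): GPU lockup \(current fence id "
    r"(?P<current>0x[0-9a-f]+) last fence id (?P<last>0x[0-9a-f]+) "
    r"on ring (?P<ring>\d+)\)"
)


def find_lockups(lines):
    """Return one dict per 'GPU lockup' line found in the given journal lines."""
    results = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m is None:
            continue  # not a lockup line, skip it
        current = int(m.group("current"), 16)
        last = int(m.group("last"), 16)
        results.append({
            "pci": m.group("pci"),
            "ring": int(m.group("ring")),
            # Hypothetical field: difference between the two fence IDs.
            "fence_gap": last - current,
        })
    return results


# Two sample lines copied from the log above; only the first is a lockup line.
sample = [
    "Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: GPU lockup "
    "(current fence id 0x000000000003b3d5 last fence id 0x000000000003b3d7 on ring 5)",
    "Jul 01 10:57:24 atreyu kernel: radeon 0000:01:00.0: GPU softreset: 0x0000034D",
]
for hit in find_lockups(sample):
    print(hit)
```

To use it on a real system, the `sample` list could be replaced with the lines of a saved journal, for example the output of `journalctl -k -b -1` (kernel messages from the previous boot) written to a file and read with `open(...)`.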

The abrt message mentions these errors:

reason: WARNING: CPU: 4 PID: 2592 at drivers/gpu/drm/radeon/radeon_object.c:62 radeon_ttm_bo_destroy+0xde/0xf0 [radeon] [radeon]

backtrace:
WARNING: CPU: 4 PID: 2592 at drivers/gpu/drm/radeon/radeon_object.c:62 radeon_ttm_bo_destroy+0xde/0xf0 [radeon]
Modules linked in: cdc_acm tls tun ntfs3 rfcomm snd_seq_dummy snd_hrtimer xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT bridge stp llc xt_comment nf_nat_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast nft_objref nf_conntrack_tftp nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_log_syslog nft_log nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security vboxnetadp(OE) vboxnetflt(OE) ip_set nfnetlink ebtable_filter ebtables vboxdrv(OE) ip6table_filter iptable_filter qrtr bnep sunrpc binfmt_misc vfat fat squashfs loop intel_rapl_msr mei_pxp mei_wdt mei_hdcp iTCO_wdt ee1004 intel_pmc_bxt iTCO_vendor_support btusb dell_smm_hwmon btrtl btbcm btintel uvcvideo btmtk videobuf2_vmalloc videobuf2_memops bluetooth videobuf2_v4l2 intel_rapl_common
 snd_usb_audio videobuf2_common snd_usbmidi_lib videodev snd_hda_codec_realtek snd_rawmidi ecdh_generic intel_tcc_cooling rfkill mc x86_pkg_temp_thermal snd_hda_codec_generic snd_hda_codec_hdmi intel_powerclamp coretemp kvm_intel snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi pktcdvd snd_hda_codec kvm snd_hda_core dell_wmi irqbypass snd_hwdep ledtrig_audio rapl snd_seq intel_cstate snd_seq_device dell_smbios intel_uncore snd_pcm dcdbas intel_wmi_thunderbolt sparse_keymap wmi_bmof snd_timer dell_wmi_descriptor pcspkr mei_me snd i2c_i801 mei intel_pch_thermal i2c_smbus ie31200_edac soundcore acpi_pad zram amdgpu hid_logitech_hidpp iommu_v2 gpu_sched hid_logitech_dj wacom hid_multitouch ums_realtek i915 crct10dif_pclmul crc32_pclmul crc32c_intel radeon firewire_ohci ghash_clmulni_intel e1000e serio_raw firewire_core crc_itu_t drm_ttm_helper drm_buddy uas ttm usb_storage drm_dp_helper wmi video ip6_tables ip_tables analog gameport joydev ipmi_devintf ipmi_msghandler fuse
CPU: 4 PID: 2592 Comm: Xorg Tainted: G           OE     5.18.7-200.fc36.x86_64 #1
Hardware name: Dell Inc. Precision Tower 3620/09WH54, BIOS 2.18.1 07/09/2021
RIP: 0010:radeon_ttm_bo_destroy+0xde/0xf0 [radeon]
Code: 00 00 00 74 0f 48 8b b3 b0 01 00 00 48 89 df e8 c8 ee 25 fc 48 89 df e8 70 f9 24 fc 4c 89 e7 5b 5d 41 5c 41 5d e9 32 da cf fb <0f> 0b eb cd 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00
RSP: 0018:ffffb27f83af7ce8 EFLAGS: 00010283
RAX: ffff94bccd196270 RBX: ffff94bccd196078 RCX: ffff94bbaa3a8800
RDX: ffff94bacecf8480 RSI: ffff94bccd196000 RDI: ffff94bdfc1c1cc8
RBP: ffffffffffffffff R08: 0000000000000000 R09: 000000008020001f
R10: ffff94bc9fe76ba8 R11: ffffb27f83af7d20 R12: ffff94bccd196000
R13: ffff94baea98c058 R14: ffff94baea98c040 R15: 000000000000008f
FS:  00007f06bb428fc0(0000) GS:ffff94bdea500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000011109f109fb8 CR3: 000000043c2f0005 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 radeon_bo_unref+0x1a/0x30 [radeon]
 radeon_gem_object_free+0x20/0x30 [radeon]
 drm_gem_object_release_handle+0x69/0x80
 ? drm_gem_handle_create+0x40/0x40
 drm_gem_handle_delete+0x59/0xa0
 ? drm_gem_handle_create+0x40/0x40
 drm_ioctl_kernel+0x9b/0x140
 drm_ioctl+0x21c/0x410
 ? drm_gem_handle_create+0x40/0x40
 ? ioctl_has_perm.constprop.0.isra.0+0xaa/0xf0
 radeon_drm_ioctl+0x49/0x80 [radeon]
 __x64_sys_ioctl+0x8a/0xc0
 do_syscall_64+0x58/0x80
 ? do_syscall_64+0x67/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f06bab0776f
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
RSP: 002b:00007ffc5f62de60 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00005638ec7693b0 RCX: 00007f06bab0776f
RDX: 00007ffc5f62df08 RSI: 0000000040086409 RDI: 0000000000000011
RBP: 00007ffc5f62df08 R08: 00007f06babf8410 R09: 00005638eb3e0780
R10: 0000000000000011 R11: 0000000000000246 R12: 0000000040086409
R13: 0000000000000011 R14: 000000011a360000 R15: 00005638ed66afe0
 </TASK>

crash function: radeon_bo_unref

This problem is relatively new. I generally keep this machine up to date. I ran dnf update yesterday and the problem occurred again this morning. I don’t remember seeing it a month ago. I’ve been traveling so I can’t be much more precise about the timing.

Same general question, not sure where to report this. My best guess is the radeon driver, but does that come under kernel, xorg-x11-drv-ati (which is what abrt suggests), or what?

Possibly related:

https://bugzilla.redhat.com/show_bug.cgi?id=2022980

https://bugzilla.redhat.com/show_bug.cgi?id=2089380 (I’m adding a link to this forum entry there)

https://bugzilla.redhat.com/show_bug.cgi?id=2091306

Did you fix the issue? I’m having the same problem now.

Please do not continue posting in extraneous threads. Focus on the one you started about this issue.
https://discussion.fedoraproject.org/t/firefox-freezing-fedora/63365

On my present machine, I’ve had lockups in Fedora using GNOME over the years which are all very similar. The screen freezes but the mouse still works. If audio is playing, it will continue for a few minutes before ending. Same with a video call: it keeps going for a few minutes, then goes silent. I never thought to check journalctl. I thought it was a Chrome bug, as many times the issue would occur when opening a PDF tab in Chrome. And when I tried switching back to Firefox a year or two ago, the issue mostly went away. But it still occasionally happens. A few times in the last few days, in fact. This morning it happened when going to open a Google Slides tab. Here’s the journalctl output leading up to and including the event itself (occurs at 14:36).

Oct 17 14:30:24 localhost.localdomain audit: BPF prog-id=113 op=LOAD
Oct 17 14:30:24 localhost.localdomain kernel: audit: type=1334 audit(1666035024.933:1195): prog-id=113 op=LOAD
Oct 17 14:30:24 localhost.localdomain systemd[1]: Starting Fingerprint Authentication Daemon...
Oct 17 14:30:25 localhost.localdomain systemd[1]: Started Fingerprint Authentication Daemon.
Oct 17 14:30:25 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:30:25 localhost.localdomain kernel: audit: type=1130 audit(1666035025.072:1196): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:30:25 localhost.localdomain NetworkManager[1116]: <info>  [1666035025.1786] agent-manager: agent[7a4d23084074f6ef,:1.94/org.gnome.Shell.NetworkAgent/1000]: agent registered
Oct 17 14:30:55 localhost.localdomain systemd[1]: fprintd.service: Deactivated successfully.
Oct 17 14:30:55 localhost.localdomain kernel: audit: type=1131 audit(1666035055.136:1197): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:30:55 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:30:55 localhost.localdomain audit: BPF prog-id=0 op=UNLOAD
Oct 17 14:30:55 localhost.localdomain kernel: audit: type=1334 audit(1666035055.148:1198): prog-id=0 op=UNLOAD
Oct 17 14:36:34 localhost.localdomain firefox.desktop[319039]: libva info: VA-API version 1.13.0
Oct 17 14:36:34 localhost.localdomain firefox.desktop[319039]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Oct 17 14:36:34 localhost.localdomain firefox.desktop[319039]: libva info: Found init function __vaDriverInit_1_13
Oct 17 14:36:34 localhost.localdomain firefox.desktop[319039]: ATTENTION: default value of option mesa_glthread overridden by environment.
Oct 17 14:36:34 localhost.localdomain firefox.desktop[319039]: libva info: va_openDriver() returns 0
Oct 17 14:36:34 localhost.localdomain apcupsd[2024]: Communications with UPS lost.
Oct 17 14:36:34 localhost.localdomain firefox.desktop[319039]: ATTENTION: default value of option mesa_glthread overridden by environment.
Oct 17 14:36:35 localhost.localdomain firefox.desktop[319039]: libva info: VA-API version 1.13.0
Oct 17 14:36:35 localhost.localdomain firefox.desktop[319039]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Oct 17 14:36:35 localhost.localdomain firefox.desktop[319039]: libva info: Found init function __vaDriverInit_1_13
Oct 17 14:36:35 localhost.localdomain firefox.desktop[319039]: ATTENTION: default value of option mesa_glthread overridden by environment.
Oct 17 14:36:35 localhost.localdomain firefox.desktop[319039]: libva info: va_openDriver() returns 0
Oct 17 14:36:46 localhost.localdomain kernel: radeon 0000:05:00.0: ring 0 stalled for more than 10385msec
Oct 17 14:36:46 localhost.localdomain kernel: radeon 0000:05:00.0: GPU lockup (current fence id 0x00000000002772ba last fence id 0x00000000002772c3 on ring 0)
Oct 17 14:37:25 localhost.localdomain kernel: sysrq: This sysrq operation is disabled.
Oct 17 14:37:26 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (EE) event4  - Logitech Logitech Illuminated Keyboard: client bug: event processing lagging behind by 24ms, your system is too slow
Oct 17 14:37:35 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (EE) event4  - Logitech Logitech Illuminated Keyboard: client bug: event processing lagging behind by 30ms, your system is too slow
Oct 17 14:37:43 localhost.localdomain kernel: sysrq: This sysrq operation is disabled.
Oct 17 14:37:54 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (EE) event4  - Logitech Logitech Illuminated Keyboard: client bug: event processing lagging behind by 24ms, your system is too slow
Oct 17 14:37:58 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:37:58 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:37:58 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:37:58 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:37:59 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:37:59 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:37:59 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:37:59 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:38:00 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:38:00 localhost.localdomain /usr/libexec/gdm-x-session[2337]: [dix] EventToCore: Not implemented yet
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "27"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) event1  - Power Button: device removed
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "30"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) event0  - Power Button: device removed
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "31"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) event3  - PixArt USB Optical Mouse: device removed
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "32"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) event2  - C-Media Electronics Inc. USB Audio Device: device removed
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "33"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) event4  - Logitech Logitech Illuminated Keyboard: device removed
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "34"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "35"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) event8  - Logitech Webcam C930e: device removed
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "36"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) event7  - Eee PC WMI hotkeys: device removed
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (**) Option "fd" "34"
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) event5  - Logitech Logitech Illuminated Keyboard Consumer Control: device removed
Oct 17 14:38:04 localhost.localdomain /usr/libexec/gdm-x-session[2337]: (II) AIGLX: Suspending AIGLX clients for VT switch
Oct 17 14:38:09 localhost.localdomain kernel: usb 11-2: USB disconnect, device number 2
Oct 17 14:38:10 localhost.localdomain kernel: usb 11-2: new low-speed USB device number 3 using xhci_hcd
Oct 17 14:38:10 localhost.localdomain kernel: usb 11-2: New USB device found, idVendor=04ca, idProduct=008a, bcdDevice= 1.00
Oct 17 14:38:10 localhost.localdomain kernel: usb 11-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Oct 17 14:38:10 localhost.localdomain kernel: usb 11-2: Product: USB Optical Mouse
Oct 17 14:38:10 localhost.localdomain kernel: usb 11-2: Manufacturer: PixArt
Oct 17 14:38:10 localhost.localdomain kernel: input: PixArt USB Optical Mouse as /devices/pci0000:00/0000:00:07.0/0000:03:00.0/usb11/11-2/11-2:1.0/0003:04CA:008A.0005/input/input17
Oct 17 14:38:10 localhost.localdomain kernel: hid-generic 0003:04CA:008A.0005: input,hidraw1: USB HID v1.11 Mouse [PixArt USB Optical Mouse] on usb-0000:03:00.0-2/input0
Oct 17 14:38:10 localhost.localdomain mtp-probe[323891]: checking bus 11, device 3: "/sys/devices/pci0000:00/0000:00:07.0/0000:03:00.0/usb11/11-2"
Oct 17 14:38:10 localhost.localdomain mtp-probe[323891]: bus: 11, device: 3 was not an MTP device
Oct 17 14:38:10 localhost.localdomain upowerd[1756]: treating change event as add on /sys/devices/pci0000:00/0000:00:07.0/0000:03:00.0/usb11/11-2
Oct 17 14:38:11 localhost.localdomain mtp-probe[323898]: checking bus 11, device 3: "/sys/devices/pci0000:00/0000:00:07.0/0000:03:00.0/usb11/11-2"
Oct 17 14:38:11 localhost.localdomain mtp-probe[323898]: bus: 11, device: 3 was not an MTP device
Oct 17 14:38:11 localhost.localdomain kernel: usb 2-1: USB disconnect, device number 2
Oct 17 14:38:11 localhost.localdomain gsd-media-keys[2989]: Unable to get default source
Oct 17 14:38:12 localhost.localdomain kernel: usb 2-1: new high-speed USB device number 3 using ehci-pci
Oct 17 14:38:14 localhost.localdomain kernel: usb 5-3: USB disconnect, device number 3
Oct 17 14:38:15 localhost.localdomain kernel: usb 2-1: New USB device found, idVendor=046d, idProduct=0843, bcdDevice= 0.13
Oct 17 14:38:15 localhost.localdomain kernel: usb 2-1: New USB device strings: Mfr=0, Product=2, SerialNumber=1
Oct 17 14:38:15 localhost.localdomain kernel: usb 2-1: Product: Logitech Webcam C930e
Oct 17 14:38:15 localhost.localdomain kernel: usb 2-1: SerialNumber: E31D0A2E
Oct 17 14:38:15 localhost.localdomain kernel: usb 2-1: Found UVC 1.00 device Logitech Webcam C930e (046d:0843)
Oct 17 14:38:15 localhost.localdomain kernel: input: Logitech Webcam C930e as /devices/pci0000:00/0000:00:13.2/usb2/2-1/2-1:1.0/input/input18
Oct 17 14:38:15 localhost.localdomain kernel: usb 5-5: new full-speed USB device number 4 using ohci-pci
Oct 17 14:38:16 localhost.localdomain kernel: usb 5-5: New USB device found, idVendor=046d, idProduct=c318, bcdDevice=55.01
Oct 17 14:38:16 localhost.localdomain kernel: usb 5-5: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Oct 17 14:38:16 localhost.localdomain kernel: usb 5-5: Product: Logitech Illuminated Keyboard
Oct 17 14:38:16 localhost.localdomain kernel: usb 5-5: Manufacturer: Logitech
Oct 17 14:38:16 localhost.localdomain kernel: input: Logitech Logitech Illuminated Keyboard as /devices/pci0000:00/0000:00:12.0/usb5/5-5/5-5:1.0/0003:046D:C318.0006/input/input19
Oct 17 14:38:16 localhost.localdomain kernel: hid-generic 0003:046D:C318.0006: input,hidraw2: USB HID v1.11 Keyboard [Logitech Logitech Illuminated Keyboard] on usb-0000:00:12.0-5/input0
Oct 17 14:38:16 localhost.localdomain kernel: input: Logitech Logitech Illuminated Keyboard Consumer Control as /devices/pci0000:00/0000:00:12.0/usb5/5-5/5-5:1.1/0003:046D:C318.0007/input/input20
Oct 17 14:38:16 localhost.localdomain kernel: hid-generic 0003:046D:C318.0007: input,hiddev96,hidraw3: USB HID v1.11 Device [Logitech Logitech Illuminated Keyboard] on usb-0000:00:12.0-5/input1
Oct 17 14:38:16 localhost.localdomain mtp-probe[323918]: checking bus 5, device 4: "/sys/devices/pci0000:00/0000:00:12.0/usb5/5-5"
Oct 17 14:38:16 localhost.localdomain mtp-probe[323919]: checking bus 2, device 3: "/sys/devices/pci0000:00/0000:00:13.2/usb2/2-1"
Oct 17 14:38:16 localhost.localdomain mtp-probe[323918]: bus: 5, device: 4 was not an MTP device
Oct 17 14:38:16 localhost.localdomain mtp-probe[323919]: bus: 2, device: 3 was not an MTP device
Oct 17 14:38:16 localhost.localdomain upowerd[1756]: treating change event as add on /sys/devices/pci0000:00/0000:00:12.0/usb5/5-5
Oct 17 14:38:16 localhost.localdomain systemd[1225]: Reached target Sound Card.
Oct 17 14:38:16 localhost.localdomain systemd[1230]: Reached target Sound Card.
Oct 17 14:38:16 localhost.localdomain systemd-logind[1063]: Watching system buttons on /dev/input/event8 (Logitech Logitech Illuminated Keyboard Consumer Control)
Oct 17 14:38:16 localhost.localdomain mtp-probe[323949]: checking bus 2, device 3: "/sys/devices/pci0000:00/0000:00:13.2/usb2/2-1"
Oct 17 14:38:16 localhost.localdomain mtp-probe[323949]: bus: 2, device: 3 was not an MTP device
Oct 17 14:38:16 localhost.localdomain systemd-logind[1063]: Watching system buttons on /dev/input/event5 (Logitech Logitech Illuminated Keyboard)
Oct 17 14:38:16 localhost.localdomain mtp-probe[323950]: checking bus 5, device 4: "/sys/devices/pci0000:00/0000:00:12.0/usb5/5-5"
Oct 17 14:38:16 localhost.localdomain mtp-probe[323950]: bus: 5, device: 4 was not an MTP device
Oct 17 14:38:21 localhost.localdomain kernel: r8169 0000:02:00.0 enp2s0: Link is Down
Oct 17 14:38:21 localhost.localdomain NetworkManager[1116]: <info>  [1666035501.1949] policy: set-hostname: current hostname was changed outside NetworkManager: 'localhost.localdomain'
Oct 17 14:38:22 localhost.localdomain kernel: usb 5-1: USB disconnect, device number 2
Oct 17 14:38:24 localhost.localdomain kernel: usb 5-1: new full-speed USB device number 5 using ohci-pci
Oct 17 14:38:24 localhost.localdomain kernel: usb 5-1: New USB device found, idVendor=0d8c, idProduct=0014, bcdDevice= 1.00
Oct 17 14:38:24 localhost.localdomain kernel: usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Oct 17 14:38:24 localhost.localdomain kernel: usb 5-1: Product: USB Audio Device
Oct 17 14:38:24 localhost.localdomain kernel: usb 5-1: Manufacturer: C-Media Electronics Inc.
Oct 17 14:38:24 localhost.localdomain kernel: cmedia_hs100b 0003:0D8C:0014.0008: Fixing CMedia HS-100B report descriptor
Oct 17 14:38:24 localhost.localdomain kernel: input: C-Media Electronics Inc. USB Audio Device as /devices/pci0000:00/0000:00:12.0/usb5/5-1/5-1:1.3/0003:0D8C:0014.0008/input/input22
Oct 17 14:38:25 localhost.localdomain kernel: cmedia_hs100b 0003:0D8C:0014.0008: input,hidraw0: USB HID v1.00 Device [C-Media Electronics Inc. USB Audio Device] on usb-0000:00:12.0-1/input3
Oct 17 14:38:25 localhost.localdomain mtp-probe[323967]: checking bus 5, device 5: "/sys/devices/pci0000:00/0000:00:12.0/usb5/5-1"
Oct 17 14:38:25 localhost.localdomain mtp-probe[323967]: bus: 5, device: 5 was not an MTP device
Oct 17 14:38:25 localhost.localdomain mtp-probe[323980]: checking bus 5, device 5: "/sys/devices/pci0000:00/0000:00:12.0/usb5/5-1"
Oct 17 14:38:25 localhost.localdomain mtp-probe[323980]: bus: 5, device: 5 was not an MTP device
Oct 17 14:38:26 localhost.localdomain kernel: usb 2-1: reset high-speed USB device number 3 using ehci-pci
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.1974] device (enp2s0): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.1976] dhcp4 (enp2s0): canceled DHCP transaction
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.1977] dhcp4 (enp2s0): state changed extended -> terminated
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.1981] dhcp6 (enp2s0): canceled DHCP transaction
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.1982] dhcp6 (enp2s0): state changed bound -> terminated
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.2016] policy: set-hostname: current hostname was changed outside NetworkManager: 'localhost.localdomain'
Oct 17 14:38:27 localhost.localdomain avahi-daemon[1849]: Withdrawing address record for 10.0.0.62 on enp2s0.
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.2024] policy: set-hostname: current hostname was changed outside NetworkManager: 'localhost.localdomain'
Oct 17 14:38:27 localhost.localdomain acvpnagent[2104]: The network interface for the VPN connection has gone down.
Oct 17 14:38:27 localhost.localdomain avahi-daemon[1849]: Leaving mDNS multicast group on interface enp2s0.IPv4 with address 10.0.0.62.
Oct 17 14:38:27 localhost.localdomain acvpnagent[2104]: IP addresses from active interfaces:
Oct 17 14:38:27 localhost.localdomain systemd-resolved[1033]: enp2s0: Bus client reset search domain list.
Oct 17 14:38:27 localhost.localdomain avahi-daemon[1849]: Interface enp2s0.IPv4 no longer relevant for mDNS.
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.2057] manager: NetworkManager state is now CONNECTED_LOCAL
Oct 17 14:38:27 localhost.localdomain avahi-daemon[1849]: Withdrawing address record for fe80::8aa5:2427:f74c:cac2 on enp2s0.
Oct 17 14:38:27 localhost.localdomain systemd-resolved[1033]: enp2s0: Bus client set default route setting: no
Oct 17 14:38:27 localhost.localdomain avahi-daemon[1849]: Leaving mDNS multicast group on interface enp2s0.IPv6 with address fe80::8aa5:2427:f74c:cac2.
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.2067] manager: NetworkManager state is now DISCONNECTED
Oct 17 14:38:27 localhost.localdomain avahi-daemon[1849]: Interface enp2s0.IPv6 no longer relevant for mDNS.
Oct 17 14:38:27 localhost.localdomain NetworkManager[1116]: <info>  [1666035507.2078] policy: set-hostname: current hostname was changed outside NetworkManager: 'localhost.localdomain'
Oct 17 14:38:27 localhost.localdomain acvpnagent[2104]: Function: applyHostConfigForNoVpn File: ../../vpn/Agent/MainThread.cpp Line: 11738 No network interface is available, cannot determine potential public addresses.
Oct 17 14:38:27 localhost.localdomain acvpnagent[2104]: Current network state: No network interface
Oct 17 14:38:27 localhost.localdomain acvpnagent[2104]: Function: applyHostConfigForNoVpn File: ../../vpn/Agent/MainThread.cpp Line: 11738 No network interface is available, cannot determine potential public addresses.
Oct 17 14:38:27 localhost.localdomain kernel: audit: type=1325 audit(1666035507.207:1199): table=firewalld:9 family=1 entries=5 op=nft_unregister_rule pid=1039 subj=system_u:system_r:firewalld_t:s0 comm="firewalld"
Oct 17 14:38:27 localhost.localdomain kernel: audit: type=1300 audit(1666035507.207:1199): arch=c000003e syscall=46 success=yes exit=416 a0=6 a1=7ffd2a6dfbd0 a2=0 a3=7ffd2a6cea6c items=0 ppid=1 pid=1039 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="firewalld" exe="/usr/bin/python3.10" subj=system_u:system_r:firewalld_t:s0 key=(null)
Oct 17 14:38:27 localhost.localdomain kernel: audit: type=1327 audit(1666035507.207:1199): proctitle=2F7573722F62696E2F707974686F6E33002D73002F7573722F7362696E2F6669726577616C6C64002D2D6E6F666F726B002D2D6E6F706964
Oct 17 14:38:27 localhost.localdomain audit[1039]: NETFILTER_CFG table=firewalld:9 family=1 entries=5 op=nft_unregister_rule pid=1039 subj=system_u:system_r:firewalld_t:s0 comm="firewalld"
Oct 17 14:38:27 localhost.localdomain audit[1039]: SYSCALL arch=c000003e syscall=46 success=yes exit=416 a0=6 a1=7ffd2a6dfbd0 a2=0 a3=7ffd2a6cea6c items=0 ppid=1 pid=1039 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="firewalld" exe="/usr/bin/python3.10" subj=system_u:system_r:firewalld_t:s0 key=(null)
Oct 17 14:38:27 localhost.localdomain audit: PROCTITLE proctitle=2F7573722F62696E2F707974686F6E33002D73002F7573722F7362696E2F6669726577616C6C64002D2D6E6F666F726B002D2D6E6F706964
Oct 17 14:38:27 localhost.localdomain gnome-software[3074]: not GsPlugin error pk-control-error-quark:0: Could not activate remote peer: activation request failed: unit is masked.
Oct 17 14:38:27 localhost.localdomain systemd-resolved[1033]: enp2s0: Bus client reset DNS server list.
Oct 17 14:38:27 localhost.localdomain gnome-software[3074]: not handling error failed for action get-updates: Could not activate remote peer: activation request failed: unit is masked.
Oct 17 14:38:27 localhost.localdomain gnome-software[3074]: not GsPlugin error pk-control-error-quark:0: Could not activate remote peer: activation request failed: unit is masked.
Oct 17 14:38:27 localhost.localdomain gnome-software[3074]: not handling error failed for action get-updates: Could not activate remote peer: activation request failed: unit is masked.
Oct 17 14:38:27 localhost.localdomain systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 17 14:38:27 localhost.localdomain systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 17 14:38:27 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain kernel: audit: type=1130 audit(1666035507.246:1200): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain systemd[1]: Stopping Sendmail Mail Transport Client...
Oct 17 14:38:27 localhost.localdomain systemd[1]: sm-client.service: Deactivated successfully.
Oct 17 14:38:27 localhost.localdomain systemd[1]: Stopped Sendmail Mail Transport Client.
Oct 17 14:38:27 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sm-client comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain systemd[1]: Stopping Sendmail Mail Transport Agent...
Oct 17 14:38:27 localhost.localdomain kernel: audit: type=1131 audit(1666035507.270:1201): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sm-client comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain systemd[1]: sendmail.service: Deactivated successfully.
Oct 17 14:38:27 localhost.localdomain systemd[1]: Stopped Sendmail Mail Transport Agent.
Oct 17 14:38:27 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sendmail comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain kernel: audit: type=1131 audit(1666035507.273:1202): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sendmail comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain systemd[1]: Starting Sendmail Mail Transport Agent...
Oct 17 14:38:27 localhost.localdomain systemd[1]: squid.service: Unit cannot be reloaded because it is inactive.
Oct 17 14:38:27 localhost.localdomain sendmail[324004]: starting daemon (8.17.1): SMTP+queueing@01:00:00
Oct 17 14:38:27 localhost.localdomain systemd[1]: sendmail.service: Can't open PID file /run/sendmail.pid (yet?) after start: Operation not permitted
Oct 17 14:38:27 localhost.localdomain systemd[1]: Started Sendmail Mail Transport Agent.
Oct 17 14:38:27 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sendmail comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain kernel: audit: type=1130 audit(1666035507.325:1203): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sendmail comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain systemd[1]: Starting Sendmail Mail Transport Client...
Oct 17 14:38:27 localhost.localdomain sm-msp-queue[324024]: starting daemon (8.17.1): queueing@01:00:00
Oct 17 14:38:27 localhost.localdomain systemd[1]: Started Sendmail Mail Transport Client.
Oct 17 14:38:27 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sm-client comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain kernel: audit: type=1130 audit(1666035507.370:1204): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sm-client comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Oct 17 14:38:27 localhost.localdomain kernel: usb 5-1: USB disconnect, device number 5
Oct 17 14:38:28 localhost.localdomain kernel: r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.6994] device (enp2s0): carrier: link connected
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.7004] device (enp2s0): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.7024] policy: auto-activating connection 'Wired connection 1' (1322237d-5eaa-359e-a43e-65622029aad7)
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.7034] device (enp2s0): Activation: starting connection 'Wired connection 1' (1322237d-5eaa-359e-a43e-65622029aad7)
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.7036] device (enp2s0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.7042] manager: NetworkManager state is now CONNECTING
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.7045] device (enp2s0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Oct 17 14:38:28 localhost.localdomain audit[1039]: NETFILTER_CFG table=firewalld:10 family=1 entries=5 op=nft_register_rule pid=1039 subj=system_u:system_r:firewalld_t:s0 comm="firewalld"
Oct 17 14:38:28 localhost.localdomain kernel: audit: type=1325 audit(1666035508.714:1205): table=firewalld:10 family=1 entries=5 op=nft_register_rule pid=1039 subj=system_u:system_r:firewalld_t:s0 comm="firewalld"
Oct 17 14:38:28 localhost.localdomain kernel: audit: type=1300 audit(1666035508.714:1205): arch=c000003e syscall=46 success=yes exit=1188 a0=6 a1=7ffd2a6df990 a2=0 a3=7ffd2a6ce82c items=0 ppid=1 pid=1039 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="firewalld" exe="/usr/bin/python3.10" subj=system_u:system_r:firewalld_t:s0 key=(null)
Oct 17 14:38:28 localhost.localdomain audit[1039]: SYSCALL arch=c000003e syscall=46 success=yes exit=1188 a0=6 a1=7ffd2a6df990 a2=0 a3=7ffd2a6ce82c items=0 ppid=1 pid=1039 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="firewalld" exe="/usr/bin/python3.10" subj=system_u:system_r:firewalld_t:s0 key=(null)
Oct 17 14:38:28 localhost.localdomain audit: PROCTITLE proctitle=2F7573722F62696E2F707974686F6E33002D73002F7573722F7362696E2F6669726577616C6C64002D2D6E6F666F726B002D2D6E6F706964
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.7176] device (enp2s0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Oct 17 14:38:28 localhost.localdomain NetworkManager[1116]: <info>  [1666035508.7183] dhcp4 (enp2s0): activation: beginning transaction (timeout in 45 seconds)
Oct 17 14:38:28 localhost.localdomain avahi-daemon[1849]: Joining mDNS multicast group on interface enp2s0.IPv6 with address fe80::8aa5:2427:f74c:cac2.
Oct 17 14:38:28 localhost.localdomain avahi-daemon[1849]: New relevant interface enp2s0.IPv6 for mDNS.
Oct 17 14:38:28 localhost.localdomain avahi-daemon[1849]: Registering new address record for fe80::8aa5:2427:f74c:cac2 on enp2s0.*.
Oct 17 14:38:28 localhost.localdomain acvpnagent[2104]: A new network interface has been detected.
Oct 17 14:38:28 localhost.localdomain acvpnagent[2104]: IP addresses from active interfaces: enp2s0: FE80:0:0:0:8AA5:2427:F74C:CAC2
Oct 17 14:38:29 localhost.localdomain kernel: usb 5-1: new full-speed USB device number 6 using ohci-pci
Oct 17 14:38:29 localhost.localdomain kernel: usb 5-1: New USB device found, idVendor=0d8c, idProduct=0014, bcdDevice= 1.00
Oct 17 14:38:29 localhost.localdomain kernel: usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Oct 17 14:38:29 localhost.localdomain kernel: usb 5-1: Product: USB Audio Device
Oct 17 14:38:29 localhost.localdomain kernel: usb 5-1: Manufacturer: C-Media Electronics Inc.
Oct 17 14:38:29 localhost.localdomain kernel: cmedia_hs100b 0003:0D8C:0014.0009: Fixing CMedia HS-100B report descriptor
Oct 17 14:38:29 localhost.localdomain kernel: input: C-Media Electronics Inc. USB Audio Device as /devices/pci0000:00/0000:00:12.0/usb5/5-1/5-1:1.3/0003:0D8C:0014.0009/input/input23
Oct 17 14:38:29 localhost.localdomain kernel: cmedia_hs100b 0003:0D8C:0014.0009: input,hidraw0: USB HID v1.00 Device [C-Media Electronics Inc. USB Audio Device] on usb-0000:00:12.0-1/input3
Oct 17 14:38:29 localhost.localdomain mtp-probe[324041]: checking bus 5, device 6: "/sys/devices/pci0000:00/0000:00:12.0/usb5/5-1"
Oct 17 14:38:29 localhost.localdomain mtp-probe[324041]: bus: 5, device: 6 was not an MTP device
Oct 17 14:38:29 localhost.localdomain mtp-probe[324052]: checking bus 5, device 6: "/sys/devices/pci0000:00/0000:00:12.0/usb5/5-1"
Oct 17 14:38:29 localhost.localdomain mtp-probe[324052]: bus: 5, device: 6 was not an MTP device
Oct 17 14:38:30 localhost.localdomain NetworkManager[1116]: <info>  [1666035510.7360] dhcp4 (enp2s0): state changed unknown -> bound, address=10.0.0.62
Oct 17 14:38:30 localhost.localdomain avahi-daemon[1849]: Joining mDNS multicast group on interface enp2s0.IPv4 with address 10.0.0.62.
Oct 17 14:38:30 localhost.localdomain avahi-daemon[1849]: New relevant interface enp2s0.IPv4 for mDNS.
Oct 17 14:38:30 localhost.localdomain avahi-daemon[1849]: Registering new address record for 10.0.0.62 on enp2s0.IPv4.
Oct 17 14:38:30 localhost.localdomain acvpnagent[2104]: A new network interface address has been detected.
Oct 17 14:38:30 localhost.localdomain acvpnagent[2104]: IP addresses from active interfaces: enp2s0: 10.0.0.62, FE80:0:0:0:8AA5:2427:F74C:CAC2
Oct 17 14:38:30 localhost.localdomain NetworkManager[1116]: <info>  [1666035510.7385] device (enp2s0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
```

And here’s today’s freeze. It happened when opening a link (List of LaTeX symbols | LaTeX Wiki | Fandom) that had a YouTube video / ad starting in the bottom right corner of the page.

journalctl:

Oct 18 21:47:33 localhost.localdomain cupsd[1140]: REQUEST localhost - - "POST / HTTP/1.1" 200 183 Renew-Subscription successful-ok
Oct 18 21:50:05 localhost.localdomain audit: BPF prog-id=78 op=LOAD
Oct 18 21:50:05 localhost.localdomain kernel: audit: type=1334 audit(1666147805.680:1099): prog-id=78 op=LOAD
Oct 18 21:50:05 localhost.localdomain systemd[1]: Starting Fingerprint Authentication Daemon...
Oct 18 21:50:05 localhost.localdomain systemd[1]: Started Fingerprint Authentication Daemon.
Oct 18 21:50:05 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=su>
Oct 18 21:50:05 localhost.localdomain kernel: audit: type=1130 audit(1666147805.824:1100): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostna>
Oct 18 21:50:05 localhost.localdomain NetworkManager[1124]: <info>  [1666147805.9084] agent-manager: agent[d9efffa03db16335,:1.94/org.gnome.Shell.NetworkAgent/1000]: agent registered
Oct 18 21:50:20 localhost.localdomain apcupsd[1798]: Communications with UPS lost.
Oct 18 21:50:36 localhost.localdomain systemd[1]: fprintd.service: Deactivated successfully.
Oct 18 21:50:36 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=suc>
Oct 18 21:50:36 localhost.localdomain kernel: audit: type=1131 audit(1666147836.039:1101): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostna>
Oct 18 21:50:36 localhost.localdomain audit: BPF prog-id=0 op=UNLOAD
Oct 18 21:50:36 localhost.localdomain kernel: audit: type=1334 audit(1666147836.053:1102): prog-id=0 op=UNLOAD
Oct 18 21:54:54 localhost.localdomain firefox.desktop[80629]: libva info: VA-API version 1.13.0
Oct 18 21:54:54 localhost.localdomain firefox.desktop[80629]: libva info: Trying to open /usr/lib64/dri/r600_drv_video.so
Oct 18 21:54:54 localhost.localdomain firefox.desktop[80629]: libva info: Found init function __vaDriverInit_1_13
Oct 18 21:54:54 localhost.localdomain firefox.desktop[80629]: ATTENTION: default value of option mesa_glthread overridden by environment.
Oct 18 21:54:54 localhost.localdomain firefox.desktop[80629]: libva info: va_openDriver() returns 0
-- Boot d2d46a81a0a94362a862e96238d70bf0 -- 

I created an issue in the freedesktop.org drm/amd tracker: https://gitlab.freedesktop.org/drm/amd/-/issues/2252.

This is specifically for ‘GPU lockup’ errors like:

radeon 0000:01:00.0: ring 3 stalled for more than 29736msec
radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000047c4a last fence id 0x0000000000047c68 on ring 3)

If you run into such an error, please add a comment to the linked issue. The more comments it gets, the better the chances of a fix.
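
For anyone who wants to check their own journal for this signature before commenting, here is a minimal Python sketch. It assumes the kernel messages look exactly like the two lines quoted above; the function name `find_lockups` and the `fence_gap` field are made up for illustration, and the gap is only the numeric difference between the two fence ids, not an official metric.

```python
import re
import sys

# Patterns modeled on the two radeon kernel messages quoted in this thread.
STALL_RE = re.compile(
    r"radeon (?P<pci>[0-9a-f:.]+): ring (?P<ring>\d+) "
    r"stalled for more than (?P<msec>\d+)msec"
)
LOCKUP_RE = re.compile(
    r"radeon (?P<pci>[0-9a-f:.]+): GPU lockup "
    r"\(current fence id (?P<cur>0x[0-9a-f]+) "
    r"last fence id (?P<last>0x[0-9a-f]+) on ring (?P<ring>\d+)\)"
)


def find_lockups(lines):
    """Return one dict per 'ring stalled' or 'GPU lockup' line found."""
    events = []
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            events.append({
                "kind": "stall",
                "pci": m["pci"],
                "ring": int(m["ring"]),
                "msec": int(m["msec"]),
            })
            continue
        m = LOCKUP_RE.search(line)
        if m:
            events.append({
                "kind": "lockup",
                "pci": m["pci"],
                "ring": int(m["ring"]),
                # Hypothetical field: last fence id minus current fence id.
                "fence_gap": int(m["last"], 16) - int(m["cur"], 16),
            })
    return events


if __name__ == "__main__":
    # Reads log text from standard input and prints each match.
    for event in find_lockups(sys.stdin):
        print(event)
```

Saved under a name of your choice (say `find_lockups.py`), it could be fed the kernel messages of the previous boot with `journalctl -k -b -1 | python3 find_lockups.py`.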

Hi folks (and thank you @foxhound for filing reports upstream… that was part of my question, where to actually file them!),

I never really found a solution for this problem, and if I could buy a cheap Intel Arc to be freed from this misery, I would in a heartbeat. But they’ve been out of stock everywhere for months, so we have to deal with what we have.

If you recall my posts above, the key factor I identified some months ago as multiplying the risk of such lockups tenfold is the system having been suspended (S3 sleep) and resumed.

Therefore, for the past 6 months or so, I have simply disabled automatic suspend on my machine (in the GNOME Control Center power management settings). I just leave the computer turned on and running idle all the time, which helps noticeably with stability, although it is a huge waste of energy :face_vomiting: but my productivity and sanity are worth much more than a few kilowatts.

At the same time, things seem to have become more stable after upgrading from Fedora 35 to Fedora 37: I don’t recall any crashes with video playback in Firefox 107+ or with other actions (such as opening websites with Epiphany) in December and January. That might just be coincidence or a placebo effect, though, or a result of my no longer suspending and of VA-API for H.264 being disabled by default in Fedora 37 for AMD graphics (at least until an RPM Fusion package is ready)…

… but I can never be 100% sure the issue is gone. After all, I’ve been seeing that error message for years.

To avoid poking the bear again, I have still not reactivated automatic suspending on that computer, because of the productivity-wrecking consequences of this eternal GPU lockup / fence stalling bug. I am not sure I wish to risk enabling the full VA-API-capable version of Mesa either, but maybe this is also a variable that people here can test?

Today I have encountered a regression in Mesa 23.1.x (compared to Mesa 23.0.x) in Fedora 38 with old onboard AMD graphics, that I reported over there.

Those versions of Mesa cause the “ring stalled” issue to occur even on a fresh boot, without doing anything more than logging into GNOME Shell and waiting a few seconds (or interacting with UI elements), no suspend/resume and no VA-API video decoding required to trigger the bug.

Some people around me have mocked me for still running old graphics cards, and they claim that recent AMD GPUs haven’t been causing such crashes in years, but I somehow find that hard to believe. The rule with Linux graphics drivers used to be “don’t buy something too recent, it will not be supported”… That said, if folks here have recommendations for inexpensive (~100$?) PCI-E AMD GPUs that can be purchased new and retrofitted into old desktop computers, and can confirm they are much more stable, I’d be willing to just throw money at the problem to avoid those pervasive freezes… if it is known to work better with brand new hardware.

Note that you are repeatedly posting on threads that are ancient (this one is about Fedora 35, is a year old, and the last two posts were by you, 3 months and 8 months after the previous post).

Unless you have a very good reason to post on a necro, inactive thread (and one about an OS that has been EOL for nearly a year), it is much better, and more likely to get productive responses, to start a new thread for the Fedora version you are using and the specific errors you are seeing.

I do not fully get what your point is here. You wrote about the integrated graphics of a laptop and then jumped to discrete desktop PCIe cards.
Anyway, as for discrete PCIe graphics cards, I have now switched to an RX460 and an RX550 that I got for $25 and $35 respectively. As long as you are not a gamer, those work great under Fedora 38. The RX550 draws so little power that lm_sensors shows 4 W, and its fan is not spinning most of the time.