Intel HD 4600 GPU hang

So, I’m having a really nasty issue, and I’m beginning to wonder if it’s a bug in Intel’s kernel drivers or in Xorg.

When booting Fedora 35 normally, the system hangs so completely that I can’t even use the reset button on my motherboard about half the time. However, if I add “nomodeset” to the kernel parameters, the system boots OK, with the obvious caveat that the GPU is not being used at all. With that out of the way, I found this via “journalctl”:

Mar 24 14:44:49 localhost-live dbus-broker[819]: A security policy denied :1.34 to send method call /org/freedesktop/PackageK>
Mar 24 14:44:49 localhost-live dbus-broker[819]: A security policy denied :1.34 to send method call /org/freedesktop/PackageK>
Mar 24 14:44:49 localhost-live realmd[1364]: Loaded settings from: /usr/lib/realmd/realmd-defaults.conf /usr/lib/realmd/realm>
Mar 24 14:44:49 localhost-live realmd[1364]: holding daemon: startup
Mar 24 14:44:49 localhost-live realmd[1364]: starting service
Mar 24 14:44:49 localhost-live realmd[1364]: connected to bus
Mar 24 14:44:49 localhost-live realmd[1364]: GLib-GIO: _g_io_module_get_default: Found default implementation local (GLocalVf>
Mar 24 14:44:49 localhost-live realmd[1364]: released daemon: startup
Mar 24 14:44:49 localhost-live realmd[1364]: claimed name on bus: org.freedesktop.realmd
Mar 24 14:44:49 localhost-live systemd[1]: Started Realm and Domain Configuration.
Mar 24 14:44:49 localhost-live audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init>
Mar 24 14:44:49 localhost-live systemd[1]: Starting Manage, Install and Generate Color Profiles...
Mar 24 14:44:49 localhost-live /usr/libexec/gdm-wayland-session[1047]: dbus-daemon[1047]: [session uid=42 pid=1047] Activatin>
Mar 24 14:44:49 localhost-live /usr/libexec/gdm-wayland-session[1047]: dbus-daemon[1047]: [session uid=42 pid=1047] Activatin>
Mar 24 14:44:49 localhost-live /usr/libexec/gdm-wayland-session[1047]: dbus-daemon[1047]: [session uid=42 pid=1047] Successfu>
Mar 24 14:44:49 localhost-live spice-vdagent[1383]: vdagent virtio channel /dev/virtio-ports/com.redhat.spice.0 does not exis>
Mar 24 14:44:49 localhost-live gnome-session-binary[1048]: Entering running state
Mar 24 14:44:50 localhost-live systemd[1]: Started Manage, Install and Generate Color Profiles.
Mar 24 14:44:50 localhost-live audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init>
Mar 24 14:44:53 localhost-live kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 7:0:00000000
Mar 24 14:44:53 localhost-live kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
Mar 24 14:44:56 localhost-live kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 7:0:00000000
Mar 24 14:44:56 localhost-live kernel: i915 0000:00:02.0: [drm] *ERROR* failed to set rcs0 head to zero ctl 00000000 head 000>
Mar 24 14:44:56 localhost-live kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
Mar 24 14:44:59 localhost-live kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 7:0:00000000
Mar 24 14:44:59 localhost-live kernel: i915 0000:00:02.0: [drm] *ERROR* failed to set rcs0 head to zero ctl 00000000 head 000>
Mar 24 14:44:59 localhost-live kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
Mar 24 14:45:02 localhost-live kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 7:0:00000000
Mar 24 14:45:02 localhost-live kernel: i915 0000:00:02.0: [drm] *ERROR* failed to set rcs0 head to zero ctl 00000000 head 000>
Mar 24 14:45:02 localhost-live kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
Mar 24 14:45:05 localhost-live kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 7:0:00000000
Mar 24 14:45:05 localhost-live kernel: i915 0000:00:02.0: [drm] *ERROR* failed to set rcs0 head to zero ctl 00000000 head 000>
Mar 24 14:45:05 localhost-live kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0
Mar 24 14:45:08 localhost-live kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 7:0:00000000
Mar 24 14:45:08 localhost-live kernel: i915 0000:00:02.0: [drm] *ERROR* failed to set rcs0 head to zero ctl 00000000 head 000>
Mar 24 14:45:08 localhost-live kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for stopped heartbeat on rcs0

It seems the GPU is hanging upon loading into GNOME. Keep in mind, I’ve also had this issue (and variants of this exact issue) on other Linux distros, so maybe this is something upstream that’s causing it.
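
For anyone wanting to pull just the i915 lines out of a long journal, here is a minimal sketch. It assumes the kernel log was first saved to a file (for example with `journalctl -k -b > kernel.log`); the helper name `summarize_i915` and the file name are made up for illustration.

```shell
# Hypothetical helper: count i915 hang and reset messages in a saved kernel log.
# Assumes the log was saved beforehand, e.g.: journalctl -k -b > kernel.log
summarize_i915() {
    local log="$1"
    local hangs resets
    # grep -c prints the number of matching lines (0 if there are none)
    hangs=$(grep -c 'i915 .*GPU HANG' "$log")
    resets=$(grep -c 'i915 .*Resetting' "$log")
    echo "hangs=$hangs resets=$resets"
}

# Usage:
#   journalctl -k -b > kernel.log
#   summarize_i915 kernel.log
```

Run against the excerpt above, this would report how often the driver hit a hang and how often it tried to reset rcs0, which is easier to compare between boots than the raw log.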

That also being said, I did try to pull a Xorg.X.log from /var/log, and there are only logs that are generated on successful boots with nomodeset, so Xorg is crashing before it can generate any logs whatsoever.

I have tried following the Intel Graphics Arch Linux wiki, and enabled “intel_idle.max_cstate=1 i915.enable_dc=0 ahci.mobile_lpm_policy=1” in various combinations, and on any kernel newer than 4.18 (tested in Red Hat clones), the same crash/hard freeze occurs.
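
When testing parameters in various combinations, it can help to confirm which ones actually reached the running kernel. The following is a rough sketch, not an official tool: `check_kargs` is a hypothetical helper that reads a kernel command line file (normally `/proc/cmdline`) and reports each parameter as set or missing.

```shell
# Hypothetical helper: report whether each parameter is on the kernel command line.
# First argument is the file to read (normally /proc/cmdline), the rest are parameters.
check_kargs() {
    local cmdline_file="$1"
    shift
    local cmdline arg
    # Pad with spaces so only whole parameters match, not substrings
    cmdline=" $(tr '\n' ' ' < "$cmdline_file") "
    for arg in "$@"; do
        case "$cmdline" in
            *" $arg "*) echo "$arg: set" ;;
            *)          echo "$arg: missing" ;;
        esac
    done
}

# Usage:
#   check_kargs /proc/cmdline intel_idle.max_cstate=1 i915.enable_dc=0 ahci.mobile_lpm_policy=1
# On Fedora, parameters are usually made persistent with grubby (run as root), e.g.:
#   grubby --update-kernel=ALL --args="i915.enable_dc=0"
```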

Here are my hardware specs for reference:

Gigabyte Z87X-UD5H motherboard, using F10e BIOS (also happens with F9, stock settings)
Intel Core i5-4670K, stock clocks
24GB of G.Skill DDR3-2133 RAM, currently set to XMP mode (issue still occurs regardless of RAM setting)
No discrete GPU.
3 HDDs (a mix of 1 FireCuda, 1 ST320, and ST3200 series), running in AHCI mode.

INXI output:

CPU: quad core Intel Core i5-4670K (-MCP-) speed/min/max: 1021/800/3800 MHz
Kernel: 5.16.16-200.fc35.x86_64 x86_64 Up: 39m Mem: 2196.7/23921.4 MiB (9.2%)
Storage: 3.94 TiB (0.1% used) Procs: 589 Shell: Bash inxi: 3.3.13

If anyone can assist with this, it would be greatly appreciated.

Edit: I managed to SSH into the system while hung, and I’ve attached the resulting dmesg output live.

Save dmesg output and content of /sys/class/drm/card0/error if you do it again.

What range of various versions of mesa and gnome (or other desktop environments) have you tested with those other distros?

Figured it out. Piped dmesg directly to nano, saved the output, and used fpaste to dropkick the output here: https://paste.centos.org/view/7a7bd160

/sys/class/drm/error doesn’t exist. All there is is /sys/class/drm/version.

Here are the versions of the Mesa libraries used on other distros. Keep in mind, I’ve only gotten 2 distros to install to HDD with nomodeset, so almost all of these are live ISOs:

MX Linux: 20.3.5, failed with and without kernel arguments, hard freeze after 2 minutes of use, no logs written, journalctl produced no errors

Manjaro, multiple editions: 21.3.7, hard freezes when loading Xorg, doesn’t even make it to the display manager. With the XFCE edition I was able to run journalctl, and got the same exact messages there, with slightly different addresses. If you want, I can post those screenshots as well.

Ubuntu 20.04: 20.0.4, I believe; the system freezes so hard that even the reset button on the motherboard fails to work.

EndeavourOS: 21.3.1, hard freezes as soon as XFCE loads, mouse moves. I managed to get the same error messages here as in Manjaro.

Rocky Linux/Alma Linux: passing intel_idle.max_cstate=1 i915.enable_dc=0 to the kernel, works perfectly here, though there are some unresolved stutters and minor glitches (DRI:3 is the default mode for this, probably to blame).

Also, I can do it right now, if needed. Fedora is installed on a separate drive from my Windows install, so it’s a breeze to swap. It’s a pain to time my SSH correctly before the hard freeze, but doable.

What kills me is that MesaGL is still perfectly functional, even with nomodeset passed to the kernel. glxinfo shows the correct OpenGL args for this iGPU, and glxgears runs at a crispy 2-3k fps with no issues. That may just be because it’s running through llvmpipe.
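
One way to check the llvmpipe suspicion is to look at the renderer string that glxinfo reports. This is only a sketch: `renderer_kind` is a hypothetical helper that reads `glxinfo -B` output on stdin and treats llvmpipe or softpipe as software rendering and anything else as hardware.

```shell
# Hypothetical helper: classify the OpenGL renderer from `glxinfo -B` output on stdin.
renderer_kind() {
    local line
    # Take the first renderer line, if there is one
    line=$(grep -m1 'OpenGL renderer string')
    case "$line" in
        '')                    echo unknown ;;   # no renderer line found
        *llvmpipe*|*softpipe*) echo software ;;  # Mesa's CPU rasterizers
        *)                     echo hardware ;;
    esac
}

# Usage:
#   glxinfo -B | renderer_kind
```

If this prints `software` under nomodeset, the glxgears numbers are coming from the CPU and say nothing about whether the iGPU itself works.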

You should be able to just run e.g. dmesg > hung_dmesg or dmesg > /home/YOUR_USERNAME/hung_dmesg and copy it later, after booting with nomodeset. The same goes for /sys/class/drm/card0/error (note the card0), which should contain a GPU dump only after it hangs.
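
As a rough sketch of the capture step, assuming you get a shell over SSH before the freeze: `save_gpu_error` below is a hypothetical helper, and the destination paths are placeholders. It copies the error file if it is readable and calls `sync` so the data reaches the disk even if the machine locks up right afterwards.

```shell
# Hypothetical helper: copy a GPU error state file somewhere persistent.
# $1 = source (e.g. /sys/class/drm/card0/error), $2 = destination file
save_gpu_error() {
    local src="$1" dest="$2"
    if [ -r "$src" ]; then
        cat "$src" > "$dest"
        sync   # flush to disk in case the machine freezes right after
        echo "saved $src to $dest"
    else
        echo "no readable file at $src"
        return 1
    fi
}

# Usage on the hung machine (as root; the paths under /root are placeholders):
#   dmesg > /root/hung_dmesg; sync
#   save_gpu_error /sys/class/drm/card0/error /root/hung_gpu_error
```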

I’d suspect a hardware defect, but if it’s working fine on Windows, it’s less likely.

Yeah, it’s 100% working on Windows 10, stress tested with Aida64, Prime95, and plenty of gaming. I have even successfully overclocked the iGPU to 1350 MHz on that side only, with very nice improvements.

I’ll grab those for ya, one moment.

SSH stopped working after I was able to get it working a whopping 2 times; the system now just hard freezes before I can connect. That being said, /sys/class/drm/card0 was empty, so no logs were written when the crash occurred. Here’s the full dmesg output I managed to get:

[ 0.000000] microcode: microcode updated early to revision 0x28, date = 2019-1 - Pastebin.com

[i915]Intel HD 4600 plagued with crashes and GPU Hang messages (#5432) · Issues · drm / intel · GitLab

I’ve gone ahead and reported it upstream to the freedesktop.org devs, as I suspect that, since this occurs on multiple distros, multiple versions of Mesa, and multiple kernels, something upstream broke support for the HD 4600 (GT2). Will keep this discussion updated as needed.

Please note that pastebin defaults to a 24-hour lifetime for your post, so nothing of what you posted is now accessible. Please use a different location to store large items that are pertinent to your issue and should remain available long term.

In most cases you can select the pertinent data from the file and post that in the </> Preformatted text tags here so it is available as long as the thread is available.

The pertinent text is listed at: [i915]Intel HD 4600 plagued with crashes and GPU Hang messages (#5432) · Issues · drm / intel · GitLab

Thank you.