System freezing and unresponsive after upgrading to kernel 5.19

Hi all,

After upgrading my laptop (ASUS ROG Flow Z13) recently from 5.18.18 to 5.19.4, I’ve observed that the device is extremely unresponsive and laggy - as if something was maxing out the CPU (but the CPU usage itself is normal). GDM takes over 30 seconds just to display my username; after logging in, the Activities overview and other animations take several seconds to activate, and there’s even lag in the terminal when entering commands and receiving the output.

At first it seemed like it could be related to the graphics drivers since I had the nVidia proprietary drivers installed and glxgears wouldn’t even run (no errors, just freezes the system), so I uninstalled everything related to nvidia (dnf erase *nvidia*), ensured that the intel drivers (xorg-x11-drv-intel) were installed, and even blacklisted nouveau via GRUB command line so the system would only use Intel drivers - but this didn’t resolve the issue. Switching between Xorg and Wayland made no difference either.

However, when I boot back into 5.18.18, it works just fine - with or without the nVidia drivers - so it looks like a possible regression in 5.19. I have no other proprietary drivers or kernel modules installed btw (except for multimedia codecs).

journalctl shows a bunch of errors but these were present in 5.18 as well, so I’m at a loss on how to proceed further.

Any help would be much appreciated.

Specifications:

System: ASUS ROG Flow Z13 GZ301ZC
CPU: Intel Core i7-12700H
GPU 1: Intel Alder Lake-P Integrated Graphics
GPU 2: nVidia GeForce RTX 3050 Mobile GA107M
DE: GNOME v42.4

lspci -nnk:

0000:00:00.0 Host bridge [0600]: Intel Corporation 12th Gen Core Processor Host Bridge/DRAM Registers [8086:4641] (rev 02)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
	Kernel driver in use: igen6_edac
	Kernel modules: igen6_edac
0000:00:01.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x16 Controller #1 [8086:460d] (rev 02)
	Kernel driver in use: pcieport
0000:00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:46a6] (rev 0c)
	DeviceName: Second VGA
	Subsystem: ASUSTeK Computer Inc. Device [1043:1a2c]
	Kernel driver in use: i915
	Kernel modules: i915
0000:00:04.0 Signal processing controller [1180]: Intel Corporation Alder Lake Innovation Platform Framework Processor Participant [8086:461d] (rev 02)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
	Kernel driver in use: proc_thermal_pci
	Kernel modules: processor_thermal_device_pci
0000:00:06.0 System peripheral [0880]: Intel Corporation RST VMD Managed Controller [8086:09ab]
0000:00:07.0 PCI bridge [0604]: Intel Corporation Alder Lake-P Thunderbolt 4 PCI Express Root Port #0 [8086:466e] (rev 02)
	Kernel driver in use: pcieport
0000:00:08.0 System peripheral [0880]: Intel Corporation 12th Gen Core Processor Gaussian & Neural Accelerator [8086:464f] (rev 02)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
0000:00:0a.0 Signal processing controller [1180]: Intel Corporation Platform Monitoring Technology [8086:467d] (rev 01)
	Kernel driver in use: intel_vsec
	Kernel modules: intel_vsec
0000:00:0d.0 USB controller [0c03]: Intel Corporation Alder Lake-P Thunderbolt 4 USB Controller [8086:461e] (rev 02)
	Kernel driver in use: xhci_hcd
0000:00:0d.2 USB controller [0c03]: Intel Corporation Alder Lake-P Thunderbolt 4 NHI #0 [8086:463e] (rev 02)
	Subsystem: Device [2222:1111]
	Kernel driver in use: thunderbolt
	Kernel modules: thunderbolt
0000:00:0e.0 RAID bus controller [0104]: Intel Corporation Volume Management Device NVMe RAID Controller [8086:467f]
	Subsystem: Intel Corporation Device [8086:0000]
	Kernel driver in use: vmd
	Kernel modules: vmd
0000:00:14.0 USB controller [0c03]: Intel Corporation Alder Lake PCH USB 3.2 xHCI Host Controller [8086:51ed] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:201f]
	Kernel driver in use: xhci_hcd
0000:00:14.2 RAM memory [0500]: Intel Corporation Alder Lake PCH Shared SRAM [8086:51ef] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
0000:00:14.3 Network controller [0280]: Intel Corporation Alder Lake-P PCH CNVi WiFi [8086:51f0] (rev 01)
	DeviceName: WLAN
	Subsystem: Intel Corporation Device [8086:0094]
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi
0000:00:15.0 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #0 [8086:51e8] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
	Kernel driver in use: intel-lpss
0000:00:15.1 Serial bus controller [0c80]: Intel Corporation Alder Lake PCH Serial IO I2C Controller #1 [8086:51e9] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
	Kernel driver in use: intel-lpss
0000:00:16.0 Communication controller [0780]: Intel Corporation Alder Lake PCH HECI Controller [8086:51e0] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
	Kernel driver in use: mei_me
	Kernel modules: mei_me
0000:00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:51be] (rev 01)
	Kernel driver in use: pcieport
0000:00:1f.0 ISA bridge [0601]: Intel Corporation Alder Lake PCH eSPI Controller [8086:5182] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
0000:00:1f.3 Audio device [0403]: Intel Corporation Alder Lake PCH-P High Definition Audio Controller [8086:51c8] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel, snd_sof_pci_intel_tgl
0000:00:1f.4 SMBus [0c05]: Intel Corporation Alder Lake PCH-P SMBus Host Controller [8086:51a3] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
	Kernel driver in use: i801_smbus
	Kernel modules: i2c_i801
0000:00:1f.5 Serial bus controller [0c80]: Intel Corporation Alder Lake-P PCH SPI Controller [8086:51a4] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1c42]
0000:01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA107M [GeForce RTX 3050 Mobile] [10de:25a2] (rev a1)
	DeviceName: VGA
	Subsystem: ASUSTeK Computer Inc. Device [1043:1a2c]
	Kernel driver in use: nouveau
	Kernel modules: nouveau
0000:01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:2291] (rev a1)
	Subsystem: ASUSTeK Computer Inc. Device [1043:1a2c]
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
0000:30:00.0 SD Host controller [0805]: Genesys Logic, Inc GL9755 SD Host Controller [17a0:9755] (rev 01)
	Subsystem: ASUSTeK Computer Inc. Device [1043:202f]
	Kernel driver in use: sdhci-pci
	Kernel modules: sdhci_pci
10000:e0:06.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x4 Controller #0 [8086:464d] (rev 02)
	Kernel driver in use: pcieport
10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp Device [15b7:5026]
	Subsystem: Sandisk Corp Device [15b7:5026]
	Kernel driver in use: nvme
	Kernel modules: nvme

lsusb:

Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 007: ID 0b05:18c6 ASUSTek Computer, Inc. N-KEY Device
Bus 003 Device 006: ID 13d3:5492 IMC Networks USB2.0 HD UVC WebCam
Bus 003 Device 005: ID 04f3:0c6e Elan Microelectronics Corp. ELAN:Fingerprint
Bus 003 Device 004: ID 0b05:1a30 ASUSTek Computer, Inc. N-KEY Device
Bus 003 Device 003: ID 0416:5020 Winbond Electronics Corp. USB Device
Bus 003 Device 002: ID 0c45:636d Microdia USB 2.0 Camera
Bus 003 Device 008: ID 8087:0033 Intel Corp. 
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
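
To confirm which kernel driver is actually bound to each GPU after removing or blacklisting a driver, the lspci -nnk output can be filtered down to the graphics devices. This is a minimal sketch; the function name gpu_drivers is made up for illustration, and it assumes the "Kernel driver in use:" line format shown in the listing above.

```shell
# Print "<PCI address> <driver>" for every VGA/3D controller found in
# `lspci -nnk` (or `lspci -k`) output read from stdin.
gpu_drivers() {
    awk '
        { c = substr($0, 1, 1) }
        # A non-indented line starts a new device entry.
        c != " " && c != "\t" {
            gpu = ($0 ~ /VGA compatible controller|3D controller/) ? $1 : ""
        }
        # Indented detail line belonging to the current GPU entry.
        gpu != "" && /Kernel driver in use:/ { print gpu, $NF }
    '
}

# Usage on a live system:
#   lspci -nnk | gpu_drivers
```

A GPU with no driver bound prints nothing, which is the expected result for a device whose driver was successfully blacklisted.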

There is a regression in kernel 5.19.4-200. You need to blacklist the asus_ec_sensors module or boot with 5.18.19-200.

Cheers for the reply. I don’t have asus_ec_sensors - instead, I’ve got asus_wmi and asus_nb_wmi - which I blacklisted, but unfortunately it didn’t make any difference.
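
When testing a blacklist, it can help to verify after a reboot that the module is really absent, since a blacklisted module can still end up loaded, for example as a dependency of another module. A minimal sketch, assuming the /proc/modules format (module name in the first column); the function name module_loaded is made up for illustration.

```shell
# Succeed (exit 0) if the named module appears in /proc/modules-style
# input read from stdin, fail (exit 1) otherwise.
# Note: /proc/modules lists names with underscores, not dashes.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage on a live system:
#   for m in asus_ec_sensors asus_wmi asus_nb_wmi; do
#       if module_loaded "$m" < /proc/modules; then
#           echo "$m is still loaded"
#       else
#           echo "$m is not loaded"
#       fi
#   done
```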

I’ve switched back to 5.18 for now, but if this is really a regression, would be nice if someone could guide me with collecting and submitting a proper bug report, because 6.0 rc3 wouldn’t even boot on my machine so I doubt this would be fixed unless someone submits a bug report.

The maintainer is trying a patch.

Just tried 6.0.0-0.rc5 and the issue is still present. Does anyone have a link to the bug report so that I can keep track of the progress?

I think I recently saw a note in another thread that kernel 5.19.9 solves this issue. The fix may not have made it into rawhide (6.0) kernels yet.

Thanks, I just tried 5.19.9-200 but can confirm the issue is still present.
Do you have a link to the bug report please? I suspect my issue isn’t related to the sensor module issue the other commenter was talking about.

Hi Dexter,

I saw your report on Bodhi. Do I understand you correctly that some of the regression tests fail? Both with and without nvidia?

First, do you use any external kernel drivers/modules other than nvidia? If you are unsure, try without nvidia and check cat /proc/sys/kernel/tainted → if this outputs 0, just proceed with the following (otherwise, let us know the output):

I suggest creating a bug report (preferably without nvidia, i.e. in the condition where tainted = 0).

Include a description of the overall problem (performance, reboot freeze, and so on), note since which kernel version it happens, and additionally provide the following information:

  • Logs of the test suites (both ./runtests.sh and ./runtests.sh -t performance)
  • kernel logs: journalctl --no-hostname -k > dmesg.txt → feel free to anonymize things like MAC addresses or user names if you perceive this to be private.
  • detailed hardware information
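
For the anonymization step, MAC addresses can be masked with a small filter before attaching the log. This is only a sketch: the function name redact_macs is made up for illustration, it only catches colon-separated MAC addresses, and user names or other private strings still have to be checked by hand.

```shell
# Replace anything that looks like a colon-separated MAC address
# (six pairs of hex digits) with a placeholder.
redact_macs() {
    sed -E 's/([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}/xx:xx:xx:xx:xx:xx/g'
}

# Usage on a live system:
#   journalctl --no-hostname -k | redact_macs > dmesg.txt
```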

https://bugzilla.redhat.com → file the bug against the component “kernel”; the bug report form may contain further questions that have to be answered. If you have questions, let us know.

Maybe the component will be changed by the assignee later, but for now, especially with failed kernel regression tests, it makes sense to start with the assumption that the kernel itself is the problem.

Also, let us know the bug report number here if possible.

Same here with both 5.19.4 and 5.19.9-200. The system is so unresponsive that I can do nothing…

I am using Nvidia proprietary driver:

VGA compatible controller: Intel Corporation Alder Lake-P Integrated Graphics Controller (rev 0c)

3D controller: NVIDIA Corporation GA107M [GeForce RTX 2050] (rev a1)

other hardware spec:
12th Gen Intel(R) Core™ i7-12700H

@dreamerlzl Can you check if the regression tests fail on your machine as well?

You can find the instructions here: QA:Testcase kernel regression - Fedora Project Wiki

Run both tests, ./runtests.sh and ./runtests.sh -t performance → be aware that the latter can take some hours. Don’t forget the semanage ... steps noted in the document. One is before the tests, and one after.

Let us know the output except the “vulnerability status” at the end.

If any test fails, get rid of nvidia, check if it is really removed from kernel space with cat /proc/sys/kernel/tainted (the output has to be 0 after removing nvidia), and then do the tests again without nvidia!
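
The taint value is a bitmask, so a non-zero result can be decoded to see why the kernel is tainted. A minimal sketch that decodes only the three module-related bits (proprietary, out-of-tree, unsigned); the function name decode_taint is made up for illustration, and any other bits are reported only as a leftover number.

```shell
# Decode the module-related bits of a kernel taint value.
decode_taint() {
    value="$1"
    if [ "$value" -eq 0 ]; then
        echo "not tainted"
        return 0
    fi
    if [ $(( value & 1 )) -ne 0 ]; then
        echo "bit 0: proprietary module was loaded"
    fi
    if [ $(( value & 4096 )) -ne 0 ]; then
        echo "bit 12: out-of-tree module was loaded"
    fi
    if [ $(( value & 8192 )) -ne 0 ]; then
        echo "bit 13: unsigned module was loaded"
    fi
    # Anything left over is outside the three bits handled here.
    rest=$(( value & ~(1 | 4096 | 8192) ))
    if [ "$rest" -ne 0 ]; then
        echo "other taint bits set: $rest"
    fi
    return 0
}

# Usage on a live system:
#   decode_taint "$(cat /proc/sys/kernel/tainted)"
```

Taint flags stay set until the next reboot, so the check is only meaningful after booting without the nvidia module.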

Again, let us know the output of the tests without nvidia as well (and again, remove the “vulnerability status”).

Just in case this ends up in a bug report: the outputs of the tests contain a line that begins with “Your log file is located at:” → save the log files to have them available for a bug report!
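
One way to keep those log files is to save the test output and pull the path out of that line. A minimal sketch, assuming the path follows the colon on the same line; the function name extract_log_path and the file and directory names in the usage comment are made up for illustration.

```shell
# Print the path(s) from "Your log file is located at:" lines on stdin.
extract_log_path() {
    sed -n 's/^Your log file is located at:[[:space:]]*//p'
}

# Usage (hypothetical file and directory names):
#   ./runtests.sh 2>&1 | tee runtests-output.txt
#   mkdir -p ~/kernel-test-logs
#   cp "$(extract_log_path < runtests-output.txt)" ~/kernel-test-logs/
```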