Boot hangs after upgrading to kernel 6.8.10 - cannot boot into previous kernels, only rescue

After upgrading to 6.8.10, I can no longer boot into my system.

I have two other kernels installed, 6.8.9-300 and 6.8.8-300, and a rescue kernel, 6.8.5-301. I can only boot into the rescue kernel.

Here is the output of journalctl -b -1 | grep -i failed, run from the rescue kernel to check the logs of the last failed boot:

# journalctl -b -1 | grep -i failed

May 24 16:37:24 lemp13 kernel: ACPI: _OSC evaluation for CPUs failed, trying _PDC
May 24 16:37:37 lemp13 kernel: intel-hid INT33D5:00: failed to enable HID power button
May 24 16:37:24 lemp13 kernel: ACPI: _OSC evaluation for CPUs failed, trying _PDC
░░ Subject: A start job for unit dev-disk-by\x2duuid-e31a2a9f\x2d3ed1\x2d4b72\x2db899\x2d21a5e141e2d1.device has failed
May 24 16:37:37 lemp13 kernel: intel-hid INT33D5:00: failed to enable HID power button
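
The unit name in the "start job ... has failed" line is systemd-escaped, which makes it hard to read. A minimal bash sketch for decoding it is below; `unit_to_path` is a hypothetical helper name, and it assumes a simple `.device` unit where `-` separates path components and `\x2d` stands for a literal dash. (`systemd-escape --unescape` exists for the same purpose.) Decoded, the unit points at the same UUID that appears in the rd.luks.uuid argument of the kernel cmdline in this post, which suggests the failed boot was waiting on the LUKS device.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn a systemd-escaped .device unit name into a /dev path.
unit_to_path() {
    local name=${1%.device}   # drop the unit suffix
    name=${name//-//}         # unescaped dashes separate path components
    printf '/%b\n' "$name"    # %b expands the \x2d escapes into literal dashes
}

unit_to_path 'dev-disk-by\x2duuid-e31a2a9f\x2d3ed1\x2d4b72\x2db899\x2d21a5e141e2d1.device'
# prints /dev/disk/by-uuid/e31a2a9f-3ed1-4b72-b899-21a5e141e2d1
```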

Here is the output of dmesg:

[    0.260209] ACPI: _OSC evaluation for CPUs failed, trying _PDC
[   16.776257] zram_generator::generator[1041]: modprobe "zram" failed, ignoring: code exit status: 1
[   16.776612] (sd-exec-[1011]: /usr/lib/systemd/system-generators/zram-generator failed with exit status 1.
[   16.985861] systemd[1]: mullvad-early-boot-blocking.service: Failed with result 'exit-code'.
[   16.986074] systemd[1]: Failed to start mullvad-early-boot-blocking.service - Mullvad early boot network blocker.
[   16.986249] audit: type=1130 audit(1716582547.337:4): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=mullvad-early-boot-blocking comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
[   23.516929] i915 0000:00:02.0: [drm] *ERROR* GT1: GSC proxy handler failed to init

I’ve tried the following solutions posted around the forum, but none have worked:

Removing resume from cmdline

I tried (1) removing resume from the cmdline, (2) running dracut, and (3) regenerating the grub config, as suggested in the post "Boot waits indefinitely after kernel update to 6.8.10" (#26, by hackguy).

FWIW, I didn’t need to remove the resume argument, because my cmdline didn’t have it… I mainly just removed quiet to see messages on boot.
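
For anyone following along, the three steps can be sketched roughly as below. The exact commands used in the linked post are not reproduced here, so treat these as an assumption: they are the usual Fedora tools (grubby, dracut, grub2-mkconfig), and `run` and `rebuild_boot` are hypothetical helper names. The sketch only prints the commands unless DRY_RUN=0 is set, because the real ones need root and change the boot configuration.

```bash
#!/usr/bin/env bash
# Print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# $1 is the full kernel release string of the kernel that fails to boot
# (an assumption: take it from the installed kernel package, not from this sketch).
rebuild_boot() {
    local kver=$1
    run grubby --update-kernel=ALL --remove-args=quiet   # show boot messages
    run dracut --force --kver "$kver"                    # rebuild that kernel's initramfs
    run grub2-mkconfig -o /boot/grub2/grub.cfg           # regenerate the grub config
}
```

Calling `rebuild_boot "<kernel release>"` lists what would be run; `DRY_RUN=0 rebuild_boot "<kernel release>"` as root actually runs it.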

Here’s the output of my /etc/default/grub, /etc/kernel/cmdline, and /proc/cmdline when I was in the 6.8.5-301 rescue kernel:

# cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rd.luks.uuid=luks-e31a2a9f-3ed1-4b72-b899-21a5e141e2d1 rhgb"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true

# cat /etc/kernel/cmdline
root=UUID=c9edf42a-5b88-41d4-ba7f-6574091e09a3 ro rootflags=subvol=root rd.luks.uuid=luks-e31a2a9f-3ed1-4b72-b899-21a5e141e2d1 rhgb 

# cat /proc/cmdline
BOOT_IMAGE=(hd1,gpt2)/vmlinuz-0-rescue-7b07cf6d588547bb9800b4b11ecf0839 root=UUID=c9edf42a-5b88-41d4-ba7f-6574091e09a3 ro rootflags=subvol=root rd.luks.uuid=luks-e31a2a9f-3ed1-4b72-b899-21a5e141e2d1 rhgb
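
To confirm that none of these carry a resume argument without reading them by eye, a small bash check along these lines could be used. It is only a sketch; `has_resume_arg` is a hypothetical helper that looks for a word starting with `resume=` in a cmdline string.

```bash
#!/usr/bin/env bash
# Return 0 if the given kernel command line contains a resume=... argument.
has_resume_arg() {
    local -a args
    local arg
    read -r -a args <<< "$1"          # split the cmdline on whitespace
    for arg in "${args[@]}"; do
        case $arg in
            resume=*) return 0 ;;     # e.g. resume=UUID=...
        esac
    done
    return 1
}
```

For example, `has_resume_arg "$(cat /proc/cmdline)" && echo "resume is set" || echo "no resume argument"` checks the running kernel, and the same call works on the contents of /etc/kernel/cmdline.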

Reinstalling 6.8.10-300 in a chrooted environment

I chrooted into my system using a live usb and reinstalled 6.8.10-300.
This didn’t solve it.

In the rescue kernel, the network drivers aren’t loaded:

# lspci -k
00:00.0 Host bridge: Intel Corporation Device 7d02 (rev 04)
	Subsystem: CLEVO/KAPOK Computer Device 2624
00:02.0 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Graphics] (rev 08)
	DeviceName: VGA compatible controller
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: i915
00:06.0 PCI bridge: Intel Corporation Device 7eca (rev 10)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: pcieport
00:07.0 PCI bridge: Intel Corporation Meteor Lake-P Thunderbolt 4 PCI Express Root Port #0 (rev 10)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: pcieport
00:0a.0 Signal processing controller: Intel Corporation Device 7d0d (rev 01)
	Subsystem: CLEVO/KAPOK Computer Device 2624
00:0d.0 USB controller: Intel Corporation Meteor Lake-P Thunderbolt 4 USB Controller (rev 10)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: xhci_hcd
00:0d.2 USB controller: Intel Corporation Meteor Lake-P Thunderbolt 4 NHI #0 (rev 10)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: thunderbolt
00:14.0 USB controller: Intel Corporation Meteor Lake-P USB 3.2 Gen 2x1 xHCI Host Controller (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: xhci_hcd
00:14.2 RAM memory: Intel Corporation Device 7e7f (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
00:14.3 Network controller: Intel Corporation Meteor Lake PCH CNVi WiFi (rev 20)
	Subsystem: Intel Corporation Wi-Fi 6E AX211 160MHz
00:15.0 Serial bus controller: Intel Corporation Meteor Lake-P Serial IO I2C Controller #0 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-lpss
00:15.1 Serial bus controller: Intel Corporation Meteor Lake-P Serial IO I2C Controller #1 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-lpss
00:19.0 Serial bus controller: Intel Corporation Meteor Lake-P Serial IO I2C Controller #4 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-lpss
00:19.1 Serial bus controller: Intel Corporation Meteor Lake-P Serial IO I2C Controller #5 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-lpss
00:1c.0 PCI bridge: Intel Corporation Device 7e38 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: pcieport
00:1f.0 ISA bridge: Intel Corporation Device 7e03 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
00:1f.3 Audio device: Intel Corporation Meteor Lake-P HD Audio Controller (rev 20)
00:1f.4 SMBus: Intel Corporation Meteor Lake-P SMBus Controller (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
00:1f.5 Serial bus controller: Intel Corporation Meteor Lake-P SPI Controller (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller S4LV008[Pascal]
	Subsystem: Samsung Electronics Co Ltd Device a801
	Kernel driver in use: nvme
2d:00.0 SD Host controller: O2 Micro, Inc. SD/MMC Card Reader Controller (rev 01)
	Subsystem: O2 Micro, Inc. Device 0002
	Kernel driver in use: sdhci-pci

However, the network drivers loaded fine in previous kernels and in the live usb. This is the output of lspci -k from the live usb:

00:00.0 Host bridge: Intel Corporation Device 7d02 (rev 04)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: igen6_edac
	Kernel modules: igen6_edac
00:02.0 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Graphics] (rev 08)
	DeviceName: VGA compatible controller
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: i915
	Kernel modules: i915, xe
00:06.0 PCI bridge: Intel Corporation Device 7eca (rev 10)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: pcieport
00:07.0 PCI bridge: Intel Corporation Meteor Lake-P Thunderbolt 4 PCI Express Root Port #0 (rev 10)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: pcieport
00:0a.0 Signal processing controller: Intel Corporation Device 7d0d (rev 01)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel_vsec
	Kernel modules: intel_vsec
00:0d.0 USB controller: Intel Corporation Meteor Lake-P Thunderbolt 4 USB Controller (rev 10)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: xhci_hcd
00:0d.2 USB controller: Intel Corporation Meteor Lake-P Thunderbolt 4 NHI #0 (rev 10)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: thunderbolt
	Kernel modules: thunderbolt
00:14.0 USB controller: Intel Corporation Meteor Lake-P USB 3.2 Gen 2x1 xHCI Host Controller (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: xhci_hcd
00:14.2 RAM memory: Intel Corporation Device 7e7f (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
00:14.3 Network controller: Intel Corporation Device 7e40 (rev 20)
	Subsystem: Intel Corporation Device 0094
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi
00:15.0 Serial bus controller: Intel Corporation Meteor Lake-P Serial IO I2C Controller #0 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-lpss
00:15.1 Serial bus controller: Intel Corporation Meteor Lake-P Serial IO I2C Controller #1 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-lpss
00:19.0 Serial bus controller: Intel Corporation Meteor Lake-P Serial IO I2C Controller #4 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-lpss
00:19.1 Serial bus controller: Intel Corporation Meteor Lake-P Serial IO I2C Controller #5 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-lpss
00:1c.0 PCI bridge: Intel Corporation Device 7e38 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: pcieport
00:1f.0 ISA bridge: Intel Corporation Device 7e03 (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
00:1f.3 Audio device: Intel Corporation Meteor Lake-P HD Audio Controller (rev 20)
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel, snd_sof_pci_intel_mtl
00:1f.4 SMBus: Intel Corporation Meteor Lake-P SMBus Controller (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: i801_smbus
	Kernel modules: i2c_i801
00:1f.5 Serial bus controller: Intel Corporation Meteor Lake-P SPI Controller (rev 20)
	Subsystem: CLEVO/KAPOK Computer Device 2624
	Kernel driver in use: intel-spi
	Kernel modules: spi_intel_pci
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller S4LV008[Pascal]
	Subsystem: Samsung Electronics Co Ltd Device a801
	Kernel driver in use: nvme
	Kernel modules: nvme
2d:00.0 SD Host controller: O2 Micro, Inc. SD/MMC Card Reader Controller (rev 01)
	Subsystem: O2 Micro, Inc. Device 0002
	Kernel driver in use: sdhci-pci
	Kernel modules: sdhci_pci
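
Comparing two long lspci -k listings by eye is error-prone, so here is a rough bash/awk sketch that lists only the devices with no "Kernel driver in use" line. `devices_without_driver` is a hypothetical helper name, and it assumes the lspci -k layout shown above, where device lines start at column 0 and detail lines are indented.

```bash
#!/usr/bin/env bash
# List devices from `lspci -k` output that have no bound kernel driver.
devices_without_driver() {
    awk '
        /^[0-9a-f]/ {                          # a new device entry begins
            if (dev != "" && !bound) print dev
            dev = $0; bound = 0
            next
        }
        /Kernel driver in use:/ { bound = 1 }  # current device has a driver
        END { if (dev != "" && !bound) print dev }
    ' "$@"
}
```

Running `lspci -k | devices_without_driver` in both the rescue kernel and the live usb, then diffing the two results, would show which devices lost their driver (here, at least the 00:14.3 network controller).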

Removed audio

Somehow this got fixed, and I don’t know what fixed it. I was in the chrooted environment, just upgraded the system packages, and rebooted… and it rebooted fine.

False alarm?


Don’t think so, but kernel 6.8.10 has been very strange.

Boot failures are definitely alarming. Some changes may require a power-off reboot or some other action that forces a reload of binary firmware blobs. On dual boot systems there can be issues that only appear if you boot one OS and then switch to another. Your journalctl and dmesg output can be discovered in searches by others having similar issues, which may help discover a pattern that can lead to understanding what went wrong.

You should run hardware checks to rule out memory or disk issues.