After downgrading or upgrading the kernel (tested with 5.8.18 and 5.10.rc5), the next boot is stuck at “A start job is running for /dev/disk/by-uuid/x (…s / no limit)”, where x is the UUID of the root partition (/). The timer keeps increasing, but even after 5 minutes the boot does not continue.
The system is a Thinkpad T495 with a fairly stock Silverblue 33 (upgraded from F32). The swap was removed by deleting both its fstab entry and the swap partition itself.
My swap was already removed when I tried the upgrade, and I got the same behaviour you are describing.
My system is encrypted, but it never asked for the password – I supposed that was the problem. Upgrading to a full Rawhide got me a working 5.10rc5 system.
Nice to know I’m not alone. So you removed the swap before the upgrade? It would be interesting to understand the underlying issue.
EDIT: I’ve tried booting with debug enabled and without the log level restriction, but the messages before the final “A start job is running…” are not helpful. What options do we have to debug this issue?
EDIT2: Tested again with 5.9.11 override (just a minor version over 5.9.10) and the same issue is visible.
EDIT3: Ok, now it gets really confusing. 5.9.11 was released today. The exact same packages that I tried to override were installed during the update and the boot is successful.
EDIT4: The sole difference between the two operations is that rpm-ostree override generates an initramfs, while rpm-ostree update skips it. This seems to be the root cause.
The sole difference between the two operations is that rpm-ostree override generates an initramfs, while rpm-ostree update skips it. This seems to be the root cause.
Right.
Does this also reproduce with just rpm-ostree initramfs --enable?
Hmm… so the UUID that the boot is stalling on isn’t listed among the active block devices. I wonder if it’s something like the UUID of the previous boot’s zram0, and the dracut run somehow picked that up?
I just tried enabling zram-generator on FCOS quickly and didn’t see this happen though.
Or does that UUID look like the one of the previous swap device you enabled?
Is that UUID present in /proc/cmdline, or does grep -r e8673f81-4cb2-44b7-ad6a-268bdd36ff5a /etc find it?
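The two checks above can be rolled into one small helper. This is only a sketch, assuming a POSIX shell and a grep that supports -r; find_uuid_refs is a made-up name for illustration, not part of rpm-ostree or dracut.

```shell
# Hypothetical helper: print every file under the given paths that mentions UUID.
# Usage: find_uuid_refs UUID PATH...
find_uuid_refs() {
    uuid=$1
    shift
    # -r recurse into directories, -l print only file names,
    # -s stay quiet about unreadable files, -F treat the UUID as a literal string.
    # Piping through sort gives a stable order and a zero exit status
    # even when nothing matches.
    grep -rlsF -- "$uuid" "$@" | sort
}
```

On the affected machine it could be called as find_uuid_refs e8673f81-4cb2-44b7-ad6a-268bdd36ff5a /proc/cmdline /etc (run as root so that all of /etc is readable). Empty output would mean the UUID appears in neither place.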
EDIT: The generated grub.cfg entries of a non-functional boot (generated from override) and a successful boot are exactly the same, besides the different deployment IDs.
Unfortunately not. I’ve set SELinux permanently to “Permissive” after filing the bug report to continue with testing. The denial is still visible in the audit log but with a Permissive=1 attribute.
I gave up on my old system. Downloaded Silverblue 33. Complete fresh system, this time with btrfs and no swap. Updated it. Reboot. Override kernel. Boot is stuck on “A start job is running for /dev/disk/by-uuid”.
Could it have something to do with my main drive being an NVMe drive (/dev/nvme)?
If you can boot into the original install before updating, you could check the UUID of the NVMe drive to confirm that it is the device the boot hangs on. The bootable installer image can be used for the same check.
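One way to do that check from a working boot or the live installer is to resolve the UUID through the /dev/disk/by-uuid symlinks. A minimal sketch, assuming a POSIX shell and a readlink that supports -f; uuid_to_device is a hypothetical helper name.

```shell
# Hypothetical helper: resolve a filesystem UUID to the device node it points at.
# The optional second argument overrides the symlink directory
# (it defaults to /dev/disk/by-uuid).
uuid_to_device() {
    uuid=$1
    dir=${2:-/dev/disk/by-uuid}
    if [ -e "$dir/$uuid" ]; then
        # Follow the symlink to the real device, e.g. an NVMe partition.
        readlink -f "$dir/$uuid"
    else
        echo "no block device with UUID $uuid" >&2
        return 1
    fi
}
```

Calling uuid_to_device with the UUID from the stuck boot message should print the root partition's device node if the UUID is valid. For an overview of all devices, lsblk -o NAME,UUID,FSTYPE,MOUNTPOINT shows the same mapping in table form.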
The fresh Silverblue 33 installation is in a functional state, even after updating and rebooting. The problem appears once an override triggers a “regenerate initramfs”. The boot afterwards is stuck at “A start job is running for /dev/disk/by-uuid” with a valid ID of the root partition.