Hi guys, Fedora 32 Xfce worked very well on my MacBook Pro 13″ (Early 2015) until the kernel upgrade from 5.8.4 to 5.8.6 (and later 5.8.9), after which the LUKS-encrypted LVM ext4 /home volume could no longer be mounted.
Hi,
it looks like your problem is actually with the root FS, not /home.
Looking at those boot messages, it seems like there is filesystem corruption on an XFS filesystem (apparently root), so it fails to mount. The home, efi, etc. errors are dependency failures, probably because it can’t mount /home if there’s no /. Have you tried to fsck/xfs_repair that filesystem? Can you still boot with kernel 5.8.4?
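In case it helps, here is a minimal sketch of that check from a live session. The device path is an assumption; adjust it to your layout, and only run repair tools on an unmounted filesystem:

```shell
# Dry-run check of an XFS filesystem. Pass your root device, e.g.
# /dev/mapper/fedora-root (an assumed name; check lsblk). Run this from
# a live system with the filesystem unmounted.
check_xfs() {
    dev="$1"
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    # -n = no modify: report problems without touching the disk
    sudo xfs_repair -n "$dev"
}
# usage: check_xfs /dev/mapper/fedora-root
# if it reports corruption, run xfs_repair again without -n to repair
```

For the ext4 volumes, the equivalent dry run is `fsck.ext4 -n`.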
Since I didn’t know what was wrong, I tried a few things:
Downloaded a netinstall ISO and reinstalled the OS, keeping the LUKS LVM ext4 /home. The latest kernel at the time was 5.8.7; the boot still failed.
Downloaded an Xfce Live ISO and reinstalled the OS, again keeping the LUKS LVM ext4 /home. This left me on the same kernel as the ISO, 5.6.6-300, which is the one I’m using now, and it works very well.
Updated the kernel to the latest version (5.8.9-200); it failed to boot again.
So I have to boot with kernel 5.6.6 for now; I obviously lost kernel 5.8.4 when I reinstalled the OS.
OK, that’s very strange, because the boot messages imply a filesystem error, and that should be there regardless of which kernel you’re using. Can you
fsck your filesystems (particularly the XFS one)
check if you can boot with a newer kernel when you only mount the necessary partitions (/, /boot, possibly /boot/efi)? You can just comment out everything else in /etc/fstab. You’ll have an empty home directory, so logging in won’t work, but it should not hinder the boot.
That will be helpful. Probably best to also fsck /boot and /boot/efi.
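To comment out the non-essential entries without editing by hand, something like this should work. It’s a sketch: it assumes swap is listed with mount point none, and it keeps /, /boot, and /boot/efi untouched:

```shell
# Comment out every fstab entry whose mount point is not /, /boot,
# /boot/efi, or none (swap). Already-commented and blank lines are left
# alone. Takes the file as $1 so you can try it on a copy first.
comment_nonessential() {
    sed -i -E '\@[[:space:]](/|/boot|/boot/efi|none)[[:space:]]@!{/^[[:space:]]*(#|$)/!s/^/#/}' "$1"
}
# usage:
#   sudo cp /etc/fstab /etc/fstab.bak   # keep a backup
#   comment_nonessential /etc/fstab     # run as root on the real file
```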
But all this is very weird.
FS errors should not be kernel-dependent
the system is trying (and failing) to activate dm-raid sets when you have no RAID array
fsck is failing for /boot & /boot/efi, when the system obviously has no issue booting from them
Since you have no RAID, you can disable dmraid-activation.service. I doubt that it is the underlying cause, but best to cross it off the list.
You can also check the journal for the failed boot (you can also do that from the live system by mounting the root FS and using journalctl --directory=<mountpoint>/var/log/journal), some of these failing services hopefully dumped some more information there.
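For reference, the offline-journal check could look roughly like this (the LV name and mount point are assumptions; verify with lsblk first):

```shell
# Read the installed system's journal from a live session.
# $1 = root device, e.g. /dev/mapper/fedora-root (assumed name).
offline_journal() {
    dev="$1"; mnt=/mnt/sysroot
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    sudo mkdir -p "$mnt"
    sudo mount "$dev" "$mnt" || return 1
    # -b -1 = previous boot, -p err = priority "error" and worse
    sudo journalctl --directory="$mnt/var/log/journal" -b -1 -p err
}
# usage: offline_journal /dev/mapper/fedora-root
```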
Hm, unless my kinda tired eyes deceive me there are no failures of any kind in that log? Yet it clearly boots from 5.8.9, this is extremely weird. No more FS errors and RAID weirdness is a good thing, I guess, but I have to say I’m at a bit of a loss here.
F33 ISOs should have kernel 5.8.6; you could see if those boot, and whether you can unlock and mount your drives from within that live system.
I created an F33 Xfce Live USB and booted it successfully. The LUKS container can be decrypted as external storage, and both the LVM root (xfs) and LVM home (ext4) can be mounted read-write. So far so good.
Then I started the F33 (pre-release) installation procedure: reformatted the LVM root with xfs and kept the LVM home (ext4). The installation went perfectly, after which I tried booting F33 (kernel 5.8.6). Unfortunately, the issue was still there.
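For anyone reproducing that live-session check, it can be sketched like this; the partition and VG/LV names are assumptions, so verify them with lsblk and lvs first:

```shell
# Unlock the LUKS container and mount both logical volumes from a live
# session. /dev/sda3 and VG "fedora" are assumptions; check lsblk/lvs.
unlock_and_mount() {
    dev="$1"; vg="$2"; mnt="$3"
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    sudo cryptsetup open "$dev" luks-root &&   # prompts for passphrase
    sudo vgchange -ay "$vg" &&                 # activate the volume group
    sudo mount "/dev/$vg/root" "$mnt" &&
    sudo mount "/dev/$vg/home" "$mnt/home"
}
# usage: unlock_and_mount /dev/sda3 fedora /mnt
```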
See the journal here: https://paste.centos.org/view/b88fed31
I repeated the F33 installation, but reformatted the LVM root with ext4 and kept the LVM home (ext4). The failure seemed to be the same.
See the journal here: https://paste.centos.org/view/f28de1e3
Finally, I reinstalled F32 again, reformatting the LVM root with ext4 and keeping the LVM home (ext4). F32 (kernel 5.6.6) works very well.
It’s still very weird to me that those errors do not seem to be reflected in the log. Also interesting that the F33 live system boots - so 5.8.6 works when run from the live system, but not when run from an installed root FS. Maybe something to do with some kernel module that went missing?
Anyways, given that we’ve successfully excluded any filesystem problems or anything related to the /home filesystem, this really does look like some weird kernel bug.
I’d suggest you update your bug report with the info that
this error persists after a system reinstall with both F32 & F33
5.8.6 works from the live but not from the installed system
the error is independent of the root FS (ext4/xfs)
and independent of whether /home gets mounted or not
and, hopefully the kernel devs will have some ideas.
I am assuming that you open LUKS after the system has fully booted.
Can you list all packages to be upgraded?
My point is that the problem may be related to dracut and the initramfs. The live CD has a proper initramfs, and after loading the kernel you use tools from / to decrypt LUKS. But the initramfs that dracut generates for the installed system may not be able to decrypt your / from LUKS.
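One way to test that theory is to look at what dracut packed into the installed kernel’s initramfs. A sketch (the image path and kernel version are assumptions; `lsinitrd` ships with dracut):

```shell
# Check whether an initramfs listing mentions the crypt/LVM dracut
# modules. Feed it lsinitrd output, e.g.:
#   lsinitrd /boot/initramfs-5.8.9-200.fc32.x86_64.img | has_crypt_modules
has_crypt_modules() {
    grep -qE 'crypt|lvm'
}
# If the modules are missing, regenerating the initramfs may help:
#   sudo dracut --force --kver 5.8.9-200.fc32.x86_64
```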
Looks similar to this one; it happened in Rawhide once. Your log shows lots of lsetfilecon errors, and the EFI partition, being vfat, should not be able to carry SELinux labels. So I guess grub is not being run properly for the new kernel.
You are right. The log shows that LUKS is decrypted and swap is activated too. The initrd has reached the end of its duty and / is loaded.
I just installed a fresh F33 in a KVM instance, and it seems your boot stops at “Starting Create Static Device Nodes in /dev”, so the coldplug-devices and hardware-database-rebuild steps are not processed at all.
Then it looks like a driver issue, and I think it is safe to submit a bug report to Bugzilla.
I noticed that this was related to booting off of an SD card, but I am having the same issue with the internal SSD. In the OP’s bug thread, it seems a kernel patch has been committed fixing this issue; I’m wondering whether the OP still experiences it on kernel 5.8.14 or 5.8.15?