Configuring grub2 to list latest kernels on boot

When GRUB starts at boot, the following error message is shown:

error: …/…/grub-core/commands/loadenv.c:241:sparse file not allowed.

After a bit of searching, I found this AskFedora Q&A and this Red Hat Bugzilla report.
From those two links, I concluded that this is a conflict between the Btrfs filesystem and grub2: Btrfs somehow doesn’t let grub2 write to grubenv, where the default kernel choice is saved so that it is remembered the next time the system boots.

So I thought that giving up on this grub2 functionality (i.e. booting the default kernel of choice) and configuring grub2 to show the kernel list at startup would solve the problem, but I found no working parameter in the grub2 manual.

As far as I know, this issue can cause trouble at boot: if the default kernel ever runs into a critical problem, grub2 will keep loading the broken default kernel over and over, with no way of knowing that the kernel is unhealthy, because the Btrfs filesystem doesn’t let grub2 make changes to grubenv.

My current /etc/default/grub configuration is as follows:

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_SAVEDEFAULT="false"
#GRUB_DEFAULT=0
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="resume=UUID=bb658ee1-1717-46f6-8e32-db7472112093 rhgb quiet"
GRUB_DISABLE_RECOVERY="true"

PS. My partition scheme looks like this:

Now, how can I change the grub configuration to show the list of kernels instead of loading one by default?

The automatic partitioning scheme when installing Fedora with Btrfs is:
/boot/efi – vfat
/boot – ext4
/ & /home – btrfs subvolumes.

I suspect that if you ensured the kernel resided on an ext4 partition, grub would have no issues.

# ls /boot/grub2
fonts  grub.cfg  grubenv  i386-pc  themes


# ls /boot
config-5.14.16-201.fc34.x86_64                           memtest86+-5.31
config-5.14.17-201.fc34.x86_64                           symvers-5.14.18-200.fc34.x86_64.gz
config-5.14.18-200.fc34.x86_64                           System.map-5.14.16-201.fc34.x86_64
efi                                                      System.map-5.14.17-201.fc34.x86_64
elf-memtest86+-5.31                                      System.map-5.14.18-200.fc34.x86_64
extlinux                                                 vmlinuz-0-rescue-730854f859414ee8ab2aff2cbe878557
flask                                                    vmlinuz-5.14.16-201.fc34.x86_64
grub2                                                    vmlinuz-5.14.17-201.fc34.x86_64
initramfs-0-rescue-730854f859414ee8ab2aff2cbe878557.img  vmlinuz-5.14.18-200.fc34.x86_64
initramfs-5.14.16-201.fc34.x86_64.img                    xen-4.14.3.config
initramfs-5.14.17-201.fc34.x86_64.img                    xen-4.14.3.gz
initramfs-5.14.18-200.fc34.x86_64.img                    xen-4.14.gz
loader                                                   xen.gz
lost+found

Hi, I just want to know: did you enable hibernation with the resume parameter intentionally? If so, does the UUID point to the physical swap partition on /dev/sda4?


The issue results from grubenv being located on a Btrfs “boot” volume at /boot/grub2/grubenv, which is not the default configuration but is permitted by the installer. The consequence is that the GRUB pre-boot environment cannot modify grubenv. The one case where it does so is resetting boot_success to 0, so that GRUB knows whether to show or hide the GRUB menu when another environment variable, menu_auto_hide=1, is set.

If GRUB sees boot_success=0, it assumes the previous boot was not successful and shows the GRUB menu. If the value is 1, it assumes the boot was successful and does not show the menu, but it also resets the value to 0 so that, if this boot fails, the menu is shown the next time around. Since grubenv can’t be changed when it is on Btrfs (or ZFS, dm-crypt, mdadm RAID, or LVM), autohide always takes effect because the reset to 0 never happens. So one workaround, if you’re customizing /boot to be Btrfs anyway, is to disable autohide:

grub2-editenv - unset menu_auto_hide

Of course, dual booters always see the menu, since there’s code in the grub.cfg that detects multiboot setups and always shows the GRUB menu in that case.
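
The autohide decision described above can be sketched as a small shell function. This is a simplified model for illustration only: the function name menu_state is hypothetical, and it assumes the menu is hidden exactly when both menu_auto_hide and boot_success are 1. The logic in Fedora's actual grub.cfg checks more variables than these two.

```sh
#!/bin/sh
# Simplified model of the autohide decision described in this post.
# Reads "name=value" lines on stdin (the format of the grubenv variable list)
# and prints "hidden" or "shown".
# ASSUMPTION: only menu_auto_hide and boot_success are considered here.
menu_state() {
    auto_hide=""
    boot_success=""
    while IFS='=' read -r name value; do
        case "$name" in
            menu_auto_hide) auto_hide=$value ;;
            boot_success)   boot_success=$value ;;
        esac
    done
    if [ "$auto_hide" = "1" ] && [ "$boot_success" = "1" ]; then
        echo hidden
    else
        echo shown
    fi
}

# On a real system (as root), feed it the current environment block:
#   grub2-editenv - list | menu_state
```

If boot_success is stuck at 1 because GRUB could not reset it, this model keeps printing "hidden" until menu_auto_hide is unset.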

NOTE: The reason GRUB pre-boot can’t change grubenv on Btrfs is that it doesn’t write via a file system driver that knows how to properly update all relevant Btrfs metadata. GRUB just overwrites the sectors containing the grubenv contents with the new value(s), and Btrfs sees this as corruption because it checksums everything, including the grubenv file. There is a non-upstreamed patch by SUSE that stuffs the grubenv into the bootloader pad, which on Btrfs is quite significant in size and is an area that the bootloader can use exclusively and reliably. That area isn’t subject to btrfs check/scrub/balance and isn’t observable or modifiable from the file system. It could be an option to bring this into Fedora.


No! That’s my default configuration; I haven’t changed anything, and I believe it’s related to the Plymouth theme!

From what I’ve concluded so far, it’s more grub’s fault for not being able to work with Btrfs than the fault of the Btrfs filesystem itself, right?

And, based on the screenshot of my partition scheme in the main post, I don’t see a separate “/boot” partition on my system, only the root partition (i.e. “/”) on Btrfs and also an LVM group, so that may have caused the problem, right? What is /dev/sda2 then?

It is essentially because grub doesn’t fully support Btrfs for /boot. This is also true of other similar filesystems.


Interesting,

I want to replicate this. Is it only because of using Btrfs for /boot, or are there other steps?

As far as I can see, it is because /boot is btrfs. The default config for /boot is a separate partition that is ext4. It is explained by Chris Murphy in post #4. The system can boot, but because grub cannot handle writing to the btrfs file system it causes the problem.
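
To check which case a given system falls into, something like the following may help. It is only a sketch: the function name grubenv_writable is hypothetical, and the classification covers just the filesystem types named in this thread (ext4 as the default /boot, Btrfs and ZFS as problematic). Anything else is reported as unknown rather than guessed.

```sh
#!/bin/sh
# Classify a filesystem type by whether GRUB's pre-boot environment can
# rewrite grubenv on it, according to the explanation in this thread.
# ASSUMPTION: only the types mentioned in the thread are classified.
grubenv_writable() {
    case "$1" in
        btrfs|zfs) echo no ;;
        ext4)      echo yes ;;
        *)         echo unknown ;;
    esac
}

# On a real system, pass in the filesystem type that holds /boot:
#   grubenv_writable "$(findmnt -n -o FSTYPE --target /boot)"
```

Because findmnt is given --target, it also works when /boot is not a separate mount point and simply lives on the root filesystem.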


I just finished the installation. Here is the layout.

About the resume arguments

I remember now why I was so curious when I found the resume argument in the grub configuration. Before I used Fedora, I used Debian, and my Debian installation has a swap partition. When I then installed Fedora with the default Fedora 34 partition layout, no swap partition was created; it uses zram as swap.

Because Fedora (maybe) detected a physical swap partition during installation, it automatically activated the swap that actually belongs to Debian. The trouble comes when I use Debian and then switch to Fedora: Fedora (maybe) detects something on the physical swap partition, thinks it is from the last Fedora session, and loads it. The result is that abrt reports a lot of kernel-related warnings.

About the kernel list not showing on boot

I also remember that when I installed Fedora 34 as the only operating system on my laptop, the kernel boot list was not shown.

About the error on grubenv

After the installation, I still could not replicate the error. Are there other steps needed to replicate it?

According to that screenshot, the /boot partitions for both rawhide and the normal OS (/dev/sda2 and /dev/sda10) are ext4. The error triggered by Btrfs will not show with that arrangement.

Attempting to resume from a physical swap that was last used by a different OS can always be expected to show errors. Fedora will automatically use any (and all) available swap, so to prevent that you would need swap to be defined in /etc/fstab so that it only uses what you designated. Before Fedora introduced zram, I think swap was always configured in /etc/fstab.
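
One way to see which device the resume= argument refers to is to pull the UUID out of the kernel command line and compare it with the swap areas the system knows about. This is a minimal sketch: the helper name resume_uuid is hypothetical, and it only handles the resume=UUID=... form used in the configuration posted earlier in this thread.

```sh
#!/bin/sh
# Print the UUID from a resume=UUID=... argument in a kernel command line,
# or print nothing if no such argument is present.
# ASSUMPTION: only the resume=UUID=<uuid> form is handled.
resume_uuid() {
    for arg in $1; do
        case "$arg" in
            resume=UUID=*) echo "${arg#resume=UUID=}" ;;
        esac
    done
}

# On a real system, compare the result with the known swap devices:
#   resume_uuid "$(cat /proc/cmdline)"
#   lsblk -o NAME,FSTYPE,UUID
#   swapon --show
```

If the UUID belongs to a swap partition that another installed OS also uses, that would match the shared-swap situation described above.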

I have three Fedora installations and one Windows installation on my laptop. The newest one is on /dev/sda12, using Btrfs as the filesystem type. In the screenshot we can see that /dev/sda12 has 3 subvolumes, mounted at /home, /boot, and /.

Update:

Another screenshot.