My Silverblue OS no longer updates, probably due to an error with grub2-mkconfig (?)

Hello everyone!

I’m quite new here, and I’m not sure whether this is the right place to ask for help or report a bug. Just tell me if I need to post this somewhere else.

I realized my Silverblue host no longer updates. More specifically, the update itself runs, but I end up booting into the same “version” over and over again (the same happens when I try removing a package).

After updating (but before rebooting), I can see what the next “version” would be by inspecting “rpm-ostree status”, but when I reboot, there is no new grub entry, and I end up in the same system.

Here is some information:

$ sudo rpm-ostree status
[sudo] password for slt: 
State: idle
Warning: failed to finalize previous deployment
         error: Bootloader write config: grub2-mkconfig: El proceso hijo terminó con el código 1
         check `journalctl -b -1 -u ostree-finalize-staged.service`
AutomaticUpdates: disabled
Deployments:
● ostree://fedora:fedora/31/x86_64/silverblue
                   Version: 31.20191130.0 (2019-11-30T00:35:48Z)
                BaseCommit: 3acfe02779b60bf385950fac7b0bcf1de58b10bedd263e39dd2d70d6448903b5
              GPGSignature: Valid signature by 7D22D5867F2A4236474BF7B850CB390B3C3359C4
           LayeredPackages: VirtualBox aircrack-ng fedora-workstation-repositories fuse-exfat gnome-tweaks java-1.8.0-openjdk ltrace nmap p7zip-gui strace tcpdump tmux vim vlc
             LocalPackages: rpmfusion-free-release-31-1.noarch rpmfusion-nonfree-release-31-1.noarch

  ostree://fedora:fedora/31/x86_64/silverblue
                   Version: 31.20191123.0 (2019-11-23T00:39:51Z)
                BaseCommit: 2c0cb651d74ad99eaaeff4929dbeb707bb8fc66d5f655e1bb777b9762037cdab
              GPGSignature: Valid signature by 7D22D5867F2A4236474BF7B850CB390B3C3359C4
           LayeredPackages: VirtualBox aircrack-ng fedora-workstation-repositories fuse-exfat gnome-tweaks java-1.8.0-openjdk ltrace nmap p7zip-gui strace tcpdump tmux vim vlc
             LocalPackages: rpmfusion-free-release-31-1.noarch rpmfusion-nonfree-release-31-1.noarch

and the journal extract:

$ sudo journalctl -b -1 -u ostree-finalize-staged.service
[sudo] password for slt: 
-- Logs begin at Mon 2019-09-30 15:07:38 CEST, end at Fri 2019-12-06 16:43:39 CET. --
déc. 06 16:25:09 nsfw-localdomain systemd[1]: Started OSTree Finalize Staged Deployment.
déc. 06 16:29:23 nsfw-localdomain systemd[1]: Stopping OSTree Finalize Staged Deployment...
déc. 06 16:29:24 nsfw-localdomain ostree[17359]: Finalizing staged deployment
déc. 06 16:29:29 nsfw-localdomain ostree[17359]: Copying /etc changes: 11 modified, 0 removed, 49 added
déc. 06 16:29:35 nsfw-localdomain ostree[17359]: error: Bootloader write config: grub2-mkconfig: El proceso hijo terminó con el código 1
déc. 06 16:29:35 nsfw-localdomain systemd[1]: ostree-finalize-staged.service: Control process exited, code=exited, status=1/FAILURE
déc. 06 16:29:35 nsfw-localdomain systemd[1]: ostree-finalize-staged.service: Failed with result 'exit-code'.
déc. 06 16:29:35 nsfw-localdomain systemd[1]: Stopped OSTree Finalize Staged Deployment.
déc. 06 16:29:35 nsfw-localdomain systemd[1]: ostree-finalize-staged.service: Consumed 4.138s CPU time.
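
For anyone who wants to script this check, the warning shown in the rpm-ostree status output above can be detected with a plain grep. This is only a rough sketch: finalize_failed is a made-up helper name, and it assumes the warning text is worded exactly as in the output above.

```shell
# Sketch: succeed if the status text on stdin contains the
# "failed to finalize previous deployment" warning.
# finalize_failed is a hypothetical helper, not part of rpm-ostree.
finalize_failed() {
    grep -q 'failed to finalize previous deployment'
}

# Possible usage on the affected host (not executed here):
#   if rpm-ostree status | finalize_failed; then
#       sudo journalctl -b -1 -u ostree-finalize-staged.service
#   fi
```

If the helper succeeds, the journal of the previous boot is the next place to look, as the warning itself suggests.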

I used to know my way around Linux and all, but I have to admit it’s no longer the case!

Any help would be greatly appreciated!

What should I do?

Many thanks in advance,

samuel

This bug mentions a similar issue: https://github.com/coreos/rpm-ostree/issues/1598 but I don’t know what the resolution is exactly.

Can you try rpm-ostree upgrade, then sudo OSTREE_DEBUG_GRUB2=1 ostree admin finalize-staged, and post the output you get?

Thank you both for your comments.

@dustymabe: well, it does look similar indeed :face_with_raised_eyebrow: There are many grub2-mkconfig related bug reports at the moment, and this one had not caught my eye at first!

@jlebon: see below

  $ sudo OSTREE_DEBUG_GRUB2=1 ostree admin finalize-staged 
  [sudo] password for slt: 
  /usr/bin/grub2-editenv: error: invalid environment block.
  Generating grub configuration file ...
  /usr/bin/grub2-editenv: error: invalid environment block.
  error: Bootloader write config: grub2-mkconfig: Child process exited with code 1

I hope that’s helpful to you!

Hmm, looks like you might need to recreate your grubenv (e.g. grub2-editenv create). Not sure if there are steps you should take beforehand though. /cc @javierm
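
Before recreating anything, it may help to confirm that the grubenv file really is the problem. The sketch below assumes that a usable environment block is a non-empty file whose first line is "# GRUB Environment Block" (the header seen in grubenv files on Fedora). grubenv_looks_valid is a hypothetical helper name, and the check is a rough heuristic, not what grub2-editenv does internally.

```shell
# Sketch: succeed only if FILE looks like a GRUB environment block.
# Assumption: a valid block is non-empty and starts with the
# signature line "# GRUB Environment Block".
grubenv_looks_valid() {
    file=$1
    # -s is true only for an existing file with a size greater than zero.
    [ -s "$file" ] || return 1
    # The first line must be the signature.
    [ "$(head -n 1 "$file")" = "# GRUB Environment Block" ] || return 1
}

# Possible usage (not executed here):
#   sudo sh -c '. ./check.sh; grubenv_looks_valid /boot/grub2/grubenv' \
#       || echo "grubenv looks broken"
```

An empty or truncated file fails this check, which would be consistent with the "invalid environment block" error above.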

Hello @jlebon,

I was patiently waiting for @javierm’s comment, but how can I make sure it is safe to run that command directly?

Or should I just run it and see how it goes, since I also have the other deployment to boot into in case of a failure?

:face_with_raised_eyebrow:

Thanks!

@slt sorry, I was on holiday and didn’t see this thread earlier. Yes, @jlebon’s suggestion is correct. That’s the command to regenerate a grubenv file.

I wonder why the grubenv got borked though. Could you please back up and share that file before running grub2-editenv create, so we can understand what happened?
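
A minimal sketch of the "back up first, then recreate" order suggested here. backup_file is a hypothetical helper and the ".bak-<timestamp>" naming is just one possible choice. The grub2-editenv call is left as a comment, since it has to be run as root on the affected machine.

```shell
# Sketch: copy FILE to "FILE.bak-<timestamp>" and print the backup path.
# backup_file is a made-up helper name used only for illustration.
backup_file() {
    src=$1
    dest="$src.bak-$(date +%Y%m%d-%H%M%S)"
    # -p keeps mode and timestamps where the filesystem supports it.
    cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Possible usage as root (not executed here):
#   backup_file /boot/efi/EFI/fedora/grubenv
#   grub2-editenv create
```

Keeping the broken file around makes it possible to attach it to a bug report later.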

@javierm, thanks for your answer!

The file “/boot/grub2/grubenv” is empty:
$ sudo ls -l /boot/grub2/
total 4
lrwxrwxrwx. 1 root root 25 juin 17 21:11 grubenv -> ../efi/EFI/fedora/grubenv
drwxr-xr-x. 3 root root 4096 janv. 1 1970 themes
$ sudo ls -l /boot/efi/EFI/fedora
total 6412
-rwx------. 1 root root 110 janv. 1 1980 BOOTX64.CSV
drwx------. 2 root root 4096 janv. 1 1980 fonts
-rwx------. 1 root root 7963 déc. 1 14:58 grub.cfg
-rwx------. 1 root root 3573 déc. 9 20:19 grub.cfg.new.new
-rwx------. 1 root root 7963 nov. 24 21:18 grub.cfg.old
-rwx------. 1 root root 0 déc. 1 15:02 grubenv
-rwx------. 1 root root 1739592 janv. 1 1980 grubx64.efi
-rwx------. 1 root root 1159560 janv. 1 1980 mmx64.efi
-rwx------. 1 root root 1210776 janv. 1 1980 shim.efi
-rwx------. 1 root root 1210776 janv. 1 1980 shimx64.efi
-rwx------. 1 root root 1204496 janv. 1 1980 shimx64-fedora.efi

I made a tarball of /boot before running the command, just in case you want to look at it.

And… success! Recreating the grubenv worked and I’m now running a newly updated system!

Many thanks!

This happened to me too, but I forgot to save the files beforehand. I will report back if the next upgrade also “fails”.

Edit: It’s still broken for me. I did a “rpm-ostree install gnome-tweaks” and after a reboot it looks as if I had never done that.

For what it’s worth, I opened a bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1780857

It’s still broken for me. I did a “rpm-ostree install gnome-tweaks” and after a reboot it looks as if I had never done that.

So I did it again and the result is:

State: idle
AutomaticUpdates: disabled
Deployments:
  ostree://fedora:fedora/31/x86_64/silverblue
                   Version: 31.20191211.0 (2019-12-11T01:41:34Z)
                BaseCommit: 500e1a1bd0c3a5d1e98857852ad1fce56ab476a95923fb765ad3ce276d8af780
              GPGSignature: Valid signature by 7D22D5867F2A4236474BF7B850CB390B3C3359C4
                      Diff: 2 added
           LayeredPackages: gnome-tweaks

● ostree://fedora:fedora/31/x86_64/silverblue
                   Version: 31.20191211.0 (2019-12-11T01:41:34Z)
                    Commit: 500e1a1bd0c3a5d1e98857852ad1fce56ab476a95923fb765ad3ce276d8af780
              GPGSignature: Valid signature by 7D22D5867F2A4236474BF7B850CB390B3C3359C4

  ostree://fedora:fedora/31/x86_64/silverblue
                   Version: 31.20191130.0 (2019-11-30T00:35:48Z)
                BaseCommit: 3acfe02779b60bf385950fac7b0bcf1de58b10bedd263e39dd2d70d6448903b5
              GPGSignature: Valid signature by 7D22D5867F2A4236474BF7B850CB390B3C3359C4
           LayeredPackages: abcde picocom

But ostree-finalize-staged.service did NOTHING:

-- Reboot --
Dez 11 17:50:32 shire systemd[1]: Started OSTree Finalize Staged Deployment.

@gierthi And what about the grubenv file? Is it empty for you too?

When I manually run “sudo ostree admin finalize-staged” after the upgrade, I get a new deployment to boot into. BUT now I always get the grub boot menu with the options, which should only show when a boot failed.

My grubenv says now:

# GRUB Environment Block
kernelopts=root=/dev/mapper/fedora-root ro resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.luks.uuid=luks-c72c07af-4b71-4554-af44-c15db2b40df9 rd.lvm.lv=fedora/swap rhgb quiet
boot_success=0
boot_indeterminate=0
###########################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################

After a “rpm-ostree install picocom” it says:

# GRUB Environment Block
kernelopts=root=/dev/mapper/fedora-root ro resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.luks.uuid=luks-c72c07af-4b71-4554-af44-c15db2b40df9 rd.lvm.lv=fedora/swap rhgb quiet
boot_success=1
boot_indeterminate=0

EDIT: Now it works! No idea why. I still get the verbose grub boot menu on start, but it actually added picocom without me doing anything manually.

EDIT2: Re-enabled the auto-hidden grub menu. Now everything seems normal.
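
The two grubenv listings above differ in boot_success (0 in the first, 1 in the second). As a small illustration, the sketch below pulls a single variable out of a grubenv-style text file. grubenv_get is a hypothetical helper name, and it assumes the key contains no characters that are special to sed. On a live system, "grub2-editenv list" prints the same variables.

```shell
# Sketch: print the value of variable KEY from a grubenv-style FILE.
# Only lines that start with "KEY=" are printed, with that prefix removed,
# so the header and the '#' padding line produce no output.
grubenv_get() {
    key=$1
    file=$2
    sed -n "s/^${key}=//p" "$file"
}

# Possible usage as root (not executed here):
#   grubenv_get boot_success /boot/grub2/grubenv
```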