Cannot upgrade Silverblue

I have been running Silverblue for ~3 months now and it has been a mostly flawless experience. However, I ran into my first major issue when trying to upgrade today:

[andythurman@rockhopper ~]$ rpm-ostree status 
State: idle
Deployments:
● ostree://fedora:fedora/33/x86_64/silverblue
                   Version: 33.20210204.0 (2021-02-04T01:41:19Z)
                BaseCommit: 39ac11c939acaa8cc7bb53635bd946b7c76ac74a805272ed1720b615b75dfcb4
              GPGSignature: Valid signature by 963A2BEB02009608FE67EA4249FD77499570FF31
       RemovedBasePackages: gnome-software gnome-software-rpm-ostree 3.38.0-2.fc33, firefox 85.0-8.fc33
           LayeredPackages: abrt-desktop fedora-workstation-repositories totem
[andythurman@rockhopper ~]$ rpm-ostree upgrade 
2 metadata, 0 content objects fetched; 788 B transferred in 1 seconds; 0 bytes content written
Checking out tree 635d8c6... done
Inactive base removals:
  gnome-software
  gnome-software-rpm-ostree
  firefox
Enabled rpm-md repositories: fedora-cisco-openh264 updates fedora updates-archive
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2020-08-25T19:10:34Z
rpm-md repo 'updates' (cached); generated: 2021-02-05T01:01:46Z
rpm-md repo 'fedora' (cached); generated: 2020-10-19T23:27:19Z
rpm-md repo 'updates-archive' (cached); generated: 2021-02-05T06:27:48Z
Importing rpm-md... done
Resolving dependencies... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
Running posttrans scripts... done
Writing rpmdb... done
error: Sanity-checking final rpmdb: Didn't find package 'gstreamer1-plugin-openh264-1.16.2-2.fc33.x86_64'
[andythurman@rockhopper ~]$ 

Here is the full upgrade info:

[andythurman@rockhopper ~]$ rpm-ostree upgrade --preview 
1 metadata, 0 content objects fetched; 592 B transferred in 2 seconds; 0 bytes content written
Enabled rpm-md repositories: fedora-cisco-openh264 updates fedora updates-archive
Updating metadata for 'fedora-cisco-openh264'... done
rpm-md repo 'fedora-cisco-openh264'; generated: 2020-08-25T19:10:34Z
Updating metadata for 'updates'... done
rpm-md repo 'updates'; generated: 2021-02-05T01:01:46Z
Updating metadata for 'fedora'... done
rpm-md repo 'fedora'; generated: 2020-10-19T23:27:19Z
Updating metadata for 'updates-archive'... done
rpm-md repo 'updates-archive'; generated: 2021-02-05T06:27:48Z
Importing rpm-md... done
AvailableUpdate:
        Version: 33.20210205.0 (2021-02-05T00:50:33Z)
         Commit: 635d8c6febd3192f136aff3a57df44cbb82416c1589107ef1ab068530216b7bf
   GPGSignature: 1 signature
                 Signature made Thu 04 Feb 2021 07:50:42 PM EST using RSA key ID 49FD77499570FF31
                 Good signature from "Fedora <fedora-33-primary@fedoraproject.org>"
  SecAdvisories: FEDORA-2021-879c756377  Unknown    kernel-5.10.12-200.fc33.x86_64
                 FEDORA-2021-879c756377  Unknown    kernel-core-5.10.12-200.fc33.x86_64
                 FEDORA-2021-879c756377  Unknown    kernel-devel-5.10.12-200.fc33.x86_64
                 FEDORA-2021-879c756377  Unknown    kernel-modules-5.10.12-200.fc33.x86_64
                 FEDORA-2021-879c756377  Unknown    kernel-modules-extra-5.10.12-200.fc33.x86_64
                   CVE-2021-3347 kernel: Use after free via PI futex state
                   https://bugzilla.redhat.com/show_bug.cgi?id=1922249
       Upgraded: kernel 5.10.11-200.fc33 -> 5.10.12-200.fc33
                 kernel-core 5.10.11-200.fc33 -> 5.10.12-200.fc33
                 kernel-devel 5.10.11-200.fc33 -> 5.10.12-200.fc33
                 kernel-modules 5.10.11-200.fc33 -> 5.10.12-200.fc33
                 kernel-modules-extra 5.10.11-200.fc33 -> 5.10.12-200.fc33
                 openssh 8.4p1-4.fc33 -> 8.4p1-5.fc33
                 openssh-clients 8.4p1-4.fc33 -> 8.4p1-5.fc33
                 openssh-server 8.4p1-4.fc33 -> 8.4p1-5.fc33
                 osinfo-db 20201218-1.fc33 -> 20210202-1.fc33
                 pcre2 10.36-1.fc33 -> 10.36-3.fc33
                 pcre2-syntax 10.36-1.fc33 -> 10.36-3.fc33
                 pcre2-utf16 10.36-1.fc33 -> 10.36-3.fc33
                 pcre2-utf32 10.36-1.fc33 -> 10.36-3.fc33
                 systemd 246.7-2.fc33 -> 246.10-1.fc33
                 systemd-libs 246.7-2.fc33 -> 246.10-1.fc33
                 systemd-networkd 246.7-2.fc33 -> 246.10-1.fc33
                 systemd-pam 246.7-2.fc33 -> 246.10-1.fc33
                 systemd-rpm-macros 246.7-2.fc33 -> 246.10-1.fc33
                 systemd-udev 246.7-2.fc33 -> 246.10-1.fc33

Here are the steps I have tried so far:

  1. rpm-ostree cleanup -bprm

Same issue occurs.

  2. rpm-ostree reset

After the reset I can upgrade and remove my usual base packages, but I cannot overlay packages; layering fails with the same or a similar error.

My only guess is that a mirror is down, but I am not sure how to troubleshoot from here and would appreciate any help.


Seems to be related to: https://bodhi.fedoraproject.org/updates/FEDORA-2021-937b45bf55

SB Bugtracker: Sanity-checking final rpmdb: Didn't find package 'annobin-9.49-1.fc33.x86_64' · Issue #124 · fedora-silverblue/issue-tracker · GitHub

Via @jlebon:

To anyone with layered packages sitting on a deployment with libsolv v0.7.17: once this errata is pushed to stable and we have a new compose with the new rpm-ostree, you’ll need to be running either from a deployment without libsolv v0.7.17 or one with libsolv v0.7.17 and this rpm-ostree.

Concretely, this means that you either need to rollback to an older deployment if you still have one, or you can just rpm-ostree usroverlay && rpm -Uvh rpm-ostree-{,libs}-...rpm . And then rpm-ostree upgrade (and make sure it shows this errata in the diff before rebooting into it).
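
For anyone unsure which case applies to them, here is a minimal sketch of a check for the libsolv release named above. The function name is_affected_libsolv is made up for illustration, and it treats only 0.7.17 as affected because that is the version this thread points at; it knows nothing about later, fixed builds.

```shell
# Hypothetical helper: succeed (exit 0) if the given version string is libsolv 0.7.17,
# with or without a release suffix such as "-1.fc33".
is_affected_libsolv() {
  case "$1" in
    0.7.17|0.7.17-*) return 0 ;;
    *) return 1 ;;
  esac
}

# On a real system the version could be read from the rpmdb of the booted deployment:
#   is_affected_libsolv "$(rpm -q --qf '%{VERSION}-%{RELEASE}' libsolv)" \
#     && echo "booted deployment has libsolv 0.7.17"
```

Taking the version as an argument keeps the check usable on any machine, including one where rpm is not installed.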

I finally figured out a temporary, hacky solution. libsolv is the culprit, and either it or rpm-ostree will probably need a fix upstream. Here is the full set of commands that got me to a working deployment:

cd Downloads
wget https://kojipkgs.fedoraproject.org//packages/rpm-ostree/2021.1/3.fc33/x86_64/rpm-ostree-2021.1-3.fc33.x86_64.rpm && wget https://kojipkgs.fedoraproject.org//packages/rpm-ostree/2021.1/3.fc33/x86_64/rpm-ostree-libs-2021.1-3.fc33.x86_64.rpm && wget https://kojipkgs.fedoraproject.org//packages/libsolv/0.7.15/1.fc33/x86_64/libsolv-0.7.15-1.fc33.x86_64.rpm
sudo rpm-ostree usroverlay && sudo rpm -Uvh rpm-ostree-{,libs-}*.rpm
systemctl restart rpm-ostreed.service
rpm-ostree override replace libsolv-0.7.15-1.fc33.x86_64.rpm

After a reboot, your system should be usable again.

Edit: This may actually be a kernel bug. I will try to follow upstream and post here if a fix becomes available.


https://bugzilla.redhat.com/show_bug.cgi?id=1925717

Does anyone know of a place to monitor for when we can update without getting these problematic deployments?


I have layered packages and was also using libsolv 0.7.15 before an update yesterday, which went well and resulted in libsolv 0.7.17 being installed. The issue you are experiencing appears to be potentially related to a kernel regression, which should see a fix imminently, since the comments I read mention that a solution already exists for Rawhide.

Yes, as I said, Rawhide caught this and a fix is coming. F33 was released only recently; not long ago it was Rawhide. But to answer your question, there is such a place: https://docs.fedoraproject.org/en-US/fedora/f33/release-notes/

But how are you doing with libsolv 0.7.17? It appears to be causing both the issues I was having, as downgrading fixed them.

I am using it right now. I noticed in my summary, after the update completed but before the reboot, that it had been replaced: version 0.7.15 > 0.7.17. Sorry, I didn’t take a screenshot. I think that if everyone were having this problem, this would be a much more heavily viewed topic. I would hazard a guess that it is somehow kernel related, but potentially specific to certain hardware. From reading the links you referenced above and the ones they lead to, I saw that it was related to a task not completing during the rpm-ostree rebuild stage (where your layers get added). Further comments by the devs mention the OOM killer, and combined with the release notes about memory leaks on some AMD chips in the F33 release, this again points to a kernel issue. Of course, I haven’t dug into it beyond that and am only offering an opinion at this point, not a solution.

I am on AMD right now. What a strange bug… Do you think just pinning 0.7.15 until a fix arrives is the best way forward?

Since you have overridden it now, it should stay that way through updates until you reset the override. So if everything is working for you now, then pinning would be redundant IMO. I would hazard a guess that this will be fixed ASAP.

The override is what I was referring to as “pinning”. Thanks for the help!

Sorry, I misunderstood. I thought you were referring to ostree admin pin [OPTION…] INDEX. You’re welcome for any and all help I may be able to provide.
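
To make the two meanings of "pinning" in this exchange concrete, here is a small sketch that only prints the relevant command for each case and runs nothing. pin_hint is a hypothetical name, and the deployment index 0 is just an example value; check your own deployment list before pinning.

```shell
# Hypothetical helper: print (do not run) the command behind each meaning of "pinning".
pin_hint() {
  case "$1" in
    # Undo the libsolv package override once a fixed build is available.
    override)   echo "rpm-ostree override reset libsolv" ;;
    # Keep a whole deployment from being pruned (0 is an example index).
    deployment) echo "sudo ostree admin pin 0" ;;
    *)          echo "usage: pin_hint override|deployment" >&2; return 2 ;;
  esac
}

pin_hint override
pin_hint deployment
```

The override applies to one package and follows you across upgrades, while ostree admin pin protects an entire deployment from cleanup, so the two are not interchangeable.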


Infra issue: Issue #9634: revert libsolv for rpm-ostree based systems - fedora-infrastructure - Pagure.io
Upstream tracker: libsolv-0.7.17-1.fc33 failure tracker · Issue #2548 · coreos/rpm-ostree · GitHub

Had the same issue as mentioned here. I was hesitant to try the suggested fix, because I would need to review commands I haven’t used before and understand whether they have any impact on the system. Luckily I had an idea that worked.

Another quick fix is to use:

$ rpm-ostree rollback

Since you can’t upgrade after you hit this bug, I’m guessing that most if not all users will roll back to a version that is not affected. After rebooting you can try upgrading again. In my case I didn’t see the offending package among the packages to be updated, so I’m guessing it has been pulled.
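
A rough sketch of how one might check that before upgrading, assuming the preview output keeps the "name old -> new" layout shown earlier in this thread. The function name preview_mentions is invented here; it reads the preview text on standard input and looks for an exact package name.

```shell
# Hypothetical helper: succeed if the preview text on stdin lists a change
# ("NAME OLD -> NEW") for exactly the package named in $1.
preview_mentions() {
  awk -v pkg="$1" '
    { for (i = 1; i <= NF - 2; i++) if ($i == pkg && $(i + 2) == "->") found = 1 }
    END { exit(found ? 0 : 1) }
  '
}

# On a real system:
#   rpm-ostree upgrade --preview | preview_mentions libsolv \
#     && echo "libsolv would change in this upgrade"
```

Matching whole fields rather than substrings means that asking for kernel does not match kernel-core.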

$ rpm-ostree rollback

I did that, reverting my Silverblue to 33.20210202.0 (2021-02-02T02:52:11Z) with libsolv-0.7.15-1.fc33 and rpm-ostree-2021.1-2.fc33, but rpm-ostreed was still crashing.
Then I tried usroverlay plus an rpm-ostreed restart with rpm-ostree-2021.1-3; still the same:

# rpm-ostree upgrade
2 metadata, 0 content objects fetched; 788 B transferred in 1 seconds; 0 bytes content written
Checking out tree 655b209... done
error: Bus owner changed, aborting. This likely means the daemon crashed; check logs with `journalctl -xe`.

In the journal:
rpm-ostreed.service: Main process exited, code=killed, status=11/SEGV
The interesting part of the stack trace is:
Stack trace of thread 45644:
#0 0x00007f8bf9237a7f _ostree_loose_path (libostree-1.so.1 + 0x2fa7f)
#1 0x00007f8bf9241f7f load_metadata_internal.isra.0 (libostree-1.so.1 + 0x39f7f)
#2 0x00007f8bf924a375 ostree_repo_load_commit (libostree-1.so.1 + 0x42375)
#3 0x000055d345ad00f4 rpmostree_pkgcache_find_pkg_header (rpm-ostree + 0x690f4)
#4 0x000055d345b13f93 rpmostree_sysroot_upgrader_prep_layering (rpm-ostree + 0xacf93)

Is this worth reporting on RHBZ, or is it an already known issue?
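
For others trying to tell this crash apart from the libsolv sanity-check error, here is a minimal sketch that looks for the SEGV line quoted above in journal text. daemon_segfaulted is a made-up name, and the pattern is copied from the journal line in this post, so it will miss crashes that are logged differently.

```shell
# Hypothetical helper: succeed if the journal text on stdin contains the
# rpm-ostreed segfault line quoted in this post.
daemon_segfaulted() {
  grep -q 'rpm-ostreed\.service: Main process exited, code=killed, status=11/SEGV'
}

# On a real system, limited to the current boot:
#   journalctl -u rpm-ostreed.service -b --no-pager | daemon_segfaulted \
#     && echo "rpm-ostreed crashed with SIGSEGV during this boot"
```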


I haven’t seen this issue before. I don’t see anything indicating that it is related to the libsolv issue, but I’m not knowledgeable enough to say for certain. Either way, I would open a bug on RHBZ or upstream.

Never mind, it is already filed as 1925584 – [abrt] rpm-ostree: _ostree_loose_path(): rpm-ostree killed by SIGSEGV


https://bodhi.fedoraproject.org/updates/FEDORA-2021-9091468793

WOOHOOOOOO!!! 🥳
