Silverblue installation adventures (mostly Anaconda bugs I guess?), and questions about data loss


#1

Installation fun

Ok, so recently I went on the adventure of installing Silverblue. And it was…an adventure, I guess! I’m putting this here partly to ask for help and partly to warn people of everything not to do as they try out Silverblue.

(Side note: kudos to whoever runs the @teamsilverblue Twitter account!)

So I made a stupid mistake when I started, disregarded the suggestions to use automatic partitioning, and went with a manual partition layout. Yeah, it was a terrible idea, I know.

First issue:

I noticed an assertion failure in the logs. The problem lies here, where OSTree seems to make the assumption that, if a grub.cfg is present, GRUB must be booting the OSTree system at the moment. Normally this would make sense, but I had an old grub.cfg lying around from when I tried Kubuntu several months ago. Since there are no OSTree deployments in the file, the assertion fails. Solution: remove grub.cfg.

Second problem:

The issue here was that, for some reason, the target mount point didn’t exist. The “solution” was to change over to the root shell and create a shell script (stored on an unrelated partition) that basically did this:

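# Quoted so the apostrophe in the placeholder doesn't break the assignment
dir="/mnt/sysimage/ostree/deploy/fedora-workstation/deploy/hash-that-I'm-too-lazy-to-retype-goes-here/boot/efi"
# Keep retrying until the installer has created the parent directories
while [ ! -d "$dir" ]; do mkdir "$dir" || sleep 0.1; done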

(Of course this also throws a ton of errors until the parent directories are created, but I wasn’t particularly concerned with trying to make this robust…)
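For anyone who ends up needing the same workaround, here is a slightly quieter sketch of the same idea: poll until the parent directory shows up, then create the mount point once. The function name wait_and_mkdir is made up for illustration, and the fractional sleep assumes GNU coreutils, same as the loop above.

```shell
#!/bin/sh
# wait_and_mkdir TARGET: hypothetical helper, not part of Anaconda or OSTree.
# Waits for TARGET's parent directory to exist, then creates TARGET.
wait_and_mkdir() {
    target=$1
    parent=$(dirname "$target")
    # Poll quietly instead of letting mkdir fail over and over.
    while [ ! -d "$parent" ]; do
        sleep 0.1
    done
    # If something else created TARGET in the meantime, that still counts.
    mkdir "$target" 2>/dev/null || [ -d "$target" ]
}

# Usage (path elided exactly as in the post):
# wait_and_mkdir "/mnt/sysimage/ostree/deploy/fedora-workstation/deploy/<hash>/boot/efi"
```

It is still a race against the installer, so this is no more of a real fix than the original loop.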

Now came the issue that finally defeated me:

Because OSTree couldn’t find my deployment:

I never actually solved this. From my limited understanding based on looking through the OSTree source on my phone (more painful than you’d think), this is what seemed to be happening:

  • The ostree admin deploy command that Anaconda ran didn’t add the deployment to grub.cfg because the file didn’t exist.
  • grub2-mkconfig created a grub.cfg that didn’t have any deployments in it, because there was no original file with any deployments marked.
  • Now, when ostree admin deploy ran inside the chroot, it tried to read the deployment list from the now-present grub.cfg…but there were none.

At some step here I figure the deployment was supposed to be present, so I’m not sure what exactly happened.

While trying to debug this, I also noticed that GRUB was getting installed into /boot/efi/EFI/EFI/fedora. Apparently, when Anaconda runs cp -r, it ends up doing something like cp -r /somewhere/EFI /boot/efi/EFI. Since /boot/efi/EFI already existed, however, it ended up nesting the directories. I manually copied the fedora directory out.
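The nesting is easy to reproduce outside the installer. The sketch below uses a throwaway temp directory with made-up src and esp folders standing in for the real paths; it only shows the general cp -r behaviour, not the exact command Anaconda runs.

```shell
#!/bin/sh
# Demonstrates how `cp -r SRC DEST` nests SRC inside DEST once DEST exists.
demo=$(mktemp -d)
mkdir -p "$demo/src/EFI/fedora" "$demo/esp"

# First copy: esp/EFI does not exist yet, so src/EFI is copied *as* esp/EFI.
cp -r "$demo/src/EFI" "$demo/esp/EFI"

# Second copy: esp/EFI exists now, so src/EFI is copied *into* it,
# producing esp/EFI/EFI/fedora.
cp -r "$demo/src/EFI" "$demo/esp/EFI"

# List the resulting directory tree.
find "$demo/esp" -type d | sort
```

With GNU cp, passing -T (--no-target-directory) or copying "$demo/src/EFI/." instead of "$demo/src/EFI" avoids the extra level.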

Home partition…not-so-fun

Now, this is where things kinda sucked. Some background:

When I first got this computer, I had never used a UEFI system before, so I didn’t realize it needed an EFI system partition. In addition, I made the terrible mistake of installing Linux before Windows. The result was that my partition order is a bit…odd:

Hard disk #1:

  • First partition was Arch Linux.
  • Second partition is the swap (that I didn’t actually use).
  • Third partition was the EFI system partition.
  • Fourth and fifth partitions were from Windows.
  • Sixth, seventh, and eighth are all from my Hackintosh setup.

Hard disk #2:

  • First and only partition is the home partition.

When I was installing Silverblue, I wanted to try and preserve this. When I picked the manual setup, I set it up so that:

  • /dev/sda1 was re-formatted and mounted as /.
  • /dev/sda3 was mounted as /boot/efi.
  • /dev/sdb1 was mounted as /var/home.

This is where I screwed up. I had taken the /var/home mount point from this blog post. I’m not sure whether that was the issue, or whether it was when I tried to run the ostree admin deploy, ostree admin instutil, and ostree admin grub2-generate commands by hand. Either way, I still tried to boot the system afterwards, and (as you might imagine) it never started.

At that point, I decided to just have Silverblue automatically partition itself on my second drive. I was going to resize the home partition there, but first I decided to mount it (bad idea) to check and see how much space I had.

That’s when I realized it was empty.

I know at a previous mount I had checked it, and it had all its contents, so I’m not sure what exactly went wrong.

Having damaged partition tables and data several times in the past (you’d think I’d have learned by now), I knew I had to immediately unmount it and not touch it. I just deleted my previous Arch partition and let Silverblue set up everything automatically. Right off the bat, I noticed two mistakes I had made:

  • I needed a separate /boot partition.
  • When Anaconda created the home partition, it mounted it as /home, not /var/home. I’m not sure if this is where the issue stemmed from.

This time, the installation worked perfectly, thereby proving that the last 2 days of my life + the 5 years of data could’ve been saved if I had listened to that little piece of advice: use automatic partitioning. I do have a suggestion: make this an actual warning that you’ll spend insane amounts of time if you think you know more than OSTree does.

Now comes the question part of this post: what exactly happened to my home partition? Right now I’m taking ext4magic for a spin (after extundelete failed miserably), but I’m still not sure if this was a formatting issue, or if at some point everything just got deleted (maybe when I tried booting the broken install, or during one of the failed installation attempts)… but it’d probably help out my recovery efforts if someone had a rough idea of what happened.
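One general precaution that fits the "don't touch it" rule: copy the unmounted partition into an image file first and point the recovery tools at the copy. A rough sketch follows, assuming GNU dd and enough free space on a different disk. The function name image_partition and the destination path in the usage comment are made up for illustration.

```shell
#!/bin/sh
# image_partition SRC DEST: hypothetical helper.
# Copies SRC byte-for-byte into DEST, then verifies the copy.
# SRC would be the unmounted partition; DEST an image file on another disk.
image_partition() {
    src=$1
    dest=$2
    # Refuse to overwrite an existing image.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    dd if="$src" of="$dest" bs=4M || return 1
    # cmp exits non-zero if the image differs from the source.
    cmp "$src" "$dest"
}

# Usage (run as root; the destination path is an assumption):
# image_partition /dev/sdb1 /mnt/other-disk/home.img
```

That way a recovery attempt that goes wrong only damages the copy, and the original partition stays exactly as it was.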

Results

It works pretty much perfectly I guess. For Chrome, I took Endless OS’s Chrome launcher, realized I had no clue what it was doing, and patched it in random places so that it would run inside the Flatpak (starting Chrome via flatpak-spawn --host).
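The flatpak-spawn --host part boils down to something like the sketch below. This is not the Endless OS launcher, just the bare idea. The function name run_on_host is made up, and google-chrome is an assumed name for the Chrome binary on the host. The Flatpak also needs permission to talk to org.freedesktop.Flatpak on the session bus for --host to work.

```shell
#!/bin/sh
# run_on_host CMD [ARGS...]: hypothetical wrapper.
# Runs CMD on the host system instead of inside the Flatpak sandbox.
run_on_host() {
    flatpak-spawn --host "$@"
}

# Launcher body (the binary name is an assumption):
# run_on_host google-chrome "$@"
```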

Only bad part I guess would be that I can no longer go to random /r/linux posts and be like I USE ARCH. (I use Silverblue doesn’t quite roll off the tongue as well, and I fear someone’s going to ask what Pokemon has to do with Linux. ¯\_(ツ)_/¯)

To summarize:

  • Maybe some of the installation issues are worth looking into?
  • It’d be great if anyone knew where my files went!!! :smiley:

Thanks in advance!


#2

Great post - thank you! I was writing on the Twitter account by the way, really appreciate that you took the time to sum it all up here. With bugs it’s not like with Pokemon, though - best not to catch 'em all. :sweat_smile:

Regarding the home partition, I think you might have just overwritten the pointers to it with the manual setup - I hope some of the recovery tools you’re using can help. Maybe someone else who reads this would know if it’s possible to recover somehow?

I’ll try and set it up with manual partitioning as well to see how it goes - usually I do manual partitioning, too, because I am never happy with automatic setup, but for my Silverblue installation I just did it the standard way the first time.


#3

Well in this case it was less like catching bugs and more like getting bitten by them… :sweat_smile:

you might have just overwritten the pointers to it with the manual setup

What exactly do you mean by “pointers” here? The file system’s offset on the disk?


#4

I’ve had to try it once because I rm -rf’d my system. Yup, it happens. Cough. That was a decade ago and I can’t remember which tools I used to recover - unfortunately it only worked on some of the data. Hope you get all of it back.

What I meant with pointers: https://superuser.com/a/1107109


#5

There are live CDs used for forensics, and they can recover data quite well. There is also http://www.forensicswiki.org/wiki/Main_Page

I would give TestDisk / PhotoRec a try; they’re rather mature tools and work quite well.


#6

I tried TestDisk and PhotoRec… It found some stuff but it’s also still missing a lot of it. I might end up trying something like R-Studio next. Thanks for the link though!