BTRFS x Kinoite

I recently went back to Kinoite and must say I am pretty confused by BTRFS.

It creates three subvolumes out of the box at level 5 - var, home, and root.

I created another – snapshots – which I thought would be useful for setting up automated snapshots.

But somewhere, I must have made a terrible mistake, because even though snapshots worked originally with my mini-script, the file paths are no longer being recognised now. I cannot delete the root snapshots either, which appear to be manipulating /sysroot (mystery to me how I was able to create the snapshot but can now not remove it, since I thought both creation and deletion of snapshot would have to interfere with metadata on that mountpoint).

Deleting snapshots by subvolid works for home and var, but not for root.

I assume it’s heavily discouraged/impossible to mount root as rw instead of ro?

Is there a knack to doing this with an immutable distro like Kinoite/Silverblue?

Absolute path (in terms of on-disk format) is best represented by:

# btrfs sub list -ta $MNT

It might be helpful to post the script to better understand where the confusion is coming from.

The subvolumes var, root, and home are located at the top level of the file system. That term refers to the hidden, unnamed, undeletable subvolume created at mkfs time, which is the default default (sic) subvolume mounted when you do not specify a subvolume with the subvol/subvolid mount option. The current default subvolume can be determined with:

# sudo btrfs subvolume get-default /
ID 5 (FS_TREE)

Ergo, the subvolid is 5. Every subvolume (and subvolume snapshot) has a unique ID.

The btrfs layout used in Fedora is referred to as “flat” since they’re all located in subvolid 5 together, and then assembled at the appropriate FHS path using a mount option subvol/subvolid as found in /etc/fstab. These are bind mounts. In effect all Btrfs mounts are bind mounts, which then explains why we’re mounting the same file system multiple times in different locations.

It is allowed to create subvolumes in other subvolumes, this is the “nested” layout.

rpm-ostree uses the VFS mount option ro to force immutability on / but Btrfs read-only snapshots implement it at the file system level and not even root can make changes to a read-only subvolume or snapshot. Bind mounts can be ro or rw.

My best recollection is rpm-ostree mounts root subvolume to /sysroot and then does a bind mount of /sysroot to / making it read-only? (Eeek! I forget!) Anyway posting the results from mount|grep btrfs will help me refresh memory and also show the absolute path and subvolid for your mounted subvolumes further clarifying the layout.

A key thing to keep in mind is Btrfs is very permissive. It doesn’t really care about which layout you choose, or how you organize things, but it does matter in terms of policy and workflow (and avoiding confusion).

Thank you very much for taking the time to write this up, Chris. This is very useful context and I shall mull over this some more. The ‘sub list’ command is also very useful.

@chrismurphy

So my script was this:

#! /bin/bash
sudo btrfs subvolume snapshot /mnt/btrfs-root/root /mnt/btrfs-root/snapshots/root-$(date +%Y%m%d%H%M%S)
sudo btrfs subvolume snapshot /mnt/btrfs-root/home /mnt/btrfs-root/snapshots/home-$(date +%Y%m%d%H%M%S)
sudo btrfs subvolume snapshot /mnt/btrfs-root/var /mnt/btrfs-root/snapshots/var-$(date +%Y%m%d%H%M%S)

However, I thought I had to mount the subvols elsewhere like /mnt/btrfs-root and was unsure of how that affected both the folder structure in the BTRFS system as well as on the regular system. I have now undone this.

I have also been told that doing root snapshots may have been redundant anyway because of rpm-ostree – although I think the two cover different use cases.

I am still not entirely sure how mounting the BTRFS subvol elsewhere is a way around this though – because wouldn’t this affect the way you address everything else? Or if you mounted it in both places wouldn’t that generate conflicts or even defeat the purpose of root being immutable? Hope that makes sense.

I guess conceptually/intuitively in my head, I thought of it as there being two file systems – the ‘real’ file system, which is the folders and stuff that I see when I rifle through my Linux folder system, and BTRFS, a file system that starts at the root and has these hierarchical subvolumes that I now believe I can even nest, although I think I am discouraged from doing so.

I guess the initial confusion came from the fact that /var, /home and /root were at the same level, whereas I thought root (or /) was supposed to be the very top level of all of them. I am still trying to come to terms with understanding the file system is usually mounted at ‘/’ and that is where the FS_TREE starts and the fact there is also root as a subvol. I guess in mounting the subvol root elsewhere, I can kind of decouple where it is in the filesystem and where it is in BTRFS?

Oh, and here’s my ‘mount | grep btrfs’:

/dev/mapper/luks-01d74a86-5fbe-4082-b0cc-82302ba5b934 on /sysroot type btrfs (ro,relatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=258,subvol=/root)
/dev/mapper/luks-01d74a86-5fbe-4082-b0cc-82302ba5b934 on / type btrfs (rw,relatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=258,subvol=/root)
/dev/mapper/luks-01d74a86-5fbe-4082-b0cc-82302ba5b934 on /etc type btrfs (rw,relatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=258,subvol=/root)
/dev/mapper/luks-01d74a86-5fbe-4082-b0cc-82302ba5b934 on /usr type btrfs (ro,relatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=258,subvol=/root)
/dev/mapper/luks-01d74a86-5fbe-4082-b0cc-82302ba5b934 on /sysroot/ostree/deploy/fedora/var type btrfs (rw,relatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=258,subvol=/root)
/dev/mapper/luks-01d74a86-5fbe-4082-b0cc-82302ba5b934 on /var type btrfs (rw,relatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=256,subvol=/var)
/dev/mapper/luks-01d74a86-5fbe-4082-b0cc-82302ba5b934 on /var/home type btrfs (rw,relatime,seclabel,compress=zstd:1,ssd,space_cache=v2,subvolid=257,subvol=/home)

The absolute paths (on-disk) are discovered with btrfs subvolume list -ta, and it’s these absolute paths that are used for the subvol= mount option.

The relative paths (FHS) are discovered with commands like tree and ls, and navigated with cd.

The mount command shows the mapping between them, and also controls the mapping by use of either the subvolid/subvol mount options, or btrfs subvolume set-default (which we don’t use in Fedora by default, therefore it’s subvolid=5, subvol=/).
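To make that mapping easier to read, here is a minimal sketch that boils mount output down to one line per Btrfs mount. The function name btrfs_mount_map is made up for illustration, and it assumes the usual "DEVICE on MOUNTPOINT type btrfs (OPTIONS)" line format with no spaces in the mount point.

```shell
#!/bin/bash
# Hypothetical helper (not part of btrfs-progs): reads `mount`-style lines on
# stdin and prints "MOUNTPOINT SUBVOLID SUBVOL MODE" for each Btrfs mount.
# Assumes mount points contain no spaces.
btrfs_mount_map() {
    awk '$5 == "btrfs" {
        opts = $6
        gsub(/[()]/, "", opts)            # strip the surrounding parentheses
        n = split(opts, o, ",")
        id = ""; path = ""; mode = ""
        for (i = 1; i <= n; i++) {
            if (o[i] == "ro" || o[i] == "rw") mode = o[i]
            else if (o[i] ~ /^subvolid=/)     id = substr(o[i], 10)
            else if (o[i] ~ /^subvol=/)       path = substr(o[i], 8)
        }
        print $3, id, path, mode
    }'
}

# Usage on a live system:
#   mount | btrfs_mount_map
```

Fed the listing posted earlier in the thread, this would show /sysroot, /, /etc and /usr all pointing at subvolid 258 (subvol=/root) with different ro/rw flags, which is the "same subvolume, several mount points" picture.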


the file system is usually mounted at ‘/’ and that is where the FS_TREE starts and the fact there is also root as a subvol.

Again, context matters. In the visible search path (FHS) / is a directory and a mount point. Per fstab the subvolume named “root” is being (bind) mounted to /.

In the context of Btrfs, whether mounted or not, the subvolume / is the “top level” of the file system, and is represented with btrfs subvolume list -a as <FS_TREE> for esoteric reasons only a Btrfs developer would know.

I guess in mounting the subvol root elsewhere, I can kind of decouple where it is in the filesystem and where it is in BTRFS?

In the relative path case (find, cd, ls, tree) these things can be put anywhere and multiple times because they’re just bind mounts.

There is only one absolute path to a subvolume. If we go further, the absolute path is just a familiar proxy for the on-disk format, how things are encoded on disk is entirely different. The absolute path syntax is borrowing from the familiar, and hence the confusion.

Subvolumes do mostly behave like directories; they even show up with the d symbol in ls -l.

when I rifle through my Linux folder system, and BTRFS, a file system that starts at the root and has these hierarchical subvolumes that I now believe I can even nest, although I think I am discouraged from doing so.

You can nest them. The pros and cons of nesting are not immediately obvious; it all comes down to workflow.

Nested subvolumes don’t need an fstab entry to put them in the correct location, you just use a subvolume instead of a directory. But what if you snapshot a subvolume that contains a subvolume? There are no recursive snapshots in the user space tools so that limitation can be a good thing, e.g. I make ~/.cache a subvolume in order to exclude cache content from my home subvolume snapshots.
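For what it’s worth, here is a rough sketch of a snapshot run against a flat layout. Everything named here is an assumption for illustration: the run and snapshot_subvols helpers, the device and mount point arguments, and a pre-existing subvolume called snapshots at the top level. It mounts the top level (subvolid=5) temporarily, takes read-only snapshots with a single shared timestamp, and unmounts again. With DRY_RUN=1 it only prints the commands.

```shell
#!/bin/bash
# Sketch only: helper names, DEVICE/TOP arguments, and the "snapshots"
# subvolume are assumptions. Needs root when not in dry-run mode.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_subvols DEVICE TOP NAME...
snapshot_subvols() {
    local device=$1 top=$2 stamp name
    shift 2
    # One timestamp per run, so related snapshots share the same suffix.
    stamp=${STAMP:-$(date +%Y%m%d%H%M%S)}
    run mkdir -p "$top" || return
    # Mount the top level so every subvolume is reachable by name.
    run mount -o subvolid=5 "$device" "$top" || return
    for name in "$@"; do
        # -r makes the snapshot read-only at the file system level.
        run btrfs subvolume snapshot -r "$top/$name" "$top/snapshots/$name-$stamp"
    done
    run umount "$top"
}

# Example (prints the commands without running them):
#   DRY_RUN=1 snapshot_subvols /dev/mapper/luks-... /mnt/btrfs-top home var
```

Note that a nested subvolume such as ~/.cache would show up in the home snapshot only as an empty directory, which is the exclusion effect described above.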

I have also been told that doing root snapshots may have been redundant anyway because of rpm-ostree – although I think the two cover different use cases.

They are redundant, but neither is really aware of the other. The conflict arises when doing a rollback with Btrfs snapshots. Again, how to do that is not immediately obvious, because Btrfs has no snapshot or rollback policies at all. It provides only the functionality; workflow and policy are up to other utilities, and every utility handles this differently, so assembly differs too. Don’t forget that the bootloader carries hints for how to assemble the system: some of the assembly depends on the boot parameter root=UUID=, and for Btrfs there’s rootflags=subvol=$NAME, which is the same as the mount -o option (on Fedora the subvolume name is typically root). But then there is also an ostree deployment hash, which implies some centralized coordination to make sure the bootloader has the proper hint for mounting the proper ostree deployment.
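To see which hints the bootloader actually passed, something like the following sketch could help. The boot_hints name is made up, and the ostree= parameter is my assumption about how the deployment is named on the kernel command line; root= and rootflags= are the parameters mentioned above.

```shell
#!/bin/bash
# Hypothetical helper: reads a kernel command line on stdin and prints only
# the parameters that steer root file system assembly, one per line.
# The ostree= parameter name is an assumption about ostree-based systems.
boot_hints() {
    tr ' ' '\n' | grep -E '^(root|rootflags|ostree)='
}

# Usage on a live system:
#   boot_hints < /proc/cmdline
```

If rootflags=subvol=root shows up next to an ostree= entry, that is the coordination problem in a nutshell: a Btrfs rollback would have to keep both hints consistent.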

I am still not entirely sure how mounting the BTRFS subvol elsewhere is a way around this though – because wouldn’t this affect the way you address everything else?

I need an example. Just remember that subvolumes mounted elsewhere are just “mirrors”, by which I mean they’re multiple instances of the same thing. Change one and the other immediately changes.

Or if you mounted it in both places wouldn’t that generate conflicts or even defeat the purpose of root being immutable?

It could impact immutability. Bind mounts are managed at the VFS level, same as ro and rw. And each bind mount can have separate ro or rw flags. So it’s only as immutable as the root user (or user in wheel group) is secure.

Therefore I’d use rpm-ostree tools for making any changes to the system, which permits rpm-ostree to track those changes.

If you were to have your home subvolume mounted in different locations, making changes in one immediately makes changes in the other - they aren’t different, they’re the same subvolume mounted in two locations.

See also the Fedora Btrfs Matrix channel for interactive conversations about Btrfs in Fedora. (I’m cmurf over there) - Some issues are better discussed interactively.

Wow, what a read. You are definitely light years ahead with this stuff. Thanks for the Matrix invite – I will join now!