Dnf + BTRFS Snapshots: what stands in the way?

ngompa · October 15, 2024, 4:10pm

There is also another issue: GRUB is not configured in Fedora to use the bootloader slack space in the Btrfs volume for grubenv by default. I’m not actually sure how to get it to use that space for GRUB variables (like the autohide menu thing), but once we can do that, the /boot volume on ext4 can be eliminated in favor of a Btrfs subvolume.

emanuc · October 15, 2024, 5:21pm

For your information, the openSUSE community has created a tool to support Btrfs snapshots. Maybe one day we could use it on Fedora.

chrismurphy · October 15, 2024, 8:03pm

We are in some ways a victim of our own success. If updates or upgrades were more risky, we’d be in a better position to mitigate that risk with snapshots and rollbacks.

A most imperfect metaphor for implementing any snapshot+rollback regime: we’re going to need to dance in a closet full of open tubes of toothpaste with a mandate to not make a mess.

As is often the case: “what problem are you trying to solve?” Are we presuming a solution, and then trying to find problems it fixes?

There could be 80 valid snapshot and rollback designs, each with tradeoffs. Any design will risk scope creep. Such designs differ depending on whether the emphasis is on user data or on system uptime/recovery from a bad update.

We need to accept some iterations. But iterations in a released edition of Fedora bind us to supporting the ensuing layout for a good deal of time. We have no official policy that I’m aware of on how many years we would(n’t) support a layout. We test Fedora n-1 and n-2, so perhaps unofficially it’s 2 releases. But were we to actually tell people that layout A is no longer supported and they have to clean install - that’d be a first and it’d be unwelcome.

Most any design related to system rollbacks touches bootloader stuff, which is hilarious madness.

All of this vaguely suggests we start first with a spin. Or perhaps this becomes the default behavior of Fedora Rawhide when on Btrfs, and such a feature is disabled (initially) with released editions. Or possibly even focus on protecting/preserving user data rather than the system. We can reinstall the system.

emanuc · October 15, 2024, 8:24pm

Perfect timing! I took advantage of the snapshot feature due to a regression in kernel 6.11. Yes, I could have selected a different kernel from GRUB, but I quickly resolved the issue by rolling back to a snapshot from a few hours earlier and then updating with dnf while excluding the kernel update:

sudo dnf --refresh distro-sync --exclude=kernel\*

mjg · October 16, 2024, 8:39am

Yes, but this would have been solved by choosing a different kernel in grub to get a working system, plus - possibly - dnf history undo, which kinda proves @chrismurphy’s point.
It also shows that you had to redo unrelated updates.

Really, snapshots show their strength when the system cannot be booted otherwise, or there are more changes than just package updates (e.g. config changes left by an update or unrelated). In that case they may even “work better” than rebasing/pinning an atomic distro because a snapshot typically includes /etc.

lumiere · October 16, 2024, 9:23am

I may not be a very experienced Linux user, but I’ve used sysguides’ articles and videos for my installation. As far as I can tell, there is nothing suspicious about them and they are incredibly useful. I managed to get full disk encryption (including boot) and snapper to work, basically what I would get from Tumbleweed. He is also very responsive in the comments on his site when it comes to troubleshooting.

Snapshots have been working well, as far as I’ve tested.

To be specific, I used the Fedora 39 video and accompanying article: https://www.youtube.com/watch?v=JvfCieWkXxI

jaybe · October 16, 2024, 9:40am

Snapshots that work are just easy to understand. You can roll back to a previous “known good” if needed or wanted. That gives peace of mind.

It is not only for failed updates. If you want to experiment with a new package or system setup that includes configuration and/or uninstalling packages it replaces, it can get quite complex to “roll back” using other tools. Before starting an experiment like that just do a manual snapshot, and you can safely move on. If the experiment is not successful, roll back to the manual snapshot and all is good.
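
The manual-snapshot step described above could look roughly like this with snapper. This is only a sketch: it assumes snapper is installed and that a config named `root` exists for `/`, which is not a Fedora default. The `take_snapshot` helper and the `DRY_RUN` switch are made up for illustration.

```shell
#!/bin/sh
# Sketch: take a manual snapshot before an experiment.
# Assumption: snapper is installed and has a config called "root".
# take_snapshot and DRY_RUN are illustrative names, not part of snapper.

# With DRY_RUN=1 (the default here) only print the command;
# with DRY_RUN=0 actually create the snapshot (needs root).
take_snapshot() {
    # $1: free-text description stored with the snapshot
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'snapper -c root create --description "%s"\n' "$1"
    else
        sudo snapper -c root create --description "$1"
    fi
}
```

Calling `DRY_RUN=0 take_snapshot "before experiment"` would create the snapshot, and `snapper -c root list` would then show its number. How to roll back to it afterwards depends on the subvolume layout in use.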

It is a very good idea to treat home directories as a separate, optional target, so you can include them if you have used an application that heavily modifies ~/.config or similar, and exclude them for system-only rollbacks.

While I have indeed used snapper to roll back failed updates on Tumbleweed (and moved to Fedora after needing that a bit too often), on Debian I almost exclusively used timeshift for safely experimenting with my servers.

I find timeshift simpler/easier than snapper, but timeshift is (to my knowledge) limited to a very specific BTRFS layout (@ and @home).

/Jaybe

glb · October 17, 2024, 6:30pm

Just to provide a counterexample, I just had a user experience a failure after a package update that they were not able to resolve with dnf history undo. Maybe this is an example where automated Btrfs snapshots could have saved the day (and made the troubleshooting much easier): Wifi disapered after installing updates

boredsquirrel · October 17, 2024, 7:20pm

So to brainstorm about implementations.

Currently Fedora adds grub entries per kernel version.

This mechanism would need to be turned off, so that there is no duplicate mechanism.

But having an older kernel for sure would be good.

Fedora Atomic Desktops don’t automatically keep an image with an older kernel; a single update with the same kernel version will remove it.

This is suboptimal I think.

Any ideas on how to tie kernel differences and keeping snapshots together?

glb · October 17, 2024, 7:31pm

I’d use the last installed kernel version as the version number for the snapshot and store it as part of the snapshot name (e.g. Fedora-Linux-6.10.11-200.fc40.x86_64). Everything you need to regenerate the boot entry on the ESP should already be in the snapshot (the kernel is stored at /usr/lib/modules/<kernel-version>/vmlinuz and the initramfs can be regenerated via something like chroot <path-to-snapshot> /usr/bin/dracut <path-to-esp> <kernel-version> if need be). I think it is just a matter of writing some scripts and a dracut module to store them in the initramfs by default so that they are available from dracut’s rescue environment. You could also set things up so the scripts could be triggered via a kernel parameter like btrfs.rollback for convenience.

Edit: Actually, the kernel-install script will take care of regenerating the initramfs and boot loader snippets, so you could probably just run chroot <path-to-snapshot> kernel-install ... if you need to. But you would only need to do that if the kernel for the snapshot that you are reverting to doesn’t already exist on the ESP. I expect that will typically not be the case and restoring things on the ESP will not be necessary.

Edit2: It looks like you’d have to add the chroot command to the initramfs for that to work. It is only 45K.
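
The naming idea above could be sketched like this. It is only an illustration under a few assumptions: the "last installed kernel" is approximated by the highest version that has a vmlinuz under /usr/lib/modules, `sort -V` (GNU coreutils) is available, and the function names `newest_kernel` and `snapshot_name` are made up here. The `Fedora-Linux-` prefix is taken from the example in the post.

```shell
#!/bin/sh
# Sketch: derive a snapshot name from the newest kernel in a root tree.
# Assumptions: "newest" means highest by version sort (not install time),
# and sort -V is available. Function names are illustrative.

# Print the highest kernel version that has a vmlinuz under
# <root>/usr/lib/modules; print nothing if none is found.
newest_kernel() {
    root="${1:-/}"
    for image in "$root"/usr/lib/modules/*/vmlinuz; do
        # Skip the unexpanded glob when no kernel is present.
        [ -e "$image" ] || continue
        basename "$(dirname "$image")"
    done | sort -V | tail -n 1
}

# Prefix a kernel version to form the snapshot name.
snapshot_name() {
    printf 'Fedora-Linux-%s\n' "$1"
}
```

Running `snapshot_name "$(newest_kernel /)"` on a system whose newest kernel is 6.10.11-200.fc40.x86_64 prints Fedora-Linux-6.10.11-200.fc40.x86_64. Passing the path of a mounted snapshot instead of `/` would show which kernel that snapshot carries, which is the information needed to decide whether anything has to be restored on the ESP.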

joesh-00 · April 15, 2025, 7:01am

I had some issues with non-working combinations of kernel versions and nvidia driver versions, and while downgrading the kernel is possible via grub, dnf etc., downgrading nvidia drivers is not. This caused some major pain trying to get things working again.
So at least for anyone using nvidia drivers, having the possibility to snapshot/restore would be a real boon.

rokejulianlockhart · April 26, 2025, 3:12pm

It’d be good if there were an issue tracking this on RHBZ, Pagure, GL, or GH. As it is, this is merely a discussion. However, if the BTRFS SIG are genuinely interested in enabling Snapshot support, there should be a central tracker actually furthering this effort.

emanuc · April 26, 2025, 3:54pm

The Fedora Btrfs SIG has made some proposals:

Issue #18: Support simple configuration of system snapshotting with full system rollback support - project - Pagure.io
Issue #12: Integrated backup and restore - project - Pagure.io
