I want to create a partition / subvolume to hold my personal data

Hi.

I have an external 1TB drive with about 800GB of personal data such as documents, photos, lots of FLACs and movie files. The external drive is split into 10 different folders (FLACs, Movies etc). The drive is formatted with the NTFS file system (this is a legacy of using Windows in the past).

I have just bought a 1TB SSD for my laptop and installed Fedora 41 using default settings. I want to integrate the data on this 1TB drive into my system so I can access it easily.

In the past I would have created a partition but now I see people using subvolumes. I have looked at subvolumes but feel totally confused by them. Ideally I would like to mount this data somewhere e.g. /mnt and have full rw access to it as a folder in my home directory or something similar like that. Then possibly back it up to an external BTRFS formatted HD.

Can someone point me in the direction of how best to add this data to my laptop using subvolumes?

Thanks.

I am not sure I am following what you have done and what you are trying to achieve. Do you want the files to remain on the external drive and you simply want to make the data available at some mountpoint? Do you want to copy it to the internal SSD and are unsure where to put it?

In general, Btrfs subvolumes are separate namespaces inside a Btrfs partition, which means (among other things) that they can be mounted and snapshot individually (use the option subvol=[name] or subvolid=[id] to mount a subvol). At the same time, the subvolumes share the storage of the Btrfs partition. You can read more about them in the official documentation.

For example, on this laptop, I have a subvolume fedora and a subvolume home, which are mounted to / and /home. This way, I can easily create a new subvolume if I ever want to reinstall Fedora or another OS and simply mount the home subvolume to /home again. And because the Btrfs partition uses the whole SSD, I do not have to think about how much space to allocate for the OS and how much for the user homes.
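As a rough sketch of the subvol= option mentioned above, the helper below prints an /etc/fstab entry that mounts one subvolume at a mountpoint. The function name fstab_entry and the all-zero UUID are made up for illustration; the real UUID of the Btrfs partition can be read with lsblk -f. It only prints a line and changes nothing on the system.

```shell
#!/usr/bin/env bash
# fstab_entry UUID SUBVOL MOUNTPOINT
# Print an /etc/fstab line that mounts the Btrfs subvolume SUBVOL at MOUNTPOINT.
# Hypothetical helper: it only prints text, it does not edit /etc/fstab.
fstab_entry() {
    local uuid="$1" subvol="$2" mountpoint="$3"
    # fstab field order: device, mountpoint, type, options, dump, fsck pass
    printf 'UUID=%s %s btrfs subvol=%s 0 0\n' "$uuid" "$mountpoint" "$subvol"
}

# Placeholder UUID; matches the "home subvolume mounted at /home" layout.
fstab_entry "00000000-0000-0000-0000-000000000000" home /home

# The one-off equivalent (needs root, device name is a placeholder):
#   mount -o subvol=home /dev/sdXN /home
```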

Buy an external enclosure and plug it in.

Or install it in your laptop and edit your /etc/fstab as it is already formatted.

Hi.

I want to copy the data to the internal SSD.

Since the external drive is NTFS, it would be much better to transfer the files to a Linux file system. That is how I understand the intent of the initial post.

Simplest way I know would be to do the following and it would maintain the current directory structure without serious manipulation.

  1. create a subdirectory in your home directory.
    Something such as ~/media would work nicely.
  2. attach the external device. It probably would be mounted somewhere such as /run/media/$USER/<device name>
  3. copy everything from the external device to the newly created subdirectory.
    rsync -av /run/media/$USER/<device name>/ ~/media/
  4. You now have everything contained under the subdirectory ~/media and can access it directly. The external device can now be repurposed for whatever you choose, including reformatting it with a Linux file system and copying all that data back to it if you wish.
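The steps above can be sketched as a small shell function. The name copy_tree is hypothetical, and the mount path in the usage comment keeps the <device name> placeholder from the steps, so substitute whatever name your desktop gives the drive.

```shell
#!/usr/bin/env bash
# copy_tree SRC DEST
# Copy the contents of SRC into DEST, keeping the directory layout.
# copy_tree is an illustrative helper name, not part of rsync.
copy_tree() {
    local src="$1" dest="$2"
    mkdir -p "$dest"
    # The trailing slash on SRC means "the contents of SRC", not SRC itself.
    # -a preserves times and permissions where possible, -v lists each file.
    rsync -av "$src"/ "$dest"/
}

# Usage (replace <device name> with the actual mount directory):
#   copy_tree "/run/media/$USER/<device name>" "$HOME/media"
```

Running the same command a second time only transfers files that are missing or changed, so an interrupted copy can simply be restarted.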

Thanks.

So I don’t need to use subvolumes at all? Just create a directory in my home folder and copy everything over? If I wanted to back up this ~/media folder, should I just use rsync to copy it to an external HD?

And this is conceptually your own data, not something that is shared with other users of this machine (say, a family’s shared music library, e.g.), correct? IMHO, a subvolume only makes sense if you want to keep files somewhat separate, for different reasons like having its own mountpoint, being able to snapshot the data, etc. If this data is something that multiple users need to access, a separate subvolume with a mountpoint outside of /home/ might make sense.

If this is your data, I suggest you simply copy it to your home directory. That’s where a user’s data is supposed to go on a Unix-y OS. You can either maintain the existing directory structure or you could place it in appropriate directories in your home: the FLACs could go into ~/Music, the movies into ~/Videos, etc.

Hi.

I bought an internal 1TB SSD in order not to use an external HD. Thanks for your input anyway.

I agree with Lars :100:

And yes, a backup would be as simple as using rsync again to copy the data back to the external drive.

The answer to this depends on what the backup should guard against and how you set it up.

A simple rsync clone of a directory in a second location provides some protection but it is also vulnerable to certain problems. E.g., if your original data is damaged by something like bit-rot or an unclean shutdown (less likely with Btrfs and its checksums or journaling filesystems) or overwritten (either by you or some ransomware without you noticing), rsync will happily copy the corrupted data to the secondary location. Unless you have additional measures in place, both copies of your data are toast at this point. (rsync can protect against this, by putting each backup into a new time-stamped directory and hardlinking to the previous directory if files are unchanged. But you need to set this up and I doubt many people use it like that.)
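A minimal sketch of the time-stamped, hardlinked rsync setup described in the parentheses above, assuming rsync's --link-dest option. The function name snapshot_backup and the "latest" symlink are illustrative choices, not anything rsync requires.

```shell
#!/usr/bin/env bash
# snapshot_backup SRC BACKUP_ROOT
# Make a new time-stamped copy of SRC under BACKUP_ROOT. Files unchanged since
# the previous run are hardlinked to the previous copy instead of stored again.
# The function name and the "latest" symlink are illustrative choices.
snapshot_backup() {
    local src="$1" root="$2" stamp
    stamp="$(date +%Y-%m-%dT%H-%M-%S)"
    mkdir -p "$root"
    root="$(cd "$root" && pwd)"   # use an absolute path for --link-dest
    if [ -d "$root/latest" ]; then
        # Later runs: hardlink unchanged files against the previous copy.
        rsync -a --link-dest="$root/latest" "$src"/ "$root/$stamp"/ || return 1
    else
        # First run: plain full copy.
        rsync -a "$src"/ "$root/$stamp"/ || return 1
    fi
    # Point "latest" at the copy we just made.
    ln -sfn "$stamp" "$root/latest"
}

# Usage (paths are examples):
#   snapshot_backup "$HOME/media" "/run/media/$USER/backup/media"
```

Each time-stamped directory looks like a full copy, but a damaged or overwritten file only affects copies made after the damage; older directories still hold the earlier version.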

I suggest you look at a dedicated backup program that maintains a timeline of your data, such as restic, Borg Backup, Kopia, … In addition to backups locally, these also support various cloud storage options for a 3-2-1 backup strategy.

Normal. There’s a lot going on. Just start with the basics, and build up. There’s so many features, I’m not sure anyone knows them all. :sweat_smile:

An oversimplification of subvolumes is they are directories that can be snapshot. Like a directory, subvolumes share the space available to the file system.

They’re not really separate file systems themselves, but it’s reasonable to call them dedicated file b-trees, since that’s what they are on disk. And why it’s cheap to snapshot them, and how quota accounting works. Quotas and snapshots are not a default behavior, they don’t happen without you explicitly opting in.

What is a snapshot? Short for subvolume snapshot. Therefore a snapshot is a pre-filled subvolume, it has files in it already, same as the subvolume you’re snapshotting.

One of the benefits of read-only snapshots is it’s really cheap to incrementally replicate them from one Btrfs to another because most of the tracking work is done just by nature of how Btrfs works. It’s usually faster and cheaper than other methods (like rsync).

Just because you’re using Btrfs doesn’t mean you have to use Btrfs on the 2nd device. You can use any backup utility you’re familiar with (and have tested a restore procedure for).

What I like about Btrfs snapshots and send/receive, is the replication is cheap. If I change a few files out of 100K, it takes just a few seconds for the incremental backup. Btrfs can do this quickly because no deep traversal is needed on either original or destination file systems to know what files have changed. Another thing I like is the replicated snapshot is a complete snapshot (has all files, same as origin) and it behaves just like a directory of files, no special tools needed for restore. Yes I’d probably use btrfs send/receive to replicate from backup to a new drive, but I don’t have to do that. I could use rsync. Or even put the backup snapshot on a Samba share and copy it out by smb commands, or through Nautilus or whatever.
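Roughly, the snapshot plus send/receive cycle described above could look like the sketch below. The function name, argument order and example paths are assumptions for illustration; it must run as root, and both the source and the backup drive must be Btrfs.

```shell
#!/usr/bin/env bash
# incremental_backup SUBVOL SNAPDIR PREV NEW DEST
#   SUBVOL   subvolume to back up
#   SNAPDIR  directory on the same Btrfs that holds the read-only snapshots
#   PREV     name of the previous snapshot, already present in DEST
#            (pass "" for the very first, full backup)
#   NEW      name for the new snapshot
#   DEST     directory on the backup Btrfs
# All names are illustrative. Run as root.
incremental_backup() {
    local subvol="$1" snapdir="$2" prev="$3" new="$4" dest="$5"
    # btrfs send only accepts read-only snapshots, hence -r.
    btrfs subvolume snapshot -r "$subvol" "$snapdir/$new" || return 1
    if [ -n "$prev" ]; then
        # -p: send only the difference relative to the parent snapshot.
        btrfs send -p "$snapdir/$prev" "$snapdir/$new" | btrfs receive "$dest"
    else
        # No parent yet: send the whole snapshot.
        btrfs send "$snapdir/$new" | btrfs receive "$dest"
    fi
}

# Usage (example paths and snapshot names):
#   incremental_backup /data /data/.snapshots 2025-01-01 2025-01-08 /run/media/$USER/backup
```

The previous snapshot has to be kept on both sides until the next run, because it is the common base the incremental stream is computed against.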

OK so in your case, 10 different folders may become subvolumes. Or you might think of a high level backup frequency or method policy. And organize those ten directories into 2, 3, or more subvolumes that represent some higher level policy or strategy.

Like for me, I have a subvolume “most” which is most of my crap :joy: and another that’s “debris” which is crap I do not care about losing but so long as I have space I’ll keep dragging it along for the ride. There’s also “finance”. And those form replication policies, how frequently they get backed up but also how many backups and where.

My general thought on data is to have enough backups that I’m not wasting time on repair, data recovery or scraping attempts. Just get a new drive and replicate what’s missing per the policies I’ve established.

One thing that’s a plus minus about Btrfs. It has few guardrails. It doesn’t care how you organize things. It doesn’t tell you what you have to do. It’s super flexible. So that means you need to create the guardrails with organization and policies.

I’m gonna stop here.

It’s not required. You can use whatever tools you want for backup restore.

And even if you later decide to use a btrfs send/receive based backup/replication strategy, you can add the subvolume later. Just cp -a the files from directory to subvolume, and cp will even automatically use reflink copies. [1] And then you can remove the original dir, and rename the subvolume. It’ll behave like a directory for all practical purposes except you can snapshot it directly with btrfs subvolume snapshot.

You can check out btrbk for automating some of these tasks.

[1] Therefore it’ll go way faster than you think it should (not as fast as a snapshot, it does still have to write a bunch of metadata, but the data extents will not need to be duplicated, Btrfs will just create new inodes pointing to existing data extents).
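The directory-to-subvolume swap described above might be sketched like this. The helper name dir_to_subvolume and the temporary ".subvol" suffix are made up for the example, and the directory must live on a Btrfs file system. Have a backup before trying it, since the original directory is removed.

```shell
#!/usr/bin/env bash
# dir_to_subvolume DIR
# Replace the directory DIR with a subvolume of the same name and contents.
# Hypothetical helper; DIR must be on Btrfs.
dir_to_subvolume() {
    local dir="${1%/}"
    local tmp="$dir.subvol"
    btrfs subvolume create "$tmp" || return 1
    # "$dir"/. copies the contents, dotfiles included, into the new subvolume.
    # On Btrfs, recent cp versions make reflink copies automatically, so the
    # data extents are shared rather than written a second time.
    cp -a "$dir"/. "$tmp"/ || return 1
    # Only remove the original once the copy has succeeded.
    rm -rf "$dir" || return 1
    mv "$tmp" "$dir"
}

# Usage (example path):
#   dir_to_subvolume "$HOME/media"
```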

Yes and yes. Any backup is better to have than not have.

Anything Btrfs does for you with send/receive is a performance enhancer. If you’re more likely to back up more often because it’s much cheaper than rsync, then consider btrfs send/receive, but not at the expense of delaying an rsync (or borg or whatever you’re familiar with) based backup system.

For years I had an rsync to XFS backup in addition to multiple Btrfs backups all using send/receive. Advantage is I’ve got the security blanket of the familiar, not all eggs in one technology basket, but I’m still able to learn and eventually migrate to a more efficient process.

These days I’m only doing btrfs send/receive to multiple backup drives, and even one backup image that I then upload offsite periodically.

Hi.

After reading all the great advice I think I know what I want (but it could be wrong!). Lars said I really should be keeping all my own data in the /home folder and I somewhat agree.
But when it comes to my /home folder I tend to be a bit lazy and over time it becomes a bit of a mess, so I would move essential files I wanted to keep permanently to an external drive such as music, movies, game patches, books etc. This drive was separate from the OS drive.

Thinking about it I would like to keep something similar in place but instead of an external drive I would use a part of my internal SSD for this duty. I am wondering could I permanently mount a subvolume at /mnt and have access to it from my /home folder? Then possibly snapshot this subvolume to an external drive? Or is that silly?

Yes. It is perfectly valid to save your files in a folder outside of your home directory. (You can even create new directories at the top-level of your filesystem, e.g. /games, etc. You don’t have to work within the existing directories such as /mnt or /srv which have their own purposes.) The main difference to be aware of is that the “inherited” permissions will be different (both DAC and MAC). You’ll need to use sudo to get the permissions set up correctly at the top level of your new directory hierarchy, but then everything should work fine.

You can find more info about subvolumes here: Working with Btrfs - Subvolumes - Fedora Magazine

P.S. You probably do want to use a subvolume if you store your data files outside of /home, just in case you someday need to revert your root filesystem to an earlier snapshot. You wouldn’t want to revert your data files by mistake.
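A minimal sketch of that setup, assuming a new top-level location such as /data (the path, the function name and the owner argument are all just examples). It covers only the ownership (DAC) part; SELinux labels are not handled here.

```shell
#!/usr/bin/env bash
# make_data_subvolume PATH OWNER
# Create a Btrfs subvolume at PATH and hand ownership of it to OWNER.
# Illustrative helper; PATH must be on a Btrfs file system.
make_data_subvolume() {
    local path="$1" owner="$2"
    sudo btrfs subvolume create "$path" || return 1
    # A new top-level directory belongs to root, so give it to the user.
    # "OWNER:" also sets the group to that user's login group.
    sudo chown "$owner": "$path"
}

# Usage (example path):
#   make_data_subvolume /data "$USER"
#   ln -s /data "$HOME/data"   # optional: reach it from the home directory
```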
