How to copy files safely in Fedora with BTRFS

Hi there,

With regard to copying folders/files within the same hard drive, or onto a different hard drive in the same system, what is the recommended way to do this for a large number of large files?

Is there a Linux equivalent of robocopy, or something similar?

My concerns are corrupted files and long wait times for the transfers.

rsync is probably what you want here.

The Arch Wiki gives some good examples and the appropriate flags to use in different situations.

As far as copying files goes, the fact that it is btrfs makes no difference.

However, if you want to copy entire subvolumes, you can use snapshot replication.

Right, thanks @dalto. So that should work for local transfers? I think rsync is what Timeshift and Snapper use, but I was wondering whether it would also work for simple transfers.

The reason I ask is that, with the character limits set by the system, the copy often isn’t able to read all the file paths and file names. Further, the files are large and often need to be read before they can be transferred.

What is the recommended process for enterprise/NAS-size transfers? I mean, I’m sure they don’t just copy and paste the files for local transfers. Is there a recommended transfer procedure or standard practice?

Timeshift has two modes, btrfs and rsync. In btrfs mode, it takes snapshots, in rsync mode it uses…rsync. :wink:

Snapper doesn’t use rsync, it takes btrfs snapshots.

For me, rsync is generally the best way to copy files. We use it in the enterprise regularly.

There are a lot of ways you can use rsync, and it can do a lot of different things depending on what flags you pass it. Many consumer NAS devices also support the rsync daemon on the NAS side, which is one way to have an rsync target.

If you provide a more specific use case, we could maybe make a more detailed recommendation.

Certainly. Basically, it is for transferring a very large (large for us anyway) amount of data, say 250GB to 1TB, from one internal HDD (WD Red) to an SSD (Samsung 840 EVO).

Both of these drives are internal and in a standard desktop machine.

Often what happens is that the paths are too long, as there are a lot of nested hierarchical folder structures, and it starts giving errors. Hope that makes sense.

Also, it reads all the files before it transfers them, and that takes a very long time.

I use rsync on more than 10TB of data regularly. If this is for backup purposes, it can also copy only the data that has changed.

Since you referenced it in the initial post, you can think of rsync like a more powerful version of robocopy.

I could be wrong here, but doesn’t rsync function as a data transfer tool purely for network transfers, like over SSH? And its other main function is snapshots.

I would have thought rsync is not viable for local data transfer.

I’m sure I’m wrong here though, as you wouldn’t have suggested it otherwise. I know I’m missing something: in the wiki page you sent, I cannot see anything about local transfers.

It also mentions it in the page you linked

Rsync for transfers

What about transferring files in the terminal with the “cp” command?

Does that command do the same thing as a “Copy to” in, say, Nautilus?

From my perspective, that is completely wrong. :wink:

It does support transfers over ssh and a whole bunch of other methods. That is simply because it is a flexible tool that can do a lot of different things.

As far as I know rsync isn’t capable of working with filesystem snapshots directly.

There are tools that use rsync to create what they call “snapshots”. However, what they are doing is using rsync to copy the files which have changed and creating hard links for the files which haven’t. They then call the resulting directory of data a “snapshot”. However, this is totally different from a filesystem snapshot such as what btrfs/zfs/etc. have.

That is referring to transfers within the same filesystem. That is not what your use case is. You are transferring data across filesystems (and devices).

You certainly can do that but I don’t really see why it would be better than rsync for your stated use case. Especially since rsync can more reliably restart a failed transfer.

If your preference is to use cp, cp -rp will recursively copy one or more directories and all their contents while attempting to preserve mode, ownership, and timestamps.
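
For anyone who does go the cp route, here is a minimal sketch of what that looks like. The temporary directory and the file names are made up for the example, and it assumes a cp that supports -r and -p (GNU coreutils on Fedora does).

```shell
# Throwaway sandbox; all directory and file names here are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/src/nested/deeper"
echo "hello" > "$work/src/nested/deeper/a.txt"
chmod 640 "$work/src/nested/deeper/a.txt"

# -r: recurse into directories
# -p: preserve mode, ownership (where permitted) and timestamps
cp -rp "$work/src" "$work/dst"

# Compare the two trees; diff prints nothing when they match.
diff -r "$work/src" "$work/dst" && echo "trees match"

# Clean up afterwards with: rm -rf "$work"
```

Unlike rsync, an interrupted cp has to be started again from scratch, which is the main reason it is less attractive for transfers of this size.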

To replace a disk, I cannot speak more highly of btrfs replace. I once moved a whole disk (root filesystem and home) on a little Pi server from a write-weary SD card to a new flash drive with a single command while the system was running, and the thing never batted an eye. I was just so impressed! It felt a little bit like open heart surgery, but it worked.

EDIT:

I did want to mention how cool btrfs replace is, but I can’t agree more with @dalto as far as rsync is concerned. It’s an absolutely incredible tool for file management. I use it for making backups of course, but I also use it for transferring files to the headless Pi server I mentioned above. Once I even transferred all the photos off of my phone onto my computer over wifi through an SSH tunnel–using rsync, of course.

That last example turned out to not be a super practical way to do the task–I more just wanted to try it out just for the heck of it. :joy:

Thanks so much for the explanation and the correction. Okay, so have you ever done anything like this before? I am searching through the wiki link and looking for an example.

I think cloning a local directory would be the right approach. Is that true? I want to move the files back and forth locally between different hard drives, with a large number of files, directories and paths.

That’s awesome. Is there any documentation on that?

Perhaps. It depends why you are moving the files around. Are they backup copies? What is the point of copying them?

My opinion: incremental backup of ZFS pool using ZFS send / ZFS receive. This will be much much faster than rsync.

btrfs apparently does that too.

btrfs send source | btrfs receive destination

Here is a nice article with examples: Incremental backups with Btrfs snapshots - Fedora Magazine
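
Filling in that one-liner a little: btrfs send needs a read-only snapshot as its source, and an incremental send names a parent snapshot with -p. The function below is only a sketch under those assumptions. Its name and arguments are hypothetical, it must be run as root, and both the snapshot location and the backup target have to be on Btrfs filesystems.

```shell
# Sketch only: needs root and Btrfs on both ends. Nothing runs until the
# function is called.
# Usage: send_snapshot SUBVOLUME NEW_SNAPSHOT BACKUP_DIR [PARENT_SNAPSHOT]
send_snapshot() {
    subvol=$1
    snap=$2
    backup=$3
    parent=${4:-}

    # btrfs send only accepts read-only snapshots, hence -r.
    btrfs subvolume snapshot -r "$subvol" "$snap" || return 1
    sync

    if [ -n "$parent" ]; then
        # Incremental: send only the differences relative to the parent
        # snapshot, which must already exist on the receiving side.
        btrfs send -p "$parent" "$snap" | btrfs receive "$backup"
    else
        # Full: send the entire snapshot.
        btrfs send "$snap" | btrfs receive "$backup"
    fi
}
```

A first call might look like send_snapshot /home /home/.snapshots/home-1 /mnt/backup, and a later one would pass the previous snapshot as the fourth argument. Those paths are placeholders, not a recommendation for where to keep snapshots.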

Yes, they are backup copies. The filesystem and home folder are partitions on the same hard drive, and as a precaution I wanted to know how to make copies/backups. (Still learning how to set up partitions.)

I usually use rsync -avhW for that use case.

There is a BIG difference between cp and rsync.
If you use cp, it copies the entire file every time, so it may take a long time to complete.
If you use rsync, it checks the source and destination. If the file exists in the destination, it compares the source and destination files (by size and modification time by default), and if there is no difference it skips to the next file. Thus it “syncs” the source and destination content while only copying new or changed files. That is a major saving in copy and write time for backups.

Your thought that it is only usable for remote copies is not correct. It is capable of a lot more things than cp and is also much more flexible. AFAIK rsync itself imposes no limit on file name length, file sizes, or anything else beyond what the filesystems allow. In fact, if you tell it to copy the content of a directory it does just that, including the dot files that a cp with a shell wildcard (such as cp -r dir/* dest) skips. Another little bit to note is that if you give it a destination directory that does not exist, it will create it for you. Only the last path component is created by default; creating missing parent directories as well requires the --mkpath option (rsync 3.2.3 and later).

The man page tells you a lot about rsync, and it is my tool of choice whether copying a single file, a directory tree, or the entire content of a file system. I regularly use it to back up the content of my home directory (~ 3 TB) to an external drive.

If you are starting out, are not so familiar with partitioning and filesystems, and want to have a simple, solid solution I would definitely recommend rsync over filesystem based snapshotting and send/receive. It simply works on files on your filesystems.

However, as already mentioned, rsync is a very flexible tool, and there are a few things you need to understand to avoid “gotchas”. My list off the top of my head:

  • The trailing slash of the path has significance. There is a difference between, for example, “rsync -r /home /backup/home” and “rsync -r /home/ /backup/home/”. I prefer using the second: it explicitly says to sync the contents below /home to the contents below /backup/home.
  • For a backup, you would probably like to control carefully whether to sync only one device/filesystem and not follow mount points/symlinks.
  • If you are concerned about corruption on the recipient side, you can run a VERY SLOW bit-by-bit verification when running your next sync. However, the default “size & time-stamp based” check should usually be sufficient.
  • Pay attention to the warnings given in the documentation (man rsync), for example for “--inplace”. Trying to save time could have bad consequences, depending on your use case.
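
To make the first three points concrete, here is a rough sketch using throwaway directories (all names are made up for the example). It shows the trailing-slash difference, then uses a checksum-based dry run as the slow verification pass, with -x added to stay on one filesystem.

```shell
work=$(mktemp -d)
mkdir -p "$work/src"
echo "data" > "$work/src/a.txt"

# No trailing slash on the source: the directory itself is copied,
# giving with_dir/src/a.txt.
rsync -a "$work/src" "$work/with_dir/"

# Trailing slash on the source: only its contents are copied,
# giving contents_only/a.txt.
rsync -a "$work/src/" "$work/contents_only/"

# -x (--one-file-system) keeps rsync from crossing into other mounted
#    filesystems.
# -c (--checksum) compares file contents by checksum rather than by
#    size and time, which is slow on large trees.
# -n (--dry-run) changes nothing, and -i itemizes any differences,
# so an empty result means source and copy have identical contents.
differences=$(rsync -axnci "$work/src/" "$work/contents_only/")
[ -z "$differences" ] && echo "copy verified"

# Clean up afterwards with: rm -rf "$work"
```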

EDIT: And, as others have also pointed out, rsync works by mirroring between two locations. To turn it into a proper backup where you can “go back in time”, you would need to create versioned/dated directories on the recipient side, so that the next sync doesn’t overwrite previously synced data you would like to keep. That’s why people use backup tools, which often utilise rsync for the actual file sync.

Not true with btrfs: there, cp makes a reflink, which copies only metadata rather than the actual data, so it is faster than a traditional copy and less prone to corruption.

But only within the same filesystem. The OP is copying across filesystems.
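
To see which case applies on a given system, GNU cp lets you ask for a reflink explicitly. This is a sketch assuming GNU coreutils; the file names are placeholders in a temporary directory.

```shell
work=$(mktemp -d)
head -c 1048576 /dev/zero > "$work/big.img"

# --reflink=auto: share data blocks when the filesystem supports it
# (for example Btrfs, within one filesystem) and otherwise fall back
# to an ordinary full copy.
cp --reflink=auto "$work/big.img" "$work/clone.img"

# --reflink=always fails rather than falling back, which makes it a
# quick way to find out whether a reflink is possible here.
if cp --reflink=always "$work/big.img" "$work/probe.img" 2>/dev/null; then
    echo "reflink copy supported here"
else
    echo "no reflink support here; cp would copy the data in full"
fi

cmp "$work/big.img" "$work/clone.img" && echo "clone matches original"

# Clean up afterwards with: rm -rf "$work"
```

Recent GNU coreutils (9.0 and later) already behave like --reflink=auto by default. Pointing the destination at a different filesystem or drive makes the --reflink=always attempt fail, which matches the point above about copies across filesystems.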
