With regard to copying folders and files within the same hard drive, or between different hard drives in the same system, what is the recommended way to do this for a large number of large files?
Is there a Linux equivalent to robocopy, or something similar?
My concern is corrupted files and long wait times for the transfers.
Right, thanks @dalto. So that should work for local transfers? I think rsync is what Timeshift and Snapper use, but I was wondering whether it would also work for simple transfers.
The reason I ask is that, with the character limits set by the system, it often isn’t able to read all the file paths and file names. Also, the files are large and often need to be read before they can be transferred.
What is the recommended process for enterprise/NAS-sized transfers? I mean, I’m sure they don’t just copy and paste the files for local transfers. Is there a recommended transfer procedure or standard practice?
Timeshift has two modes, btrfs and rsync. In btrfs mode, it takes snapshots, in rsync mode it uses…rsync.
Snapper doesn’t use rsync, it takes btrfs snapshots.
For me, rsync is generally the best way to copy files. We use it in the enterprise regularly.
There are a lot of ways you can use rsync, and it can do a lot of different things depending on what flags you pass it. Many consumer NAS devices also support the rsync daemon on the NAS side, which is one way to have an rsync target.
If you provide a more specific use case, we could maybe make a more detailed recommendation.
It does support transfers over ssh and a whole bunch of other methods. That is simply because it is a flexible tool that can do a lot of different things.
As far as I know rsync isn’t capable of working with filesystem snapshots directly.
There are tools that use rsync to create what they call “snapshots”. However, what they are doing is using rsync to copy the files which have changed and creating hard links for the files which haven’t. They then call the resulting directory of data a “snapshot”. This is totally different from a filesystem snapshot such as what btrfs/zfs/etc. have.
That is referring to transfers within the same filesystem. That is not what your use case is. You are transferring data across filesystems (and devices).
You certainly can do that but I don’t really see why it would be better than rsync for your stated use case. Especially since rsync can more reliably restart a failed transfer.
If your preference is to use cp, “cp -rp” will recursively copy one or more directories and all their contents while attempting to preserve permissions, ownership and timestamps.
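For anyone who wants to try the cp route first, here is a minimal sketch. The scratch directory and file names are made up for illustration; substitute your real source and destination.

```shell
#!/bin/sh
# Hypothetical scratch directory standing in for the real source and destination.
work=$(mktemp -d)
mkdir -p "$work/src/sub"
printf 'hello\n' > "$work/src/sub/file.txt"
chmod 640 "$work/src/sub/file.txt"

# -r recurses into directories; -p preserves mode, timestamps and
# (where permitted) ownership. dst does not exist yet, so it becomes a copy of src.
cp -rp "$work/src" "$work/dst"

# Show the permission bits of the copy; they should match the original.
ls -l "$work/dst/sub/file.txt"
```

Note that running the same command again copies every file again, which is the main practical difference from rsync.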
To replace a disk, I cannot speak more highly of btrfs replace. I once moved a whole disk (root filesystem and home) on a little Pi server from a write-weary SD card to a new flash drive with a single command while the system was running, and the thing never batted an eye. I was just so impressed! It felt a little bit like open heart surgery, but it worked.
I did want to mention how cool btrfs replace is, but I can’t agree more with @dalto as far as rsync is concerned. It’s an absolutely incredible tool for file management. I use it for making backups of course, but I also use it for transferring files to the headless Pi server I mentioned above. Once I even transferred all the photos off of my phone onto my computer over wifi through an SSH tunnel, using rsync, of course.
That last example turned out not to be a super practical way to do the task. I mostly just wanted to try it out for the heck of it.
Thanks so much for the explanation and the correction. Okay, so have you ever done anything like this before? I am searching through the wiki link and looking for an example.
I think cloning a local directory would be the right approach. Is that true? I want to move the files back and forth locally between different hard drives, with a large number of files, directories and paths.
There is a BIG difference between cp and rsync.
If you use cp, it copies the entire file, every time, so it may take a long time to complete.
If you use rsync, it checks the source and destination. If the file already exists in the destination, it compares the two (by default on size and modification time), and if there is no difference it skips to the next file. Thus it “syncs” the source and destination content while only copying new or changed files. That is a major saving in copy and write time for backups.
Your thought that it is only usable for remote copies is not correct; it can do a lot more than cp and is also much more flexible. AFAIK rsync itself imposes no limit on file name length or file size beyond what the filesystems involved allow. If you tell it to copy the contents of a directory it does just that, including the dot files that a shell glob such as “cp -r src/* dest” skips. Another little bit to note is that if you give it a destination directory that does not exist, it will create it for you, although by default only the last component of the path; newer versions have a --mkpath option for creating the whole path.
The man page tells you a lot about rsync, and it is my tool of choice whether copying a single file, a directory tree, or the entire contents of a file system. I regularly use it to back up the contents of my home directory (~ 3 TB) to an external drive.
If you are starting out, are not so familiar with partitioning and filesystems, and want to have a simple, solid solution I would definitely recommend rsync over filesystem based snapshotting and send/receive. It simply works on files on your filesystems.
However, as already mentioned, rsync is a very flexible tool, and there are a few things you need to understand to avoid “gotchas”. My list off the top of my head:
The trailing slash of the source path has significance. There is a difference between, for example, “rsync -r /home /backup/home” and “rsync -r /home/ /backup/home/”. The first copies the directory itself and ends up with /backup/home/home. I prefer using the second; it explicitly says to sync the contents below /home to the contents below /backup/home.
For a backup, you would probably like to control carefully whether to sync only one device/filesystem and whether to follow mount points and symlinks (see --one-file-system and the symlink options in man rsync).
If you are concerned with corruption at the recipient side, you can run a VERY SLOW full-content verification (the --checksum option) when running your next sync. However, usually the default “size & time-stamp based” check should be sufficient.
Pay attention to warnings given in the documentation (man rsync), for example for “--inplace”. Trying to save time could have bad consequences, depending on your use case.
EDIT: And, as others have also pointed out, rsync works by mirroring between two locations. To turn it into a proper backup where you can “go back in time”, you would need to create versioned/dated directories at the recipient side, so that the next sync doesn’t overwrite previously synced data you would like to keep. That’s why people use backup tools, which often utilise rsync for the actual file sync.