How to judge if a backup process is perfect?

I am asking because I am doing the following:

  • backup files in NTFS partition to external disk (w/ btrfs)
  • remove the NTFS partition
  • expand existing btrfs filesystem to use the whole disk
  • transfer files back

Following “Which is the best tool for copying a large directory tree locally?” (Mutable Ideas), I use the “tar” method.
Run this command:
find ${SOURCE} -type d \
-exec /root/transfer-with-tar.sh {} \;

with this shell script:

#!/usr/bin/env bash

ORIGEN=$1
DESTINATION=/datadrive2

NEW_DIR=${DESTINATION}/${ORIGEN}
mkdir -p ${NEW_DIR}
(cd ${ORIGEN}; tar cf - .) | (cd ${NEW_DIR}; tar xpf -)

echo ${ORIGEN}

So far, 2.4 of 2.5 TB is done.

I want to know how to judge if the backup is OK, so that I can remove my NTFS partition.

With a file-level backup, you risk losing metadata:

  • File ownership, permissions, ACLs
  • SELinux labels
  • Extended attributes, capabilities, etc.

I’m not sure whether tar backups are affected by these issues and to what extent.
So, you should be careful, especially if you plan to back up system directories since losing critical metadata might make your system inoperable.
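
For the basics at least, a tar pipe with `p` on extraction carries permission bits and modification times. GNU tar also has `--xattrs`, `--acls` and `--selinux` options for the other items in the list, though whether they matter depends on the data. A minimal sketch to check the basics, assuming GNU tar, GNU touch and GNU stat; the temporary directory and file name are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: check that a tar pipe keeps mode and mtime (assumes GNU tar/stat/touch).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src" "$work/dst"

# Hypothetical sample file with a distinctive mode and timestamp.
echo "hello" > "$work/src/photo.jpg"
chmod 640 "$work/src/photo.jpg"
touch -d '2020-01-02 03:04:05' "$work/src/photo.jpg"

# Same kind of pipe as the "tar" method; p on extraction restores permissions.
(cd "$work/src" && tar cf - .) | (cd "$work/dst" && tar xpf -)

# %a = octal permissions, %Y = modification time in seconds since the epoch.
src_meta=$(stat -c '%a %Y' "$work/src/photo.jpg")
dst_meta=$(stat -c '%a %Y' "$work/dst/photo.jpg")
echo "source: $src_meta"
echo "copy:   $dst_meta"
```

If the two lines differ, the copy lost metadata somewhere along the way.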

Depending on the data type and size, block-level or filesystem-level backups might be preferable.

In general, only a verified copy on a reliable separate storage can be considered a good backup.
Ignoring the metadata-related issues, I personally prefer rsync backups since they are easier to update, verify and restore.

Thank you for your insights.

The files are mainly JPG / MP4 taken by mobile phone.

In my case, the path is NTFS (internal) → btrfs (external) → btrfs (expanded internal), so the metadata I care about is likely:

  • file create time

  • file modified time

  • did I miss any file? Or, “Are the source and target directory trees the same?” (files not included in the copy, read errors, write errors, etc.)

  • are file contents identical?

From btrfs (external) → btrfs (expanded internal), I will use btrfs send/receive. So I am worried about the file level backup from NTFS → btrfs part.
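
The last two questions in the list (is anything missing, and are the contents identical) can be checked independently of the copy tool by comparing checksum manifests of both trees. A rough sketch, assuming GNU findutils and coreutils; `manifest` is a hypothetical helper and the directory and file names are invented for the demonstration:

```shell
#!/usr/bin/env bash
# Sketch: compare two directory trees by relative path and SHA-256.
# manifest() is a hypothetical helper, not part of any tool.
manifest() {
    # Print "<sha256>  ./relative/path" for every regular file, sorted by path.
    (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/ntfs/DCIM 2023" "$work/btrfs"
echo "jpeg bytes" > "$work/ntfs/DCIM 2023/IMG 0001.jpg"
echo "mp4 bytes"  > "$work/ntfs/DCIM 2023/VID 0001.mp4"
cp -a "$work/ntfs/." "$work/btrfs/"

# Simulate a file that was silently skipped during the copy.
rm "$work/btrfs/DCIM 2023/VID 0001.mp4"

manifest "$work/ntfs"  > "$work/src.sha256"
manifest "$work/btrfs" > "$work/dst.sha256"

# diff exits 0 only when both manifests match line for line.
if report=$(diff "$work/src.sha256" "$work/dst.sha256"); then
    result="identical"
else
    result="different"
fi
echo "trees are $result"
echo "$report"
```

A missing file shows up as a line present in only one manifest; a corrupted file shows up as the same path with two different hashes.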

Then I recommend using rsync:

rsync -a -N --progress src_dir/ dst_dir/

You can simply repeat the command to verify the result, optionally adding the --checksum option.
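
To illustrate what `--checksum` adds, here is a sketch of a verification pass that changes nothing and only lists what rsync would still transfer. It assumes rsync is installed; the directories and file are throwaway examples. `-N` is left out because creation-time support depends on how rsync was built (`rsync --version` lists either `crtimes` or `no crtimes`).

```shell
#!/usr/bin/env bash
# Sketch: a dry run that only reports what rsync would still transfer.
command -v rsync >/dev/null || { echo "rsync not installed, skipping"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src_dir" "$work/dst_dir"
printf 'AAAA' > "$work/src_dir/a.jpg"
cp -a "$work/src_dir/." "$work/dst_dir/"

# Simulate silent corruption: same size and same mtime, different bytes.
printf 'AAAB' > "$work/dst_dir/a.jpg"
touch -r "$work/src_dir/a.jpg" "$work/dst_dir/a.jpg"

# -n (--dry-run) changes nothing; -i (--itemize-changes) lists each pending update.
quick=$(rsync -a -n -i "$work/src_dir/" "$work/dst_dir/")
full=$(rsync -a -n -i --checksum "$work/src_dir/" "$work/dst_dir/")

echo "quick check (size + mtime): ${quick:-nothing to do}"
echo "with --checksum:            ${full:-nothing to do}"
```

Without `--checksum`, rsync compares only size and modification time, so a file whose bytes changed but whose size and mtime did not is skipped; with `--checksum` it is listed.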

Only 25 GB to go for the “tar” process.
I see some error messages (likely due to spaces within filenames):
transfer-with-tar.sh: line 8: cd: too many arguments

I will definitely re-run your suggested rsync command after I have taken a snapshot of the “tar” output.
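
The message `cd: too many arguments` is what bash prints when an unquoted variable such as `${ORIGEN}` expands to a path containing spaces and is split into several words. A minimal sketch of the same copy step with every expansion quoted, and with `&&` so that tar does not run in the wrong directory when `cd` fails; `transfer_dir` and the sample paths are hypothetical names used only for the demonstration:

```shell
#!/usr/bin/env bash
# Sketch: the tar pipe from transfer-with-tar.sh with quoted expansions.
# transfer_dir is a hypothetical helper name.
transfer_dir() {
    origin=$1
    destination=$2
    new_dir="${destination}/${origin}"
    mkdir -p "${new_dir}" || return 1
    # && keeps tar from running in the current directory when cd fails.
    (cd "${origin}" && tar cf - .) | (cd "${new_dir}" && tar xpf -)
}

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source/My Photos (2023)" "$work/datadrive2"
echo "jpeg bytes" > "$work/source/My Photos (2023)/IMG 0001.jpg"

# Relative origin path, as find passes it when started inside the source tree.
cd "$work/source" || exit 1
transfer_dir "./My Photos (2023)" "$work/datadrive2"

copied="$work/datadrive2/My Photos (2023)/IMG 0001.jpg"
ls -l "$copied"
```

`find ... -exec script {} \;` already hands each path to the script as a single argument, so the splitting happens only inside the script, at the unquoted expansions.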

Errors in your script?

Should be:

NEW_DIR=${DESTINATION}/${ORIGEN}
mkdir -p ${NEW_DIR} $(cd ${ORIGEN}; tar cf - .) | $(cd ${NEW_DIR}; tar xpf -)

I’m pretty sure the 2nd line is wrong too. Maybe:

mkdir -p ${NEW_DIR}
cd ${ORIGEN}
tar cf - .
cd ${NEW_DIR}
tar xpf -

Not sure exactly what you want to do.
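
For reference, the parenthesised line in the original script is a pipe: the first tar writes an archive of the source directory to standard output and the second tar unpacks it in the destination. Run as separate sequential commands, the two tar processes would no longer be connected. A sketch of an equivalent form that uses tar's `-C` option instead of `cd` in a subshell; the paths are made up:

```shell
#!/usr/bin/env bash
# Sketch: directory-to-directory copy with a tar pipe, using -C instead of cd.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
src="$work/origin dir"
dst="$work/new dir"
mkdir -p "$src/sub" "$dst"
echo one > "$src/a.txt"
echo two > "$src/sub/b.txt"

# -C DIR makes tar change into DIR before reading or writing files.
tar -C "$src" -cf - . | tar -C "$dst" -xpf -

copied_files=$(cd "$dst" && find . -type f | sort)
echo "$copied_files"
```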

This “tar” method, as mentioned in the linked web post, is provided as one of the test cases for a speed comparison with rsync, cpio and rsync+parallel.

File copying is so important to get right that I will use this chance to learn more about tar, find, and shell commands and scripts.

I will try your suggestions on the directories that gave me “cd: too many arguments” warnings, to learn the details.

Thank you very much!

In every instance where I have used rsync it handles file names with spaces, parentheses, and other special characters properly, keeping the file name intact. I find that cp, tar, and the like do not.

To see the differences, simply run the “ls” command in a directory that has file names with spaces or other special characters. You will note that those file names are surrounded by quotes, while file names that are Linux/Unix standard (no spaces or special characters) are shown without quotes.
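
Note that the quotes `ls` shows are a display aid that GNU ls adds when writing to a terminal; they are not part of the file name stored on disk. A small sketch, assuming GNU coreutils (the `--quoting-style` option) and a made-up file name:

```shell
#!/usr/bin/env bash
# Sketch: compare how ls displays a name with how it is stored (GNU ls assumed).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
touch "$work/my photo.jpg"

# Quoted form, as recent GNU ls shows it on a terminal by default.
shown=$(ls --quoting-style=shell-escape "$work")
# The name exactly as stored on disk.
stored=$(ls --quoting-style=literal "$work")

echo "displayed: $shown"
echo "stored:    $stored"
```

The displayed form is meant to be safe to paste back into a shell command, which is the same quoting a script has to apply to its own variables.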

Since Windows has never been Linux/Unix compliant with respect to file and directory naming, the errors you see with tar often crop up when transferring files between the two systems. Rsync is one of the tools designed to transfer files between systems without errors in file naming.

$ sudo rsync --inplace --info=progress2 --no-whole-file -a ntfs/ btrfs/

reports nothing, so I assume the “tar” process did copy all files and those warnings were just warnings.

I am doing one more rsync pass with --checksum before I remove the ntfs/ partition.

After this round of transferring 2.7 TB of files, I will stick with rsync for file-level transfers from now on.

A few questions:

  • how to prevent sparse qcow2 files from becoming fully allocated after rsync?
    • when the file does not exist in the target
    • when the file exists in the target but needs an update
  • how to make rsync work nicely with COW btrfs filesystem? So far, I googled and used this:
    rsync --inplace --info=progress2 --no-whole-file -a source/ btrfs/
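
On the sparse-file question, rsync has a `--sparse` (`-S`) option that writes runs of zeros as holes. A sketch under some assumptions: rsync is installed, the filesystem supports sparse files, and a file created with `truncate` stands in for a sparse qcow2 image. `--inplace` is left out here because older rsync releases refuse to combine it with `--sparse`; whether the two can be used together depends on the rsync version.

```shell
#!/usr/bin/env bash
# Sketch: copy a sparse image with rsync --sparse and compare disk usage.
command -v rsync >/dev/null || { echo "rsync not installed, skipping"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source" "$work/target"

# A 64 MiB file that is all hole, standing in for a sparse image.
truncate -s 64M "$work/source/disk.img"

rsync -a --sparse "$work/source/" "$work/target/"

# Apparent size is the file length; plain du reports blocks actually allocated.
apparent=$(du -k --apparent-size "$work/target/disk.img" | cut -f1)
allocated=$(du -k "$work/target/disk.img" | cut -f1)
echo "apparent size: ${apparent} KiB, allocated on disk: ${allocated} KiB"
```

If the allocated figure is close to the apparent size, the copy was written out in full rather than as holes.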

So far I have only run rsync interactively, and I would like as much progress indication as possible: which file is being compared, which file is being transmitted, how much of each file changed, etc. With --info=progress2, I only get this one-line status:
0 0% 0.00kB/s 0:00:00 (xfr#0, ir-chk=2025/18024)

What parameters can provide such progress indications while rsync is running?
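
One combination that may help, sketched under the assumption of rsync 3.1 or newer (where `--info` is available): `-i` itemizes each file rsync touches, `--info=name1` prints updated file names, `progress2` keeps the overall progress line, and `stats2` prints a summary at the end. `rsync --info=help` lists all available flags. The directories and files below are invented:

```shell
#!/usr/bin/env bash
# Sketch: per-file output plus overall progress and a final summary.
command -v rsync >/dev/null || { echo "rsync not installed, skipping"; exit 0; }

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source" "$work/target"
echo "jpeg bytes" > "$work/source/IMG_0001.jpg"
echo "mp4 bytes"  > "$work/source/VID_0001.mp4"

# -i: one itemized line per transferred file;
# progress2: whole-transfer progress; stats2: summary at the end.
log=$(rsync -a -i --info=name1,progress2,stats2 "$work/source/" "$work/target/")
echo "$log"
```

For per-file transfer progress instead of the overall line, `--progress` (as in the earlier suggestion) can be used in place of `--info=progress2`.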

I think these followup questions might be good candidates (individually, even!) for this process:

That way, you can mark the solution for each and provide a reference for future users.

Hope I am doing it right.

https://discussion.fedoraproject.org/t/rsync-sparse-files-cow-target-and-want-more-progress-indicator/77795?u=sampsonf
