I have been using cp to back up my home directory by copying only new and updated files into the backup copy, but this no longer works.
cp has a silly feature: cp dir1 dir2 will copy dir1 into dir2 as long as no directory named dir1 already exists in dir2, but if such a directory already exists, dir1 will be copied as a subdirectory of the dir1 in dir2.
I got around this problem by setting the working directory to dir2 and running the command cp -pruv /home/Carl, but now I get a message saying that a destination operand is required. If I add dir2 as a destination and dir1 already exists, I get dir2/dir1/dir1.
It means to do a backup I must delete my existing backup and copy my entire home directory which takes a long time. The -u option to avoid unnecessary copies of files that have not changed does not work.
Is there any way to use cp to update a copy of a directory? I can’t see any, and if my workaround only worked because of a bug, it was a benign bug.
The command rsync can easily do what you are trying to do. Reading the man page and looking at the examples there will help.
If, as you show, you wish to copy only the changed content of dir1 to dir2, then the rsync command would be something like rsync -av dir1/ dir2/ and the only files copied would be the new or changed files. cp, on the other hand, copies everything, which results in copying a lot more data than just the new and updated files.
Note the trailing / on each of those directory names. It tells rsync to copy the content of dir1 to dir2, while if it were structured as 'rsync -av dir1 dir2/' it would copy the directory dir1 and its content to a subdirectory of dir2, leaving dir2/dir1/ as the result.
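The trailing-slash difference can be tried safely in a scratch directory before touching a real backup. The following is only a sketch: the temporary directory and the names with_slash and without_slash are made up for illustration, and the rsync calls are skipped if rsync is not installed.

```shell
#!/bin/sh
# Sketch: compare "rsync -av dir1/ dest/" with "rsync -av dir1 dest/".
# Everything lives under a throwaway temp directory (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/dir1/sub" "$work/with_slash" "$work/without_slash"
echo "hello" > "$work/dir1/sub/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # Trailing slash on the source: the *contents* of dir1 are copied.
    rsync -av "$work/dir1/" "$work/with_slash/"
    # No trailing slash on the source: the directory dir1 itself is copied.
    rsync -av "$work/dir1" "$work/without_slash/"
    # List the resulting files so the two layouts can be compared.
    find "$work/with_slash" "$work/without_slash" -type f | sort
else
    echo "rsync is not installed; skipping the demonstration"
fi
```

With the slash, file.txt should end up under with_slash/sub/; without it, under without_slash/dir1/sub/.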
cp -uvp dir/* .
should do, if I understood your vague description.
In general one should use cp -a or, better yet, rsnapshot for backups,
tailored for this kind of task.
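One caveat with the cp -uvp dir/* . form is that, with default shell globbing, dir/* does not match hidden files, which matters for a home directory. A possible variant, assuming GNU cp as shipped with Fedora, is to name the source as dir/. so that the contents are copied, hidden files included. This is a sketch in a scratch directory with made-up file names, not a tested backup script.

```shell
#!/bin/sh
# Sketch: update-only copy of a directory's contents, hidden files included.
# Assumes GNU cp; all paths are hypothetical and live under a temp directory.
work=$(mktemp -d)
mkdir -p "$work/src/docs" "$work/backup"
echo "v1" > "$work/src/docs/notes.txt"
echo "secret" > "$work/src/.hidden"

# "src/." names the contents of src, so nothing is nested as backup/src.
cp -a -u "$work/src/." "$work/backup/"

# Change one file in the source and give it a clearly newer timestamp.
echo "v2" > "$work/src/docs/notes.txt"
touch -t 203001010000 "$work/src/docs/notes.txt"

# Second run: -u copies a file only when the source is newer than the
# existing copy (or the copy is missing), so unchanged files are skipped.
cp -a -u -v "$work/src/." "$work/backup/"
```

Because -a preserves timestamps, the files copied in the first run keep the same modification time as their sources, which is what lets -u skip them on later runs.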
I don’t think that cp directoryname without a third option has ever worked, at least not in the cp shipped with Fedora Linux (or RHL before that!). By “setting the working directory”, do you mean via cd or by some other means? (There may be some trick I wasn’t aware of!)
That said, I second the choice of rsync for doing this. It’s made for the job.
Thank you for your reply.
I can see that periodic use of rsync to clean from the source directory files that have been deleted from the backup directory would be useful, but using it as standard would risk losing desired content if an HDD error destroys content in the backup. Normally I am more concerned with saving content that has changed than with getting rid of crap, of which I have a huge amount, including multiple Linux system ISOs for many releases of Linux distributions. A full backup takes an enormous amount of time and uses up most of the resources of my laptop, so attempting to do any other useful work is futile, but my desire is to merge changed files rather than to get rid of crap. Once the crap has been backed up by a full copy, the time taken by an incremental backup is tolerable, except for some of the files in hidden directories. I notice in the verbose listing considerable stuff in some directories under .local, including the share/trash directories, and in a .cache directory.
I have confirmed that your suggested method works when copying files from a directory with only one level of files to another one with one level of files. I will extend your example to try merging changes from a multi-level system of directories into another multi-level system before attempting to modify my backup script.
However, in trying to do so I am getting more confused and am no longer sure that I can remember the exact things I tried and in what order.
Initially I think I set up a home partition on a terabyte+ HDD and under it the directory hierarchy Backups/md-tower-001 and, as a third level,
Backups/md-tower-001/Carl.
When that HDD is plugged into a USB port the full directory paths are /run/media/Carl/UUID/Backups/md-tower-001 and
/run/media/Carl/UUID/Backups/md-tower-001/Carl, where UUID is the universally unique identifier of the ext4 logical drive, a 128-bit binary number expressed as 32 hexadecimal digits.
I think my first attempt involved the command
cp -apruv /home/Carl /run/media/Carl/UUID/Backups/md-tower-001/Carl where the target directory
/run/media/Carl/UUID/Backups/md-tower-001/Carl at that time did not exist. It worked as a full backup but the next time I ran the same cp command /home/Carl was copied into
/…/Backups/md-tower-001/Carl/Carl as a full copy when I wanted it to merge updated files only into /…/Backups/md-tower-001/Carl.
This behaviour absolutely mystified me until I read the section on cp in William Shotts’s extremely useful “The Linux Command Line”. This explicitly stated that if the target directory did not exist it was created, but if it did exist a new subdirectory of the same name was created in the target directory. I reasoned that doing subsequent copies
into a target of /…/Backups/md-tower-001 might result in a merge
and that the command
cp -apruv /home/Carl /run/media/Carl/UUID/Backups/md-tower-001 should work, but I do not know whether I tried this. Instead I used cd to make /run/media/Carl/UUID/Backups/md-tower-001 the working directory and used the command cp -apruv /home/Carl with no second operand, and it worked. I incorporated it into a crude backup script and used it for a while. Later I decided to modify the backup script to incorporate more functions, but found that I did not know enough about bash scripting, that my backup script was corrupted, and that I had not kept a backup copy. It took me some time to read enough of William Shotts’s book to create my new backup script, but when I used it I got the message about the missing operand, and if I made md-tower-001/Carl the destination I got the full backup in a subdirectory of the desired directory.
It crosses my mind that cp may have had a bug that allowed the second operand to be omitted when the working directory was the intended target, and that the bug may have been fixed in the time I spent expanding my Linux knowledge from W. Shotts’s book. I will shorten the working directory to end with Backups and use cp -apruv /home/Carl md-tower-001.
Actually the thinking on the problem involved in drafting this reply turned out to be valuable.
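The rule quoted from the book (a missing target is created, while an existing target receives a subdirectory named after the source) can be reproduced in a scratch directory. This is a sketch only: the paths below mimic the ones in this post but live under a temporary directory, and the last step, naming the parent directory as the target, is the plan described above rather than a confirmed fix.

```shell
#!/bin/sh
# Sketch: where "cp -a SRC DEST" puts things, depending on whether DEST exists.
# Scratch paths only; they merely mirror the names used in the post.
work=$(mktemp -d)
mkdir -p "$work/home/Carl" "$work/Backups/md-tower-001"
echo "data" > "$work/home/Carl/file.txt"

# First run: .../md-tower-001/Carl does not exist, so it becomes the copy.
cp -a "$work/home/Carl" "$work/Backups/md-tower-001/Carl"

# Second run, same command: the target now exists, so the source directory
# is placed inside it, giving .../md-tower-001/Carl/Carl.
cp -a "$work/home/Carl" "$work/Backups/md-tower-001/Carl"

# Naming the parent as the target instead: "Carl" is merged into the
# existing .../md-tower-001/Carl, and -u limits the copy to newer files.
cp -a -u "$work/home/Carl" "$work/Backups/md-tower-001"

# Show the resulting tree.
find "$work/Backups" | sort
```

The listing should show file.txt both directly under md-tower-001/Carl and under the unwanted md-tower-001/Carl/Carl left by the second run, with no third level created by the last command.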
You may have misunderstood something, but rsync can do both full and incremental backups.
Obviously, incremental backups are usually much faster because you don’t have to overwrite unmodified files.
Just specify the -t or -a switch to preserve modification times, and next time rsync will copy only modified files.
This is a well known, highly reliable and widely accepted backup solution.
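To see the incremental behaviour without risking real data, something like the following can be run in a scratch directory. It is a sketch with made-up file names. It uses rsync's --itemize-changes option to list what the second run transfers, and it skips the rsync calls if rsync is not installed.

```shell
#!/bin/sh
# Sketch: a full rsync run followed by an incremental one.
# All paths are hypothetical and live under a temp directory.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/backup"
echo "one" > "$work/src/a.txt"
echo "two" > "$work/src/b.txt"

changed=""
if command -v rsync >/dev/null 2>&1; then
    # First run: everything is copied; -a preserves modification times.
    rsync -a "$work/src/" "$work/backup/"

    # Modify one file and give it a clearly newer timestamp.
    echo "two, edited" > "$work/src/b.txt"
    touch -t 203001010000 "$work/src/b.txt"

    # Second run: only files whose size or modification time differ
    # are transferred; --itemize-changes lists them.
    changed=$(rsync -a --itemize-changes "$work/src/" "$work/backup/")
    echo "$changed"
else
    echo "rsync is not installed; skipping the demonstration"
fi
```

The second run should list b.txt but not a.txt, since a.txt still has the same size and modification time in both places.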
It certainly is confused. First of all, cp -a already implies -pr, and the verbose -v slows things down. You will never read its output anyway; you want a log file for that.
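If the listing is wanted at all, one way to keep it out of the terminal is to redirect it to a file. A minimal sketch, assuming GNU cp and using a made-up log name backup.log in a scratch directory:

```shell
#!/bin/sh
# Sketch: send cp's verbose listing and any errors to a log file.
# Paths and the log file name are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/backup"
echo "data" > "$work/src/file.txt"
log="$work/backup.log"

# stdout (the -v listing) and stderr (errors) both go to the log.
cp -a -u -v "$work/src/." "$work/backup/" > "$log" 2>&1

# Afterwards the log can be inspected or searched as needed.
wc -l < "$log"
```

Dropping -v and redirecting only stderr (2> "$log") would keep just the error messages, which may be all that is needed.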
Once again I invite you to adopt rsnapshot instead;
you can find plenty of information and use cases.
For anyone familiar with Time Machine on OS X, it is similar, only better, and does not need a graphical interface. If you want one, consider backintime https://backintime.readthedocs.io/en/latest/
or deja-dup
Of course there are many more backup solutions that are widely adopted and tested.
There is really no reason to reinvent the wheel, poorly.