Gnome Backup aka Déjà Dup is sooo slow

Hello!

I am using Gnome’s backup tool, Déjà Dup, which is a front-end to duplicity, to back up my home directory (~). What I like about it: the backup is encrypted, incremental, and automatic.

In theory this works, but it is horribly slow - it takes hours, sometimes half a day, to back up the 500 GB of data in my home dir.

When I do a plain copy of a large file from my machine to the backup server on my local LAN, I get around 110-115 MB/s, which I consider normal.

I understand that duplicity does a lot more than just copying files over the network (scanning, indexing, GPG encryption, …), but it isn’t even really taxing my CPU; there seems to be plenty of CPU time available. Is this simply a limitation of Déjà Dup or duplicity? Or are there tweaks or settings I can adjust to make it perform better?

I have tried the NFS and SFTP protocols so far; that didn’t make a difference, and I really doubt the network is the bottleneck here.

What could be done?
What do you use for backups?

edit.
I can see in my system monitor that it takes about 10 seconds to transfer each of the 200 MB volxxx.difftar.gpg files that Déjà Dup creates. That is about 25 MB/s, so a lot slower than a plain copy over NFS.

edit2.
Here is a chart from copying one of the 200 MB files back from the server to my machine over NFS in Nautilus. It’s a matter of 2-3 seconds, reaching roughly the maximum gigabit Ethernet speed.


It’s worth trying other backup solutions and protocols.
That should at least help you find the bottleneck.
Personally, I’m happy using rsync over SFTP.


What is the performance info on the server when duplicity is running?

If lots of small files need to be read, that slows things down compared to large files.


I think compression makes a big difference and Deja Dup uses compression.

It’s like running rsync versus rsync --compress.

Can Déjà Dup use more resources to speed up compression? I don’t know; it may be worth asking the author.

Server performance is a good point. However, as the OP said, Déjà Dup transfers chunks of 200 MB (.difftar.gpg files); many small files are more often an issue with rsync.


I think that’s what I will do in the future. Do you use -z for compression? Does it make things slow? Or do you transfer without compression?

Good question; I don’t really know. I can repeat the backup later and check.

That is my impression as well… compression and encryption

I am wondering whether it takes that long only for the initial backup, or for the incremental backups as well. Do you have large files that change often?

I am using Déjà Dup backups as well, with my Google account as the backup location, but obviously for a smaller dataset.


Very good points! I have a few VMs with virtual disks between 15 and 60 GB, so even a tiny change in a VM means the entire file is compressed, encrypted, and transferred.

To avoid this, I have added the storage locations for the virtual disks to the exclude list and decided to back those up manually.
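For anyone wanting to script this: Déjà Dup keeps its settings in GSettings, so the exclude list can be inspected and changed from the command line. The schema and key names below are assumptions based on Déjà Dup’s GSettings-based configuration; check `gsettings list-keys` on your system first.

```shell
# Assumed schema/key: org.gnome.DejaDup, exclude-list.
# Show the current exclusions:
gsettings get org.gnome.DejaDup exclude-list

# Add VM image directories to the exclusions (illustrative paths):
gsettings set org.gnome.DejaDup exclude-list \
  "['$HOME/.cache', '$HOME/VirtualBox VMs', '$HOME/.local/share/libvirt/images']"
```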

I will do another backup tonight and see if that second run is faster (I started fresh with a full backup yesterday).

Looking forward to your results once you take those VMs out of the backup.

Couldn’t wait until tonight and just did a little test, with ~/VirtualBox VMs and .local/share/libvirt/images excluded from the backup. Went to drink a coffee, and when I came back, maybe 10 minutes later, the backup was done.

So, in essence, the issue is the large VM images that are transferred in their entirety every time…


With my workflow and incremental backups, it usually takes a few seconds to transfer the changed and new files, so I don’t really care about compression.

Moreover, there’s basically no point using compression on images, audio, video, etc., as that would just waste time and CPU resources.


Is the 200 MiB chunk staged completely on the client, or written bit by bit as Déjà Dup runs? That will make a big difference to performance.

I’m honestly not very familiar with Déjà Dup; I based my statement about 200 MB chunks on the upload characteristics in the OP’s graph. It looks like 1. compression + encryption, then 2. upload, with no network traffic in between.

I could be wrong - just guessing.

Sorry for the ugly post; this is from my phone while cooking.


I wrote myself an rsync script for incremental backups. This way I can control what’s happening, rather than relying on Déjà Dup, which hides everything under the hood and doesn’t allow modifying any settings.
Thanks, everyone, for your input.


I use duplicity directly and that allows for lots of settings as command line options. In my case the backup is run as a systemd service.


I started “experimenting” with Deja Dup maybe a year ago. I didn’t back up the Flatpak config, and I had trouble restoring a backup. Since I also do an rsync backup, it wasn’t an issue.

Would you mind sharing your duplicity command? I am happy to do the same for my rsync in case anyone is interested.

By the way, my rsync backup operates at only about 25 MB/s. That is less than a quarter of the throughput I get when transferring a large file from the server (see my first post), and I haven’t figured out why it is so much slower. CPU load on the server is about 0.6.

This is the service that I use to back up my user account over NFS to my Fedora file server.

I use remove-all-but-n-full to limit the number of full backups that are stored.
Then I do an incremental backup with incremental --full-if-older-than 7D, which forces a new full backup if the last one is more than 7 days old.

There is a timer unit that runs once an hour to do a backup.
If I’m busy working on a project I can get to old versions of work that are not yet in git for example.

$ systemctl cat backup-barry.timer
# /etc/systemd/system/backup-barry.timer
[Unit]
Description=backup-barry.timer

[Timer]
OnBootSec=15 minutes
OnUnitInactiveSec=1 hour

$ systemctl cat backup-barry.service
# /etc/systemd/system/backup-barry.service
[Unit]
Description=backup-barry.service

RequiresMountsFor=/shared/BackupWorthy

[Service]
Type=oneshot
TimeoutStartSec=0
ExecStartPre=/usr/bin/duplicity \
    remove-all-but-n-full \
    1 \
    --force \
    --name worthy-barry \
    file:///shared/BackupWorthy/backup-barry
ExecStart=/usr/bin/duplicity \
    incremental \
    --full-if-older-than 7D \
    --no-encryption \
    --name worthy-barry \
    --exclude /home/barry/.cache \
    --exclude /home/barry/.local/share/akonadi/file_db_data \
    --exclude /home/barry/.local/share/Steam \
    --exclude /home/barry/.ccache \
    --exclude /home/barry/tmpdir \
    --exclude /home/barry/rpmbuild.d \
    --exclude /home/barry/Downloads \
    --exclude /home/barry/MeetMe \
    --exclude /home/barry/SharedMeetMe \
    --exclude /home/barry/fender \
        /home/barry file:///shared/BackupWorthy/backup-barry

I also back up /etc so I have any config changes that I’ve not yet put into SVN (where I keep my machine config files).

$ systemctl cat backup-etc.service
# /etc/systemd/system/backup-etc.service
[Unit]
Description=backup-etc.service

RequiresMountsFor=/shared/BackupWorthy

[Service]
Type=oneshot
TimeoutStartSec=0
ExecStartPre=/usr/bin/duplicity \
    remove-all-but-n-full \
    2 \
    --name varric-etc \
    --force \
    file:///shared/BackupWorthy/backup-etc
ExecStart=/usr/bin/duplicity \
    incremental \
    --full-if-older-than 7D \
    --no-encryption \
    --name varric-etc \
        /etc file:///shared/BackupWorthy/backup-etc

Edit: And here is a sample of the journal log of an incremental backup:

2024-06-13T09:04:37+01:00 systemd[1]: Starting backup-barry.service...
2024-06-13T09:04:38+01:00 duplicity[4519]: Last full backup date: Sat Jun  8 17:52:20 2024
2024-06-13T09:04:38+01:00 duplicity[4519]: No old backup sets found, nothing deleted.
2024-06-13T09:04:38+01:00 duplicity[4541]: Local and Remote metadata are synchronized, no sync needed.
2024-06-13T09:04:38+01:00 duplicity[4541]: Last full backup date: Sat Jun  8 17:52:20 2024
2024-06-13T09:06:01+01:00 duplicity[4541]: --------------[ Backup Statistics ]--------------
2024-06-13T09:06:01+01:00 duplicity[4541]: StartTime 1718265878.66 (Thu Jun 13 09:04:38 2024)
2024-06-13T09:06:01+01:00 duplicity[4541]: EndTime 1718265958.77 (Thu Jun 13 09:05:58 2024)
2024-06-13T09:06:01+01:00 duplicity[4541]: ElapsedTime 80.11 (1 minute 20.11 seconds)
2024-06-13T09:06:01+01:00 duplicity[4541]: SourceFiles 515074
2024-06-13T09:06:01+01:00 duplicity[4541]: SourceFileSize 55080549731 (51.3 GB)
2024-06-13T09:06:01+01:00 duplicity[4541]: NewFiles 16
2024-06-13T09:06:01+01:00 duplicity[4541]: NewFileSize 562715 (550 KB)
2024-06-13T09:06:01+01:00 duplicity[4541]: DeletedFiles 2
2024-06-13T09:06:01+01:00 duplicity[4541]: ChangedFiles 55
2024-06-13T09:06:01+01:00 duplicity[4541]: ChangedFileSize 6717900299 (6.26 GB)
2024-06-13T09:06:01+01:00 duplicity[4541]: ChangedDeltaSize 0 (0 bytes)
2024-06-13T09:06:01+01:00 duplicity[4541]: DeltaEntries 73
2024-06-13T09:06:01+01:00 duplicity[4541]: RawDeltaSize 23029755 (22.0 MB)
2024-06-13T09:06:01+01:00 duplicity[4541]: TotalDestinationSizeChange 9840519 (9.38 MB)
2024-06-13T09:06:01+01:00 duplicity[4541]: Errors 0
2024-06-13T09:06:01+01:00 duplicity[4541]: -------------------------------------------------
2024-06-13T09:06:01+01:00 systemd[1]: backup-barry.service: Deactivated successfully.
2024-06-13T09:06:01+01:00 systemd[1]: Finished backup-barry.service.
2024-06-13T09:06:01+01:00 systemd[1]: backup-barry.service: Consumed 56.991s CPU time.

Because backing up VM images was mentioned…

I use Restic for backups. It does de-duplication even within files, so it works reasonably well for backing up VM images incrementally, rather than storing lots of copies of the entire image.

Both Restic and Kopia are kind of like Git for backups. They were implemented at around the same time and with similar design. I haven’t backed up VM images with Kopia, but I understand that it would work in the same way as Restic for this.
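For the curious, a Restic workflow is only a few commands. The repository path is a placeholder, and the password handling here is purely illustrative (use a proper secret store in practice); this fragment requires restic to be installed and a writable repository location.

```shell
#!/bin/sh
# Restic usage sketch. Restic splits files into content-defined chunks,
# so a small change inside a large VM image adds only the changed chunks.
export RESTIC_REPOSITORY=/srv/backups/restic-repo   # placeholder path
export RESTIC_PASSWORD=example-only                 # illustrative only

restic init                       # once, to create the repository
restic backup "$HOME/VM-images"   # repeat runs deduplicate against the repo
restic snapshots                  # list the stored snapshots
```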
