I found out that I can use rsync with its checksum option for this, but I don’t think Fedora does this by default when copying files. Can anyone confirm this? And if it does not check integrity, why not? Is there a setting somewhere that enables it?
Some file systems have checksums, like the default btrfs on desktop installations.
But to copy a file the data is read into memory and then written out.
Unless you have ECC RAM, the data can be corrupted while it is in memory.
Even rsync cannot detect this type of issue.
So it doesn’t compare the checksums of the source and destination after copying?
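For what it's worth, a manual "copy, then compare checksums" step could look roughly like the Python sketch below. This is only an illustration, not something Fedora's file manager or cp does; the helper names `sha256_of` and `copy_and_verify` are made up for this example.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then re-read both files and compare digests."""
    shutil.copyfile(src, dst)
    # Caveat: the re-read may be served from the page cache rather than
    # from the storage device, so a match is not proof of what is on disk.
    return sha256_of(src) == sha256_of(dst)
```

A `True` result only says that the two files read back identically at that moment; it cannot rule out corruption that happened in memory before the source was hashed.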
How are you copying the files?
Hello @sad-truant,
On Fedora systems that are workstation oriented, i.e. not a server installation, BTRFS is the default file system (see the BTRFS documentation). It is a journaling file system, so yes, data is verified, but it also uses Copy on Write (CoW) to copy data. This means that a copy initially only creates new file system metadata, that is, a new inode in the b-tree pointing to the existing data: a reflink copy, if you will. The data itself is not copied to a new location unless you specifically tell it to. This all applies to the cp command. If you are talking about rsync, a comparison with the underlying file system is not really appropriate, since rsync is not a file system but a standalone application for copying and transferring large amounts of data.
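To make the reflink idea a bit more concrete, here is a rough Python sketch for Linux. It is not what cp does internally; `reflink_or_copy` is a hypothetical helper, and the `FICLONE` constant is the ioctl request number from `linux/fs.h`. On a file system without reflink support the ioctl fails and the sketch falls back to an ordinary byte copy.

```python
import fcntl
import shutil

# FICLONE ioctl request number from <linux/fs.h>: _IOW(0x94, 9, int).
FICLONE = 0x40049409


def reflink_or_copy(src, dst):
    """Try a reflink clone of src to dst; fall back to a byte copy.

    Returns "reflink" or "copy" depending on which path was taken.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the file system to make dst share src's data extents.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            pass  # not supported here (e.g. ext4, tmpfs, different devices)
    shutil.copyfile(src, dst)
    return "copy"
```

When the reflink path is taken, no file data is read into user space at all, so there is nothing for a copy tool to checksum; the new file simply points at the same blocks.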
Not unless cp knows how to tell the file system to do the copy.
I do not know if it does or not.
Btrfs does not use journaling for filesystem consistency verification, but instead utilizes “COW” (Copy-On-Write).
Data and metadata are checksummed by default, the checksum is calculated before write and verified after reading the blocks from devices.
Data checksumming is one of the reasons why Fedora has switched to Btrfs by default.
Storage devices can be flaky, resulting in data corruption.
Everything is checksummed and verified on every read. Corrupt data results in EIO (input/output error), instead of resulting in application confusion, and isn't replicated into backups and archives.
https://fedoraproject.org/wiki/Changes/BtrfsByDefault#Benefit_to_Fedora
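As a toy illustration of "checksum on write, verify on read, EIO on mismatch", here is a small in-memory Python sketch. It is in no way Btrfs code: `ChecksummedStore` is a hypothetical class, and `zlib.crc32` (plain CRC-32) stands in for the crc32c that Btrfs uses by default.

```python
import errno
import zlib


class ChecksummedStore:
    """Toy in-memory block store: checksum on write, verify on read."""

    def __init__(self):
        self.blocks = {}     # block number -> bytes
        self.checksums = {}  # block number -> CRC-32 of those bytes

    def write_block(self, number, data):
        # The checksum is calculated before the data is "written".
        self.checksums[number] = zlib.crc32(data)
        self.blocks[number] = bytes(data)

    def read_block(self, number):
        data = self.blocks[number]
        # The checksum is verified after the data is "read" back.
        if zlib.crc32(data) != self.checksums[number]:
            raise OSError(errno.EIO, "checksum mismatch on block %d" % number)
        return data
```

Note that this kind of check only protects data from the moment the checksum is computed. If the bytes were already damaged in RAM before `write_block` saw them, the checksum will happily match the damaged data.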
It most certainly does keep metadata to verify against the data; the CoW function is for integrity of writes, not data verification.
rsync does checksum the original and the destination, AFAIK.
cp does no verification that I am aware of.
I could be wrong, but it should be like this:
Filesystem changes are handled in atomic transactions. If a transaction fails halfway through, Btrfs can revert the filesystem to its previous state without leaving it in an inconsistent state.
Therefore, filesystem consistency in Btrfs is ensured through techniques such as copy-on-write, checksums, and atomic transactions, rather than through a traditional journaling system.
That’s the function of writing a data change, but a journal or a database is required to keep track of the locations (inodes), the checksums, etc.
Verification is not what I’m talking about.
For example, there is the splice() system call, which could be used instead of a read()/write() loop. But would that help?
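To show the difference between the two approaches, here is a hedged Python sketch. It does not use splice() itself; it swaps in `os.copy_file_range` (Python 3.8+, Linux), a related call that also lets the kernel move the data without a user-space buffer. Both function names are made up for this example, and the second one falls back to the plain loop if the call is unavailable or fails.

```python
import os


def copy_read_write(src, dst, chunk_size=1024 * 1024):
    """Classic loop: every byte passes through a user-space buffer."""
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        while True:
            chunk = fsrc.read(chunk_size)
            if not chunk:
                break
            fdst.write(chunk)


def copy_in_kernel(src, dst, chunk_size=1024 * 1024):
    """Ask the kernel to copy between descriptors; fall back on failure."""
    try:
        with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
            while True:
                copied = os.copy_file_range(
                    fsrc.fileno(), fdst.fileno(), chunk_size
                )
                if copied == 0:  # end of the source file
                    break
    except (AttributeError, OSError):
        # Call missing on this platform or refused by the kernel.
        copy_read_write(src, dst, chunk_size)
```

Neither variant verifies anything. The in-kernel path only changes where the bytes travel, not whether anyone compares source and destination afterwards.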

Even rsync cannot detect this type of issue.
I’m not very well versed in this field, but from the rsync man page: “Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred”.

Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred
Does it read the file back off the filesystem after it is written and flushed to check it?
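For comparison, a "write, flush, then read back" check could be sketched in Python as below. This is an assumption-laden illustration, not a description of what rsync does; `write_and_read_back` is a hypothetical helper, and `os.posix_fadvise` with `POSIX_FADV_DONTNEED` is only a hint to the kernel to drop cached pages, not a guarantee that the read hits the device.

```python
import hashlib
import os


def write_and_read_back(path, data):
    """Write data, flush it to storage, then re-read and compare digests."""
    expected = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data to storage

    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            try:
                # Hint only: ask the kernel to drop cached pages for this file.
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # some file systems may not support the hint
        digest = hashlib.sha256()
        while True:
            chunk = os.read(fd, 1024 * 1024)
            if not chunk:
                break
            digest.update(chunk)
    finally:
        os.close(fd)
    return digest.hexdigest() == expected
```

The expected digest here is computed from the in-memory buffer, so the check covers the path from that buffer to storage and back, and nothing earlier.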

Does it read the file back off the filesystem after it is written
It does not need to. The checksum done at both ends confirms it is a valid copy

It does not need to. The checksum done at both ends confirms it is a valid copy
That is not true where there are intermittent hardware issues.
And that is exactly the reason why the checksums are needed.
Ctrl + C, Ctrl + V