I want filesystem-level snapshots for my SMR HDD, used mainly as a data backup and archival disk. Is BTRFS suitable for this job? Any alternatives?
I currently use rsync incremental snapshots (on ext4 partition), but BTRFS is more space efficient and covers the whole filesystem.
Edit: this is the HDD in question:
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue Mobile (SMR)
Device Model: WDC WD10SPZX-60Z10T0
Serial Number: WD-WXQ1AA8ENJ47
LU WWN Device Id: 5 0014ee 65e7e7f1b
Firmware Version: 04.01A04
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 2.5 inches
TRIM Command: Available, deterministic
Device is: In smartctl database 7.3/5319
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
BTRFS is relatively new in everyday use, and I am still observing. There have been cases where a file system filled up and, even after files were removed, the freed space could not be used again until a cleanup process (for example a balance) was run on the btrfs file system.
To me it still seems a little less reliable than ext4, though promising in the long run.
No, because it depends on who you ask.
Btrfs has been used in openSUSE and, for example, in Facebook’s data centers for quite a while. If you ask there you will certainly get a different answer than if you ask people who have lost data on Btrfs.
Personally I would never recommend that BTRFS be used on an SMR drive.
SMR stores data very differently from a conventional drive, and any potential faults of btrfs will only be exacerbated by the SMR technology.
I use only HDDs with CMR tech. They are higher priced, but speed and durability are greatly improved.
This shows that the data is in overlapping (shingled) layers, and writing to a lower layer requires removing the data from the overlapping layers, writing the lower layer of data then restoring the overlapping layers of data. It seems the top layer is written first, then lower layers added as needed.
Even with the most reliable of file systems, this type of writing takes an excessive amount of time, and each layer has a chance of data corruption as it is read and then rewritten. The more data on the drive, the more this type of writing is required. The more layers used, the less space available in each layer.
For example, if the tech uses 3 layers, then for a 1 TB drive there is only about 333 GB available in each layer. With 4 layers, each layer would hold only 1/4 of the total drive space.
Yes, that is supposed to be handled by the drive firmware, but even the manufacturers only recommend those drives for home use and sell drives with CMR tech for NAS or data center use.
Thanks for the info. I’m using the disk as a backup and the amount of data I have isn’t large. The disk has been in service for 3 years and its SMART status hasn’t reported any uncorrectable errors, so I’m probably fine using it.
Edit: I switched back to ext4, rsync incremental snapshots should be enough for my use case.
(Ironically, this was supposed to be a boot disk. I can’t imagine how slow something like Windows would be on such a disk, plus the extreme wear due to write amplification.)
Sorry for bumping an old question, I didn’t feel the need to create a new discussion for this follow up.
Recently I restored a backup from my disk and found that one file had become corrupted on the drive. Fortunately the file had a copy on offline media, so data loss was avoided. However, seeing that it isn’t unlikely for my data to be corrupted while sitting on this disk, I’m reconsidering using BTRFS without Copy on Write (to avoid issues with the firmware-managed SMR), because BTRFS has checksums and can scrub the drive to detect corruption.
Is it a good idea? Or should I just tell rsync to verify checksums of the destination as the criterion for hard linking?
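For reference, corruption like this can also be caught on ext4 without BTRFS by keeping a checksum manifest next to each snapshot and re-checking it periodically, which is roughly what a scrub does at the file level. Below is a minimal sketch in Python, not a tested backup tool. The function names and the JSON manifest format are made up for illustration, and it assumes the snapshot is an ordinary directory tree of regular files.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large backups do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its SHA-256."""
    root = Path(root)
    return {
        str(path.relative_to(root)): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file() and not path.is_symlink()
    }


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or whose hash changed."""
    root = Path(root)
    bad = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or file_sha256(target) != expected:
            bad.append(rel_path)
    return sorted(bad)


def save_manifest(manifest, manifest_file):
    """Store the manifest as JSON (hypothetical format, chosen for readability)."""
    Path(manifest_file).write_text(json.dumps(manifest, indent=2, sort_keys=True))


def load_manifest(manifest_file):
    """Load a manifest previously written by save_manifest."""
    return json.loads(Path(manifest_file).read_text())
```

The idea would be to call build_manifest right after a snapshot is written, save the result, and later run verify_manifest against the same directory. Any path it returns has changed or disappeared since the backup was taken. Unlike a BTRFS scrub this only detects damage, so a second copy is still needed to repair anything.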
COW should not really have any negative impact in your case. Using rsync with either Btrfs (COW or nocow) or another file system will likely cause more I/O, as rsync needs to checksum the file again, while a Btrfs-aware backup program will reuse the existing checksum.
Are you sure the disk is not going bad or, as suggested earlier, that it was not unmounted improperly?
This is the real issue I would look at. SMR (Shingled Magnetic Recording) seems to me the worst tech ever developed for HDDs. There are layers of recording on the drive, and once the drive has enough data for the layers of data to become shingled (overlapping) then every write requires reading the existing data, copying it out, writing the lower layer, then replacing what was over the new data position.
IMHO that creates many possibilities for data corruption as well as drastically slowing writes to the device. The slower writes are a fact of the design.
If one is worried about data corruption one really should consider replacing the SMR drive with a CMR (Conventional Magnetic Recording) drive that does not use the “shingled” data structure.
There is, of course, a difference in cost.
A WD 2 TB Blue drive (WD20EZAZ, SMR) is about $60 today.
A WD 2 TB Red drive (WD20EFZX, CMR) is about $75 today.
The Red is an NAS/Enterprise style drive and intended for continuous use while the Blue is intended purely for home use.
I personally would never consider using an SMR drive with the BTRFS file system, since it must do so many more reads and writes over time, particularly when it is more than 30 to 50% full.