I have high storage requirements and have installed two 12TB HDDs in my computer, with plans to add a third (likely 18TB) and then another when the budget allows. Currently all are mechanical drives formatted as ext4 (the OS SSD is btrfs, the Fedora default).
I’m looking for a solution that sensibly allocates space in the background when I move data to /mnt/backup. Setting up RAID is not a task I’m looking forward to, and as far as I know any drive failure generally results in total data loss. I want each directory and file to be stored whole on one mounted HDD, not spread around in parts.
It seems that mergerfs may be a good starting point, but I’m eager to hear about any alternatives.
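To make the requirement concrete: the behaviour asked for here (every file stored whole on exactly one drive, with the drive picked automatically) can be modelled as a "most free space" rule, which is roughly the idea behind mergerfs's `mfs` create policy. The following is only a simplified sketch of that idea, not mergerfs code; the mount points, sizes and the `place_files` helper are hypothetical.

```python
def place_files(free_tb, files):
    """Assign each file, whole, to the drive with the most free space.

    free_tb: dict mapping a drive name to its free space in TB.
    files:   list of (file name, size in TB) tuples, in arrival order.
    Returns a dict mapping each file name to the chosen drive.
    """
    free = dict(free_tb)  # work on a copy so the caller's dict is untouched
    placement = {}
    for name, size in files:
        drive = max(free, key=free.get)  # drive with the most free space
        if free[drive] < size:
            # Files are never split, so the largest gap must fit the file.
            raise OSError(f"no single drive can hold {name} ({size} TB)")
        free[drive] -= size
        placement[name] = drive
    return placement


# Hypothetical drives and data sets, sizes in TB.
drives = {"/mnt/disk1": 12, "/mnt/disk2": 12, "/mnt/disk3": 18}
files = [("dvd-rips", 5), ("blu-rays", 8), ("photos", 2)]
print(place_files(drives, files))
```

Because every file lives on one drive only, losing a drive in this model costs just the files that were placed on it; the other drives remain readable on their own.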
Using LVM, you can combine the storage of all three disks (physical volumes) into one volume group (VG), and have one or more logical volumes (LV) on which you can create a file system.
The issue is that if one drive fails, every LV that spans it is lost in its entirety. (Not really an issue, since you have backups.)
The big advantage of LVM is that it scales up and down very easily: you can add or remove storage as you want.
RAID is the opposite trade-off: you choose among different configurations (mirrored drives, parity drives and so on) that allow one or several disks to fail without loss of data.
For my use this sadly is a major drawback, since I don’t have full backups of my media storage. As I can recreate the data on any one drive with relative ease (mainly by re-ripping DVDs and Blu-rays), I decided not to limit my total capacity with local and off-site backups.
I would likely set up LVM otherwise. My OS drive is carefully backed up.
I personally prefer to use a combination of RAID 5 with LVM on top of that array. With RAID 5 you can lose one drive and physically replace it without any effect on the data it contained. Yes, you do lose the space of one drive in an array of 3 or more drives, but for data security it is much better than LVM by itself.
As stated above, LVM by itself, used as a plain linear volume, has no tolerance for failure, and a single drive failure wipes out the entire data store. OTOH, RAID 5 allows one drive failure without data loss.
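As a rough illustration of the space trade-off, here is the arithmetic for the drives mentioned in this thread (two 12TB plus a planned 18TB). It is a sketch under simplifying assumptions: raw sizes only, no filesystem or metadata overhead, and RAID 5 members limited to the size of the smallest drive. The `usable_capacity_tb` helper and the layout names are made up for this example.

```python
def usable_capacity_tb(sizes_tb, layout):
    """Usable raw capacity in TB for a list of drive sizes under a layout."""
    if layout in ("lvm-linear", "mergerfs"):
        # Both simply pool the drives; no space goes to redundancy.
        return sum(sizes_tb)
    if layout == "raid5":
        if len(sizes_tb) < 3:
            # Following the thread, assume an array of 3 or more drives.
            raise ValueError("this sketch assumes at least three drives")
        # Each member contributes only as much as the smallest drive,
        # and one member's worth of space is used for parity.
        return (len(sizes_tb) - 1) * min(sizes_tb)
    raise ValueError(f"unknown layout: {layout}")


drives = [12, 12, 18]  # TB
for layout in ("lvm-linear", "mergerfs", "raid5"):
    print(layout, usable_capacity_tb(drives, layout), "TB")
# lvm-linear 42 TB
# mergerfs 42 TB
# raid5 24 TB
```

With mixed sizes the cost of RAID 5 is more than one drive: under these assumptions the extra 6TB on the 18TB drive would also go unused inside the array.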
What Linux RAID does, with mdadm managing the array, is create a virtual drive (a RAID array that looks like a single drive to the system) that the user can use exactly as if it were one drive, no matter how many physical drives make up the array, including partitioning, formatting, mounting, etc. Once it is configured, the only thing the user needs to do is periodically monitor it so that a failed drive can be noticed and replaced if needed. Very simple and easy.
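The periodic monitoring can be as small as checking the status text the kernel exposes in /proc/mdstat. Below is a minimal sketch that looks for the `[UU_]`-style member map, where an underscore marks a failed or missing member. The sample text is illustrative rather than taken from a real machine, and `degraded_arrays` is a hypothetical helper name.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose member map shows a missing drive.

    Each array's status line in /proc/mdstat ends with something like
    "[3/3] [UUU]"; an underscore in the second bracket marks a member
    that has failed or is missing.
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)  # e.g. "md0"
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current and "_" in status.group(3):
            degraded.append(current)
    return degraded


# Illustrative sample of a three-drive RAID 5 with one member missing.
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[1] sdb1[0]
      23437503488 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""
print(degraded_arrays(SAMPLE))  # ['md0']
```

On a real system the text would come from reading /proc/mdstat, for example from a cron job or systemd timer that sends a notification when the returned list is not empty. mdadm also has its own `--monitor` mode for this purpose.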