There are still lots of folks running rotational magnetic disk drives. On such systems running btrfs, regular disk maintenance is required to avoid performance degradation and other problems. There are packages associated with btrfs to accomplish this maintenance, but they are all command-line tools. I’m sure people would appreciate some help with which ones to run, what parameters to use, and how to incorporate them into a script they could schedule to run periodically. I understand that the move to btrfs is still a proposal and may not be implemented, but just in case: this is the kind of change that could cause lots of gnashing of teeth if there isn’t a nice way to handle the maintenance. Sorry, my lack of background with btrfs does not allow me to write such an article (or articles) myself.
Any volunteers to write the article?
I don’t especially want to write an article, but I will say what I do, based on advice I have gleaned from the btrfs mailing list, including one post by a btrfs developer specifically about recommended maintenance. I do all of this on a weekly basis.
- btrfs fi defragment -r (on each btrfs filesystem)
- btrfs balance start -v -dusage=85 (on each btrfs filesystem)
- btrfs scrub start (on each btrfs filesystem)
Notes: It was specifically cautioned not to use -musage in the balance, saying that can make things worse instead of better. The three steps should run sequentially, not concurrently. Many people will say not to defragment btrfs filesystems on SSDs; I do it, anyway. Finally, 85 should be the minimum -dusage; it was stated that doing it with smaller percentages does not help much.
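As a rough illustration, the three weekly steps above could be wrapped in a small script like the one below. This is only a sketch under some assumptions: the `run` and `maintain` helpers and the `DRY_RUN` variable are made up for this example, the mount points are whatever you pass on the command line, and `-B` on the scrub is my addition so that scrub stays in the foreground and the steps really do run one after another. `btrfs filesystem` is the long form of the `btrfs fi` used above. Note that later replies in this thread argue against routine defragmenting and balancing, so treat this as one poster's routine rather than a recommendation.

```shell
#!/bin/sh
# Weekly btrfs maintenance sketch: defragment, balance, scrub, in that order.
# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: run the three steps sequentially on one filesystem.
maintain() {
    mnt="$1"
    run btrfs filesystem defragment -r "$mnt" || return 1
    # Only -dusage is used; -musage is deliberately left out.
    run btrfs balance start -v -dusage=85 "$mnt" || return 1
    # -B keeps the scrub in the foreground so nothing runs concurrently.
    run btrfs scrub start -B "$mnt" || return 1
}

# Each argument is the mount point of one btrfs filesystem.
for mnt in "$@"; do
    maintain "$mnt"
done
```

Saved as, say, `btrfs-weekly.sh` (a hypothetical name), it could be called with each mount point as an argument from a weekly cron job or systemd timer.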
The fact that we need to perform some sort of file system maintenance on a typical Linux workstation/server seems like a huge regression from the point of view of a user, administrator, or developer, considering that this is 2020, not 1995.
So, if someone is going to write this article, I hope they clarify some points:
- What are the pros and cons of using Btrfs compared to Ext4/XFS+LVM?
- Is Btrfs stable enough to utilize on a typical Fedora Workstation/Server?
- If Btrfs is stable, why does it require some sort of maintenance?
- If Btrfs actually requires the maintenance, why is it not yet implemented as an automated systemd task and enabled by default?
Defragmentation is discouraged for SSDs since the excessive I/O shortens the lifespan of the drive.
Scrub is OK to do; it just reads everything on the drive and compares it to checksums. Verification with checksums is done passively every time a file is read. A scrub isn’t something folks need to do for a single-device Btrfs, but if you want to do it, it’s OK. But frequent defragmentation and balancing are something of a legacy artifact of bugs that people shouldn’t be running into anymore. And if they are, the bugs need to be fixed rather than papered over.
seems like a huge regression
I think that’s a reasonable position to take. As it turns out, all file systems have aging effects, and I don’t think users should have to be bothered with these things. While I’m using Btrfs on everything from a Raspberry Pi Zero to a newish laptop with NVMe, I don’t have any computers with sysroot on HDD anymore, just data drives. I do not balance them, ever. I do scrub them periodically, which verifies file system metadata and data against checksums. This verification also happens passively in normal use.
The benefits section of the Btrfs-by-default change proposal hopefully answers most questions.
From the devel@ list discussions with Facebook folks using it in production on millions of systems, and many millions of instances, yes it’s stable. Also, (open)SUSE has been using it by default for ~6 years. I think the biggest challenge is changing expectations in case of the rare hardware failure or firmware bug - the usual course of action is to go straight to fsck. But on Btrfs it’s to try and take advantage of its ability to tolerate read-only mount despite problems, get important data out or freshen backups; and then attempt to repair. Btrfs has more opportunities to recover data, and that adds complexity. Some of that needs to be automated and made simpler.
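To make the "get data out first, repair later" order concrete, here is a minimal sketch. It is an illustration, not an official recovery procedure: the `run` and `rescue` helpers, the `DRY_RUN` variable, and the three arguments (device, mount point, backup directory) are all assumptions for this example. It uses only a plain read-only mount, a copy, and a read-only `btrfs check`; which repair step to attempt afterwards depends on the actual problem and is not shown.

```shell
#!/bin/sh
# Sketch of "rescue data first, repair later" for a damaged btrfs filesystem.
# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: rescue DEVICE MOUNTPOINT BACKUP_DIR
rescue() {
    dev="$1"
    mnt="$2"
    backup="$3"
    # 1. Mount read-only so nothing is written to the damaged filesystem.
    run mount -o ro "$dev" "$mnt" || return 1
    # 2. Copy out whatever is readable; keep going if some files fail.
    run cp -a "$mnt/." "$backup/" || echo "some files could not be copied" >&2
    run umount "$mnt" || return 1
    # 3. Only now inspect the unmounted device; --readonly changes nothing.
    run btrfs check --readonly "$dev"
}

# Runs only when exactly three arguments are given.
if [ "$#" -eq 3 ]; then
    rescue "$1" "$2" "$3"
fi
```

The point of the ordering is that nothing writes to the device until the important data is safely somewhere else.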
There is an (open)SUSE-specific maintenance package, as you describe, and it is regularly scheduled behind the scenes. My preference is that these problems just get fixed in the kernel, in the file system itself, rather than papered over with timed tasks. But yeah, if it’s really necessary as some kind of stopgap solution, that’s better than asking users to start babysitting their file system, which is not reasonable.
I think this is a relevant topic that is also timely. +1 from me, and no, I am not able to write this article. Perhaps @cmurf knows of someone who could.