BTRFS - I surrender! Good features BUT Complex, wastes time, not KISS, does not play well

I would first state that I respect your opinion to prefer or use a different filesystem and I would acknowledge that btrfs isn’t the best choice for every use case.

I would say that after reading through everything it looks to me like the majority of your issues are self-inflicted. Btrfs in Fedora is deployed in a fairly simple way by default. It doesn’t require any advanced knowledge or special skills to use.

However, if you have the time and willingness to learn about it, it has many advanced features that other filesystems don’t have. Of course, you don’t need to use or learn about them and can generally use it as a simple filesystem. I suspect that many Fedora users don’t even know or care that they are using btrfs.

On top of that, the current state of timeshift is fairly terrible. It has been on life support for a while now and has many issues. The good news is that there is recent work by new maintainers so hopefully it will progress. That being said, I know many people using Timeshift with btrfs without any issues.

To be clear, I am not trying to convince you to use it. If you prefer a different filesystem you should use what you like. I just think that some of the statements in this topic are a little over the top and wanted to share my perspective.

Never said raid doesn’t matter. Why do you think I am doing backups anyway? Of course SSDs fail doooh!

But when you look at the data, SSDs in normal use are more reliable until they reach an age of around 5 years (it’s not just writes, the semiconductor ages). Compared to the 4 or 5 HDD RAID I used to use, with its cost and electricity, an SSD with a single backup SSD is cheaper and should be as reliable.

BTRFS did interest me re its better checksums to compensate for no raid. But alas it didn’t work out.

Re timeshift - works perfectly, super KISS, on HDD. Didn’t on BTRFS in the same configuration.

People, stop judging me. I have been an engineer for 40 years. Grew up from punched cards and PDP11 & VAXs to High Performance computing including F18 flight simulators, electronic warfare, media rendering and editing, scientific computing and simulation, a scientific UV telescope flown aboard the space shuttle etc. I even invented & built a 2GHz computer in 1994 using GaAs bit slice technology from Vitesse for an Electron Pulse resonance system for a University. At the time the fastest PC chips ran at 60 - 100 MHz.

After 40 years of experience and many fashions or computer fads I’ve seen a lot. E.g. cloud computing is a disguise of the central computing of the 70s era, with dumb terminals hanging off RS-232 lines replaced by the internet and RJ-45 plugging into now pretty downgraded PCs, with much of the smarts centrally located and charging 10x more when you capitalise the annual cost, while the central computer operators (sorry, data centres) have you by the balls, e.g. Google, Meta, Amazon, MS etc. The whole point of distributed computing in the 90s with the invention of the PC was to get away from that central control and reduce cost for consumers and small business. Gone full circle. Now for those wanting to jump down my throat on that OBSERVATION & OPINION: it is not an attack or critique of current technology, it is an observation that can see through the hype based on decades of experience.

Anyway, I digress.
Re BTRFS: my opinion, which I still stand by, my experience, my solutions for my use cases.

Doesn’t BTRFS require a RAID for checksums to be useful in error recovery? Otherwise it just detects errors.

I agree, it screwed my system multiple times last year on rsync+ext4 backups (I think it messed with SELinux), now using snapper+BTRFS for almost 1 year without issues.

No worries, no judging here. As for my reasons that made me want Fedora to use BTRFS as default … even the author of ext4 was saying it was time to use something more modern … a modern Copy on Write filesystem … open source … ease of resizing (actually, not even a problem I consider anymore) … snapshots!

Yes, of course.

I’m not sure, but detection is the most important part. btrfs refuses to return data that fails its checksum, so bitrot caused by storage failure cannot go unnoticed. That is the single most important feature of btrfs IMO.
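To illustrate the detection idea, here is a toy sketch in Python. It is not how btrfs is implemented: the `ChecksummedStore` class is made up for this example, and it uses `zlib.crc32` (plain CRC-32) as a stand-in for whatever checksum the filesystem actually uses. The point is only that a read fails loudly instead of handing back rotted data.

```python
import zlib


class ChecksumError(IOError):
    """Raised when a block no longer matches its stored checksum."""


class ChecksummedStore:
    """Toy block store that keeps a CRC next to every block (hypothetical)."""

    def __init__(self):
        self.blocks = {}     # block number -> bytes
        self.checksums = {}  # block number -> CRC-32 of the block as written

    def write(self, block_no, data):
        self.blocks[block_no] = bytes(data)
        self.checksums[block_no] = zlib.crc32(data)

    def read(self, block_no):
        data = self.blocks[block_no]
        # Verify on every read; refuse to return data that does not match.
        if zlib.crc32(data) != self.checksums[block_no]:
            raise ChecksumError(f"checksum mismatch in block {block_no}")
        return data


store = ChecksummedStore()
store.write(0, b"important data")

# Simulate bitrot: flip one bit behind the store's back.
rotted = bytearray(store.blocks[0])
rotted[0] ^= 0x01
store.blocks[0] = bytes(rotted)

try:
    store.read(0)
    detected = False
except ChecksumError:
    detected = True
```

With a single copy this only detects the problem; getting the data back still means going to a backup.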

That’s what interested me re BTRFS - bitrot detection. I know about the raid re checksum and the inherent repair. But I had limited budget and disks of different sizes with 2 being Samsung 830s. So for BUPs I wanted better checksum. After all SSD age has a big influence on errors. Those older SSDs have not been written much.

So btrfs appealed to me a few months ago. I was open minded. I wanted to give it a go. It’s the practice and time that turned me off. I just want a backup disk running borg and timeshift, not an entire lecture course on btrfs that I had to go through to join the 3 disks, manage them, integrate crypto, and the list goes on, to integrate it as a working system, and then timeshift didn’t like it. That’s where btrfs lost me. But I’ve kept it for the borg bups.

Note: I initially considered ZFS and compared it with BTRFS. There it seemed that BTRFS had the lower learning curve and ZFS was over the top for me. Definitely aimed at RAIDs. Great for a data centre, not so great for a small network, a professional workstation and some home computers.

Note also, re your comment that Fedora comes with BTRFS by default: my initial install was 2 years ago with F36. BTRFS was not there then, from my reading at the time. Besides, I was coming off CentOS 7. CentOS people are conservative. So I did the install with ext4, and XFS for folders holding larger files like multimedia.

Also I do scientific and media computing. Need top end processing performance. BTRFS is slower than ext4 & XFS re performance. So BTRFS best suits me for backup storage, where that performance is less important. Just store it reliably and tell me if there’s a problem.

Linux or Unix is often used in high end performance computing, which is often limited by I/O throttling (media rendering). BTRFS is anathema to that from what I’ve learned. Not a critique, just that XFS is better for that.

Re a more modern file system than ext4: I don’t disagree. But modern should not mean more complex, or so backwards incompatible that older programs such as timeshift or many custom-developed scripts fail. A change in filesystem should not force the users to change all their apps. E.g. the introduction of colour television ensured that sets could still deal with monochrome and other older communication standards of the time. It takes time to transition. Don’t force a timeline on the user because as a developer you want to move forward without regard for anyone else. As Linus Torvalds said (or something like it), don’t break the OS with your upgrades. The fact that Timeshift does not work with BTRFS as its backup target, yet works with ext4 or others, means the btrfs file management interface is breaking some apps.

A fs should never do that. BTRFS should work seamlessly with ls, du, df, fdisk, gparted, lsblk and all the others, and provide the same and correct information without having to resort to btrfs fi … etc. Once you have to resort to that, the interface with older apps gets broken. That’s what creates the complexity and makes a mess of KISS.

I think zfs can be a good fit here too. I certainly use it this way.

However, zfs has a significant learning curve. Much higher than btrfs. zfs is not at all the right choice unless you are willing to spend time learning and planning your filesystem up front. It definitely isn’t for everyone.

In my initial research I considered zfs vs btrfs. Same conclusion: zfs is even more involved than btrfs. Concluded zfs is for large complex multidisk storage arrays, i.e. data centres. Not worth my time. Hence I gave btrfs a try.

If I was using raid then I would definitely go btrfs over mdadm & LVM. Btrfs then would win hands down due to checksum, self-healing etc.
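As a rough sketch of why redundancy turns detection into repair: with two copies and a checksum, a reader can tell which copy is good and rewrite the bad one. The function name `read_with_repair` and the use of bytearrays as stand-ins for mirrored devices are assumptions for this example, and `zlib.crc32` again stands in for the real checksum; this is the general idea, not btrfs code.

```python
import zlib


def read_with_repair(copies, checksum):
    """Return the first copy matching `checksum` and repair the others.

    `copies` is a list of bytearrays standing in for mirrored devices.
    Raises IOError if no copy is intact (detection only, no repair possible).
    """
    good = next((bytes(c) for c in copies if zlib.crc32(c) == checksum), None)
    if good is None:
        raise IOError("all copies are corrupt; restore from backup")
    for c in copies:
        if zlib.crc32(c) != checksum:
            c[:] = good  # overwrite the bad mirror with the good data
    return good


original = b"important data"
checksum = zlib.crc32(original)
mirror_a = bytearray(original)
mirror_b = bytearray(original)
mirror_a[3] ^= 0xFF  # corrupt one mirror

data = read_with_repair([mirror_a, mirror_b], checksum)
```

A plain mirror without checksums only knows that the two copies differ, not which one is right, which is the advantage being described here.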

But for the main rootfs, or wherever you want performance and redundancy/reliability is secondary, I will stick to ext4 or xfs. That’s my conclusion anyway.

You should always have multiple copies of important data. RAID is good for increasing uptime (and for convenience), but is not sufficient. You really should have backups. In that case, a btrfs error tells you when you need those — or, when there’s a problem with the backup itself.

That was the plan. But the external ‘rugged’ hdd I chose from ADATA appears flawed. It can’t handle a stream of data beyond 300 GB (Capacity 1.8TB) and then seems to throttle from normal usb3 speed to < 100 kB/sec making it completely useless. So need to wait for a replacement. I suspect it’s inherent in the design.

Be wary of ADATA and its claims re its ‘rugged’ disks! They are cheap - for a reason?

Maybe it is a SMR HDD? Those can become very slow if not given enough idle time to organize their data.
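A rough way to check for this kind of slowdown is to log write throughput chunk by chunk and look for the point where it falls off a cliff. The sketch below is only an assumption of how one might do that in Python; `measure_write_throughput` is a made-up helper, and the chunk sizes are arbitrary. To reproduce what is described above you would have to point it at the drive and write far more data than the small defaults here.

```python
import os
import time


def measure_write_throughput(path, chunk_mib=64, chunks=16):
    """Write `chunks` chunks of `chunk_mib` MiB to `path`; return MiB/s per chunk."""
    payload = os.urandom(1024 * 1024)  # 1 MiB of incompressible data
    rates = []
    with open(path, "wb") as f:
        for _ in range(chunks):
            start = time.perf_counter()
            for _ in range(chunk_mib):
                f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the data to the device, not just the page cache
            elapsed = max(time.perf_counter() - start, 1e-9)
            rates.append(chunk_mib / elapsed)
    return rates


# Example (hypothetical mount point, adjust to your own drive):
# for i, rate in enumerate(measure_write_throughput("/mnt/backup/throughput.test")):
#     print(f"chunk {i}: {rate:.1f} MiB/s")
```

A steady rate that suddenly drops by orders of magnitude partway through a long write would be consistent with the SMR explanation, though it does not prove it.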

Oooh, thank you for that, I didn’t know there are different flavours of them now. That certainly could explain it, but the manufacturer never mentioned it and gave me an RMA number to return it when I complained. The spec sheets didn’t mention this either.

Per the wiki article there is a controversy, as manufacturers do not advertise it. Not a problem here in Oz, as the drive needs to be ‘fit for purpose’, and such a slow write speed makes backing up many 100s to 1000s of GB impossible.

Nasty!

I have one of those nasty drives as a secondary drive in my laptop. After relatively large data transfers even when disk activity LED stops and the disk is idle, I can still hear its actuator moving around exploiting the idle time to flush its buffers or whatever. Definitely will watch out to get a CMR for my future NAS.

I’ve been using btrfs for as long as it has been default without any perceived issues, but decided to go back to LVM + XFS for Fedora 38 to align the system closer to RHEL. Interesting I may have stumbled on a good choice.

One thing that started working notably better for me was GNOME Web. When I tried GNOME Web 44 on my F37 SB with btrfs it would frequently timeout when loading pages. As soon as I upgraded to F38 and made the filesystem switch it started working perfectly.

To be fair, I didn’t go back and completely isolate the issue, it may not be that at all, but I am certain GNOME Web was practically “unusable” before and then it was perfectly usable. It was a flatpak and the version didn’t change between installs.

Interesting, but I don’t think BTRFS has something to do with it. (Unless it was running out of free space?)

I don’t know where I surfed across it on the web, maybe Tom’s Hardware, but a recent article compared btrfs performance to xfs and ext4, and btrfs was slower than all the others. Latency measurements were particularly subpar compared to the others. Since my system uses ext4/XFS on rootfs I cannot comment on the issue you highlight, but btrfs is said to be slower in other comments, and on rootfs it may slow things down as you observe. A complete timeout as you observe is doubtful, but I don’t know.

Your experience, if others have it as well, would suggest a controlled comparison test is needed, comparing btrfs rootfs performance vs other fs in real world apps. Surely that’s been done somewhere before a decision was made to make it the default rootfs???

But subjectively your observations of faster rootfs performance re web pages when using xfs + LVM could make sense. Also there is definitely different behaviour by other apps when working with btrfs, as I noted re timeshift. The interface with older apps, i.e. regression, MAY not be 100% and so occasional gremlins will show up.

A classic developer attitude: our interface is more modern than the old one, so we’ll make it work, but change to ours because it’s better. In other words, let’s fix something that wasn’t broken, and guess what happens, per Linus Torvalds’ critique: you break linux.

FS are the cogs of linux. Be careful changing the meshing of the cogs or the friction of the OS increases, that would be my analogy.

Modern features are not the only measures that should be used in adopting default FS’s. It’s how well they work with the admin tools and general operation and use. To avoid friction, extensive regression testing should have been done. I don’t know the answer to that, but my minimal experience and observations to date raise ‘???’. Your performance observations add another small log to a small wet smoky fire. It confirms my decision to be conservative and not use btrfs for the rootfs (i.e. default) for a little while yet.

The implication: those who want performance, stick to ext4. Those with a multidisk rootfs who want reliability, fill your boots with btrfs, but maybe expect a few odd things happening with apps.

I went back and tried to reproduce the issue in Boxes and to be honest I couldn’t. It was far-fetched to begin with… even though Web didn’t change between installs the entire base OS did. So I apologize for even bringing that up.

Again, I didn’t necessarily have any issues with btrfs. But I was reading why Red Hat dropped it and what the latest was on whether they were ever going to bring it back, and it just doesn’t seem like it is even a thought right now. There was one particular Reddit post where a lead engineer basically said Red Hat has made a bet they can implement most of the features of btrfs with fewer bugs and vulnerabilities. Also it goes against the Unix philosophy as it is trying to pack a bunch of features into one solution (to be fair, so does systemd and it has proven invaluable).

The technical reasons given at the time for going btrfs by default definitely made sense. For me personally, I wanted to align my system a little more with Red Hat’s philosophy as I believe in them as stewards of software and want to help their cause moreso than Facebook, Oracle, etc. Ultimately though I am glad Fedora is working on btrfs and that people have options.

Interesting what you say re unix philosophy. RH is right, and that aligns with what I think from my dabbling with btrfs on my backup drive.

The BTRFS cogs (analogy of mine) are not quite right. This is causing potential ‘meshing problems’. RH would know more re the bugs, but I’ve seen hints of them re my timeshift experience.

This again aligns with what I believe: btrfs, per your statement, does not align with the unix philosophy (i.e. great new features, but with complexity and a different or new FS management interface). That is basically code for: don’t fix something that ain’t broke, or you break it or add interface friction.

I also see the sense in RH’s (IBM’s) view that ext4 with LVM can incorporate many of the btrfs features. Evolution re the cogs, not revolution. So I would certainly support an ext5 with LVM 3.0 improvements incorporating the features of btrfs: better checksums, self-healing with any raid/redundancy, and ease of inline raid management without unmounting.

Re the old chestnut argument that btrfs is better at dealing with a disk full situation: that raises the question, why did your disk get full without you realising it before it was too late? It usually means you don’t monitor your disks (I use a system monitoring widget on my desktop and I always know the state of my disks, temperature etc.). That btrfs argument is kind of like saying drive without monitoring the dials, leave it up to Tesla to auto drive and expect nothing will happen and it will fix itself in utopia as AI knows all.
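For anyone without a desktop widget, the same kind of check can be scripted. This is a minimal sketch using Python’s standard `shutil.disk_usage`; the helper names and the 90% threshold are arbitrary choices for the example. As far as I know it reports the same numbers df sees, which on btrfs can differ from what `btrfs filesystem usage` shows, so treat it as a rough early warning only.

```python
import shutil


def usage_percent(path="/"):
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_paths(paths, threshold=90.0):
    """Return (path, percent) pairs that are at or above `threshold`."""
    report = []
    for p in paths:
        pct = usage_percent(p)
        if pct >= threshold:
            report.append((p, pct))
    return report


if __name__ == "__main__":
    # Mount points to watch are an assumption; list your own here.
    for path, pct in check_paths(["/"], threshold=90.0):
        print(f"WARNING: {path} is {pct:.1f}% full")
```

Run from cron or a systemd timer, something like this gives a warning well before the disk is actually full.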

Those sorts of statements come from people who haven’t experienced REAL life yet and have spent far too much time behind a computer screen.

This effort is Stratis. It is available in Fedora Server for those who are interested.

Interesting, I like the concept. EVOLUTION - I like. Thanks for the link. May consider trialling it in a virtual environment.

Curious about how the extra layers of stratis impact the performance of the existing ‘lower layers’, i.e. it would be interesting to know how the performance of XFS is impacted.

Will they do this re ext4?

Big move forward if after the pools setups and encryption, the normal fs management commands work as normal. Then all existing apps and scripts out there will be happy and regression is addressed.

Website description did not mention any self-healing?