Fedora 41 lower performance with BTRFS than ZFS

Hi all,

I am new to Fedora and recently installed Fedora Workstation 40 (a few days before 41 came out).
My system has the following specs:
Gigabit internet, hardwired into the router. The router is a small HP PC running OPNsense with Intel NICs.
AMD Ryzen Pro 5650G 6-core CPU
64GB ECC DDR4 memory. I forget the speed but it is in the 2000s. I am aware Ryzen prefers faster memory, but I wanted ECC on a budget.
2x non-identical 14TB hard drives. Neither is SMR; one is WD and the other Seagate.
Multiple SATA SSDs, not relevant to this post
Samsung 980 Pro 1TB NVMe boot drive, installed using BTRFS
The server is used for homelab duties: VMs, Docker containers, and so on.
Currently the only VM I have is Home Assistant with 4GB of memory assigned, plus a few Docker containers. The system is mostly idle and is using 12GB of RAM; the rest of the memory is full of cached data.

So my experience is mainly Windows. Anything Linux I have used either Ubuntu or Proxmox.
I decided to try Fedora as I have heard good things, and so far I love it!
Due to my experience with Ubuntu and Proxmox, I am familiar with ZFS. When I set my machine up I installed OpenZFS, ignoring all warnings about not being able to update the kernel as easily, and configured the two 14TB drives in a striped ZFS pool.
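For reference, a striped ZFS pool is just a pool with multiple top-level vdevs and no redundancy keyword. A minimal sketch, assuming hypothetical device names (the pool name "tank" and the /dev paths are placeholders, not what I actually used):

```shell
# WARNING: destroys any existing data on the listed disks.
# Listing two disks with no "mirror"/"raidz" keyword stripes data across them.
zpool create tank /dev/sdb /dev/sdc

# Confirm the layout: both disks should appear as separate top-level vdevs.
zpool status tank
```

Note this gives no redundancy at all; losing either disk loses the whole pool.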

After setting this pool up I started reading into BTRFS and thought I would give it a try. I wiped both 14TB drives and set up a BTRFS RAID 0 pool. This was relatively straightforward once I found a good guide (like I said, I am a Windows guy trying to level up to Linux).
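Roughly, the setup looked like the sketch below (device names and mount point are placeholders; the exact profile flags are my assumption of a typical two-disk RAID0 setup, not necessarily what the guide used):

```shell
# WARNING: wipes both disks.
# -d raid0 stripes data across the disks; -m raid1 mirrors metadata,
# a common choice so one bad sector can't take out the filesystem tree.
mkfs.btrfs -f -d raid0 -m raid1 /dev/sdb /dev/sdc

# Mounting either member device mounts the whole multi-device filesystem.
mkdir -p /mnt/pool
mount /dev/sdb /mnt/pool

# Show how space is allocated across the data/metadata profiles.
btrfs filesystem usage /mnt/pool
```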
What I have noticed is that writing to the BTRFS pool is very bursty. While downloading and watching the system monitor, network usage constantly fluctuates between 10 MiB/s and 90 MiB/s, and my download client reports a rate of around 30 MiB/s. When these disks were in the ZFS pool, the download client sat at around 80 MiB/s.
Reading from the pool is actually just as good as ZFS, if not a little better with BTRFS.

I just wanted to check if this seems right? The system setup was exactly the same both times, I only changed the storage.

I am more than happy to provide logs, but as a newb I have no idea what logs are relevant!

Thanks :slight_smile:

Sounds like latency spikes. It’s hard to say where they’re coming from - not least because there are tens of thousands of changes to the kernel every cycle. And Fedora rebases the kernel to coincide with the most recent upstream stable kernel, so every few months we get a new kernel with a ton of changes.

For example, kernel 6.12 changes:

BPF is kind of a micro-VM built into the kernel, and can be used to provide introspection into kernel operation as it’s running. It’s been around for a while, but likewise has been changing.

This is an older article that talks about it at a high level.

https://lwn.net/Articles/599755/

One of the things you can use BPF for is latency tracing. Three of the tools I use for this: btrfsslower, fileslower, and biolatency. This can help figure out if the latency is file system induced, VFS induced, or actually in the block layer (or device itself). It is possible some file systems mask hardware latencies, so just because some spike isn’t happening with one file system doesn’t mean the problem is induced by another file system. These layers all interact.

bcc-tools is packaged in Fedora; simply dnf install bcc-tools.
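A quick sketch of getting started with the three tools mentioned above (Fedora installs them under /usr/share/bcc/tools; they need root):

```shell
# Install the BCC tracing tools.
sudo dnf install bcc-tools

# Trace btrfs reads/writes/opens/fsyncs slower than 10 ms,
# printing the process, latency, and file involved.
sudo /usr/share/bcc/tools/btrfsslower 10

# Same idea one layer up, at the VFS level.
sudo /usr/share/bcc/tools/fileslower 10

# Histogram of block-layer I/O latency, printed every 5 seconds.
sudo /usr/share/bcc/tools/biolatency 5
```

Running these while a download is in progress should show whether the stalls line up with slow btrfs operations, slow VFS operations, or raw device latency.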

I’m uncertain if upstream Btrfs folks will take much interest in A vs B performance comparisons with ZFS, but if you have a system showing a significant A vs B difference between Btrfs and XFS (or ext4), they will take an interest.

Upstream mailing list info:

X-Mailing-List: linux-btrfs@vger.kernel.org
List-Id: <linux-btrfs.vger.kernel.org>
List-Subscribe: mailto:linux-btrfs+subscribe@vger.kernel.org
List-Unsubscribe: mailto:linux-btrfs+unsubscribe@vger.kernel.org

It’s not necessary to subscribe; you can just email the list and read responses via the web interface (the linux-btrfs.vger.kernel.org archive mirror). The archives are also substantial, so it may be worth searching for recent latency issues to see if something looks familiar.

Hi Chris!

Thank you for your detailed response!
Quite a few things for me to read into there. I will do some investigation, but I feel this may be way above my level.

Performance has been fine; I have just learnt to deal with the burstiness. Performance is by no means poor overall, so no major issue.

Happy new year!

Kind regards,
Daniel

How did you create the RAID0 in the first place? Describe exactly what you did step by step.

Yeah, the bcc-tools stuff could be a bit of a rabbit hole. But a ton of work has been done by that project to make it pretty easy to get started and actually get a good idea of whether and where latency spikes are happening. There are quite a few more tools than the ones I’ve mentioned.

ZFS is almost completely self-contained and implements its own caching, whereas the Linux file systems depend on the kernel for caching and delayed allocation, and are subject to a pile of performance tuning knobs that ZFS is likely mostly immune to.

There’s quite a lot written up by Brendan Gregg on the subject of latency and using BPF to investigate it. It gives some ideas about the logic and strategy rather than specific tool usage, and the bcc-tools have evolved since. But for your issue it may turn out BPF isn’t the best tool for getting an idea of what’s going on, and he discusses other tools to help narrow things down and figure out what’s happening.

On third thought, I wonder if a heatmap might reveal something straight away. These will get you started on that.