Btrfs "discard storm" on Fedora?

I found the following discussion on the Arch subreddit over the weekend, and I’m wondering whether it also applies to Fedora, or whether it’s “mitigated” by the fstab defaults for btrfs.

I do see events when I run sudo perf ftrace -a -T 'nvme_setup_discard', so I’m wondering whether something needs to be done about it.

You generally don’t want discard enabled. By default, Fedora enables an fstrim timer that runs weekly on SSDs. Having discard enabled means the drive is constantly being trimmed, which could both impact disk I/O and shorten the life of the SSD. You shouldn’t ever need to set this up manually on Fedora.

I understand that, hence the post. I’ve already added the nodiscard option to my btrfs mounts in fstab, but as of kernel 6.2+ on Fedora we now have both the trim timer and this async discard running. That leads to the question: how should this be addressed more broadly in Fedora?

As stated, Fedora uses fstrim with a timer so discard should not be necessary.

As I understand the discussion, the concern is that the upstream default changed from nodiscard to discard=async, and therefore if people haven’t explicitly put nodiscard in fstab, it will (unexpectedly, to most people) be enabled. Is that correct?

And, in fact, I see on my F38 beta system: discard=async for btrfs mounts.
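For anyone who wants to check their own system, here is a minimal sketch that reads mount data in /proc/mounts format and reports the discard-related option for each btrfs mount. The function name btrfs_discard_modes is made up for illustration, and it assumes the option appears as discard, discard=async, or nodiscard in the options field; a mount that shows none of these is reported as "not listed" rather than guessed at.

```python
def btrfs_discard_modes(mounts_text):
    """Map each btrfs mount point to the discard option shown for it.

    mounts_text is expected to be in /proc/mounts format:
    device, mount point, filesystem type, options, dump, pass.
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not btrfs.
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        mount_point = fields[1]
        mode = "not listed"
        for opt in fields[3].split(","):
            if opt in ("discard", "nodiscard") or opt.startswith("discard="):
                mode = opt
        modes[mount_point] = mode
    return modes
```

To try it on a live system, pass it the contents of /proc/mounts, for example btrfs_discard_modes(open("/proc/mounts").read()).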

Oh wow, yeah, that is different from what I expected. I see discard=async on my Fedora 37 system now as well.

I just added nodiscard to the mount options for / and /home in /etc/fstab and rebooted, and I can confirm that this appears to restore the expected behavior. It’s definitely redundant to have fstrim.timer and discard=async both enabled by default, and as far as I understand it, fstrim.timer is better for overall I/O and the life of the SSD.
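The fstab change described above can be sketched roughly like this. It is an illustration, not a vetted tool: the function name add_nodiscard is hypothetical, it only transforms text and never writes to /etc/fstab, and it assumes each entry has the usual whitespace-separated fields with options in the fourth one. Entries that already carry an explicit discard or nodiscard option are left alone. Note that a modified line is re-joined with tabs, so its original spacing is not preserved.

```python
def add_nodiscard(fstab_text):
    """Return fstab text with nodiscard appended to btrfs entries
    that have no explicit discard-related option."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_entry = len(fields) >= 4 and not line.lstrip().startswith("#")
        if is_entry and fields[2] == "btrfs":
            options = fields[3].split(",")
            has_discard_opt = any(
                o in ("discard", "nodiscard") or o.startswith("discard=")
                for o in options
            )
            if not has_discard_opt:
                fields[3] += ",nodiscard"
                # Re-join with tabs; original column spacing is lost.
                line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"
```

If you try something like this, compare the result against the original file before replacing anything, and keep a backup of /etc/fstab.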

Many thanks for bringing this up, @agurenko !

No problem. Sorry, I probably should’ve been clearer from the beginning about what the actual issue is :)

@mattdm thanks for a good summary, that is correct

On a re-read, what you posted makes sense. I was honestly caught off guard because I thought I understood how this worked in Fedora and was surprised to find different behavior than I expected. Sounds like something we should definitely review. I wonder if it might be possible to simply patch the btrfs driver and add nodiscard back to the defaults? It seems that if discard=async is the way forward, it would be good for the switch to happen during a major release, with testing and communication around it, similar to how the fstrim method was introduced.

It’s also worth noting that the latest btrfs documentation still (apparently incorrectly) states that nodiscard is the default. That makes me wonder whether this upstream change was really intentional.

Ref. Administration — BTRFS documentation

I searched and couldn’t find a Bugzilla issue for it yet, so I created one. I’m not sure whether this should be considered a Fedora 38 blocker, but I went ahead and submitted it for consideration.

https://bugzilla.redhat.com/show_bug.cgi?id=2182228

I don’t know how or when this got inserted on my F37 Workstation, but I see that my /etc/fstab also has this setting. Of course I can change that, but can you clear up something else for me, please? Do I also need to have fstrim.timer explicitly included as an option? Or is this handled automagically?

Thanks!

The fstrim timer is enabled automagically, by design, in Fedora:

https://fedoraproject.org/wiki/Changes/EnableFSTrimTimer
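If you would rather verify that than take it on faith, a small sketch like the following asks systemd whether the timer is enabled. It assumes systemctl is available and that the unit is named fstrim.timer; the function name and the injectable run parameter are only there for illustration and to make the check easy to exercise without touching a real system.

```python
import subprocess


def fstrim_timer_enabled(run=subprocess.run):
    """Return True if `systemctl is-enabled fstrim.timer` reports "enabled"."""
    result = run(
        ["systemctl", "is-enabled", "fstrim.timer"],
        capture_output=True,
        text=True,
    )
    # is-enabled prints the unit's enablement state on stdout.
    return result.stdout.strip() == "enabled"
```

Any other state that systemctl reports (for example disabled or masked) makes the function return False.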

The discard=async btrfs default appears to have been added in the 6.2 kernel, which recently hit Fedora 36, 37, and 38 Beta.

Thanks!

Looks like discard=async might be less I/O intensive than the old “discard” option? It also looks like the btrfs team had planned to make this change in Fedora eventually, but because it shipped in kernel 6.2, there wasn’t a formal flip of the switch.

I feel less immediately concerned about it now, knowing that.

https://pagure.io/fedora-btrfs/project/issue/6

Also, the fstrim.timer will do nothing in the event discard=async has handled everything for you, so they can happily coexist.

While this is true (and it was mentioned in the discussion), the main concern with discard=async is actually the added load on the I/O subsystem, the longevity of consumer-grade hardware, and battery life on laptops. It would probably be a good idea to get a final technical statement on how Fedora wants to move forward with it.

Okay, my bad, I missed this comment before writing my previous post, but the impact on laptop battery life is still a little concerning.

I agree. It sounds like this feature has been in the works for years and is finally getting a release, but it’s coming out as a surprise rather than a triumph, so some kind of statement about why this change is a good thing and not a harmful one would be useful.
