Random System Crashes F40 KDE

It isn’t a memory leak, nor OOM. I’m using a 16GB MacBook Pro, and I’m well aware of my OOM issues; I was trying to manage them with things like Multi-gen LRU. The recent crashes are a different thing and not related to memory issues. I don’t know what causes them, but it might be related to graphics or something. If anybody still has F39, I suggest trying older kernels to see whether it is kernel related.

These crashes are much more ‘sudden’ and unexpected than OOM hangs. And as others said, the system automatically reboots.

I was replying to Vladimir, who brought up completely normal OOM behavior, which is not what this thread is about.

The crashes are also OOM-related, but they are not expected behavior. We already have a lead on the likely culprit. TL;DR: running out of kernel stack during memory allocation under low-memory conditions with zram, leading to a kernel panic.

Also interesting behaviour: if it crashes while playing media (which it just did), it loops about 1 1/2 seconds of audio repeatedly until it turns off.

Yeah, that’s also normal when you get a kernel panic (with modern audio frameworks that actually use big buffers for power efficiency).

Great, thanks!

BTW, I don’t know if it helps, but this is new behavior that started probably a month or two ago (I had OOM freezes before, but they surfaced differently).

This should be fixed in the latest kernel. Specifically, kernel panics (with the symptom that the machine freezes then automatically reboots, and in particular this happening when there is still swap space available) should no longer happen if the root cause is what we think it was.

If after updating the kernel to 6.8.10-400.asahi.fc40.aarch64+16k someone still experiences the kernel panic symptoms, please let us know. Note that any other symptoms (slowdowns, freezes with no reboot, etc.) are a different problem and unrelated to this bug.

Thanks, I’ll now try to stress the system more instead of less and will report on my experience.

Thanks Hector for your clear communication and fast actions! These kernel panics happened rather often, so I should be able to confirm quickly if it’s solved. Note, however, that I had kernel panics (followed by an automatic reboot) in only about 50% of the cases; in the other cases the machine froze/stalled and I had to reboot it myself.

The question is whether you were running out of RAM when the crashes without a reboot happened. If you were (full or near-full swap, or most of RAM taken up by shmem, which is non-pageable GPU memory), then this is a known misbehavior, but your system was going to start killing apps anyway, so no matter what, you are simply reaching the limits of your system.

If you still experience freezes like that and they happen with free RAM or swap (real swap, not just zram) then there is another problem.

I’ve now had 2 full workdays of working without a care in the world, trying to stress the system more than usual, and I didn’t have a single crash. Before that, I’d say I had 5-7 a day even when I was being quite careful.
So either that’s quite the coincidence, or it really helped a lot :smiley:

Thanks again!

I hear you, but it’s difficult to say: when the crash (a stall, really) happened, I couldn’t check because, well, the machine was stalled :slight_smile:

However, as reported by others, it seems fixed with the latest kernel! No crashes so far, and the crash frequency was high enough that it should have reproduced by now.

Thanks again to you and your team for the excellent work!
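Since a stalled machine can’t be inspected after the fact, one possible workaround is to log a few `/proc/meminfo` fields to disk every few seconds and read the last line after rebooting. The following is only a rough sketch: the function names, the chosen fields, and the log path are illustrative assumptions, not an Asahi tool, and under a hard lockup the logger itself may stop before the interesting moment.

```python
import os
import time

# Fields worth keeping an eye on: reclaimable RAM, remaining swap,
# and shared memory (which includes non-swappable GPU memory on Asahi).
FIELDS = ("MemAvailable", "SwapFree", "Shmem")


def snapshot_line(meminfo_text, timestamp):
    """Build one log line from /proc/meminfo-style text (values are in kB)."""
    wanted = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in FIELDS:
            wanted[key] = rest.split()[0]
    parts = [f"{name}={wanted.get(name, '?')}kB" for name in FIELDS]
    return f"{timestamp} " + " ".join(parts)


def log_snapshots(log_path, interval_s=5, count=None):
    """Append snapshots to log_path; count=None keeps going until interrupted."""
    written = 0
    with open(log_path, "a") as log:
        while count is None or written < count:
            with open("/proc/meminfo") as f:
                text = f.read()
            stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
            log.write(snapshot_line(text, stamp) + "\n")
            # Force the line to disk so it survives a freeze or panic.
            log.flush()
            os.fsync(log.fileno())
            written += 1
            if count is None or written < count:
                time.sleep(interval_s)
```

Calling something like `log_snapshots("/var/tmp/memlog.txt")` (a hypothetical path) in a spare terminal before stressing the system would leave a record showing whether RAM and swap were actually exhausted shortly before the stall.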

Sad to report that my system just crashed. I was stressing it a bit more on purpose, and Firefox had 16 tabs open, one of which was streaming YouTube. I could feel the machine getting slower, but no OOM kills happened, and at some point the whole machine stalled. I waited several minutes but it remained frozen, after which I rebooted it manually.

It’s a Mac mini M1 with 8 GB of RAM and 8 GB of swap. I was monitoring memory usage a bit, and it was using 90% of RAM and 80% of swap. I could of course increase swap further, but shouldn’t the kernel kick in and kill Firefox or some other application?

That is the known lockup-on-real-OOM issue (notice how the machine didn’t reboot on its own). That’s a different, known problem, and not a recent regression (we’ve always had it). You were actually running out of RAM, so something was going to get killed either way. The bug is that sometimes something goes wrong with memory reclaim and the system locks up instead of OOM-killing something. We’re not sure what’s up with that.

We’ll figure it out at some point, but we can’t conjure RAM up from thin air, so there is no way to make the running out of RAM experience pleasant (the best we can do is slowdowns and OOM killer as on any other system). That’s a usage/application-level problem. Specifically for Firefox, try turning on the browser.tabs.unloadOnLowMemory pref and that should help stop it from gobbling up too much RAM under low memory conditions.

Aside: do you have 8GB of real swap or just 8GB of zram? If you have an 8GB machine you should have 8GB of zram and 8GB of real swap (total 16GB of visible swap). You can check swapon -s for the details. If you don’t have a swapfile, that means you installed prior to the stable release, and you need to run sudo /usr/libexec/fedora-asahi-remix-scripts/setup-swap.sh to enable swap.
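To make the zram versus real swap distinction concrete, here is a small sketch that parses text in the format printed by swapon -s (the same format as `/proc/swaps`, with sizes in KiB) and reports how much of the visible swap is zram and how much is disk-backed. The function names and the sample values are illustrative assumptions, not output from a real machine.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts (sizes in KiB)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 5-column entry.
        if len(fields) != 5 or fields[0] == "Filename":
            continue
        name, kind, size, used, prio = fields
        entries.append({
            "name": name,
            "type": kind,
            "size_kib": int(size),
            "used_kib": int(used),
            "priority": int(prio),
        })
    return entries


def summarize(entries):
    """Split total swap into zram (compressed RAM) and disk-backed swap."""
    zram = sum(e["size_kib"] for e in entries if e["name"].startswith("/dev/zram"))
    disk = sum(e["size_kib"] for e in entries if not e["name"].startswith("/dev/zram"))
    return {"zram_kib": zram, "disk_kib": disk, "has_disk_swap": disk > 0}


# Made-up sample resembling an 8GB machine with both kinds of swap.
SAMPLE = """Filename Type Size Used Priority
/var/swap/swapfile file 8388604 0 -2
/dev/zram0 partition 7759872 4194304 100
"""

if __name__ == "__main__":
    print(summarize(parse_swaps(SAMPLE)))
```

On a real system you could pass in the contents of `/proc/swaps` instead of `SAMPLE`; if `has_disk_swap` comes back false, only zram is active and the setup script mentioned above has probably not been run.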

Right, so the kernel panics are definitely gone; the lockups are not, but as you said, you were already aware of those. Indeed, I installed prior to the stable release and must have missed the swap activation. Done now. Only now did I learn that zram is not the same as a swapfile, so it makes sense that my machine was running into OOM problems more quickly.

Will report back how it goes. Thanks again!

I don’t know if Asahi functions any differently, but even with a 900 GB swap partition I still had Plasma 6 OOM-killed (see the thread “OOM kills Plasma 6?”).

Note that for Asahi specifically GPU memory is not (yet) swappable (shown as shared memory in memory stats), so if you end up using a significant fraction of your RAM on that, it won’t matter how much swap you add, you’re still going to run out of RAM.
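As a rough way to see how much RAM is tied up this way, the sketch below reads the `Shmem` and `MemTotal` fields from `/proc/meminfo`-style text and computes their ratio. Treat the result as an upper bound rather than an exact GPU figure, since `Shmem` also counts tmpfs and other shared memory; the function names and sample numbers are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value}; most values are in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def shmem_share(meminfo):
    """Fraction of total RAM currently accounted as shared memory."""
    return meminfo["Shmem"] / meminfo["MemTotal"]


# Made-up sample: 2 GB of shmem on a machine with 8 GB of RAM.
SAMPLE = """MemTotal:        8000000 kB
MemAvailable:    1500000 kB
Shmem:           2000000 kB
SwapFree:        6000000 kB
"""

if __name__ == "__main__":
    share = shmem_share(parse_meminfo(SAMPLE))
    print(f"shmem uses {share:.0%} of RAM")
```

If that share is large, adding more swap is unlikely to help, for the reason given above.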

And yeah, the swap should help a lot. 8GB machines really need swap to function properly in day to day settings (on both Linux and macOS). The NVMe is fast enough that swap thrashing isn’t much of a problem, and there is usually enough cold stuff in RAM to make swap effective. This is why we turn swap on by default for 8GB and 16GB machines (above that we consider it optional and don’t enable it by default).

@marcan
Thanks, I also seem to not get the crash anymore.

Like others, I’m trying to find an optimal configuration where OOM killing kicks in before the system stops responding. I was wondering why it doesn’t happen, but maybe your ‘non-swappable GPU memory’ explains it to some extent… :thinking:

Hmm… I’ll try disabling ALL swap space to see what happens when memory becomes full.

With Multi-gen LRU, I sometimes get proper OOM-killing behavior, but it still doesn’t work in many conditions. I’ve not yet tried other solutions like EarlyOOM or tinkering with systemd-oomd or other oomd solutions.

OK that info helps! One last question regarding priorities, are the following ones correct?

NAME               TYPE      SIZE USED PRIO
/var/swap/swapfile file        8G   0B   -2
/dev/zram0         partition 7.4G   4G  100

Yes, that looks right.

@marcan I think we can close this topic and have your post mentioning the fixed kernel version as a solution. I tried to find out how I as the opener could close it but came up empty. (I feel very stupid :D)
