How to switch off systemd's out-of-memory killer daemon

I am working in scientific computing. My simulations occasionally take large amounts of memory. Since switching to F36 I have been getting random crashes that I couldn’t make sense of.
Apparently this is due to systemd’s OOM going rampant and killing everything in the same control group as my simulator.
Thus, when I am running a large-scale calculation that triggers OOM, it kills

  1. the simulator
  2. the GUI that called the simulator
  3. the shell that started the GUI
  4. the tmux server that created the shell
  5. the terminal window that created the tmux server

This is completely counterproductive for my use case, and how this could be considered useful by anyone is beyond me. I would like to switch off the OOM completely or stop it from interfering with the processes that I run. Does anyone know what the standard procedure for this is?


https://www.freedesktop.org/software/systemd/man/systemd-oomd.service.html

systemctl stop systemd-oomd.service      # stop the running service
systemctl disable systemd-oomd.service   # disable it so it won't start on boot
systemctl mask systemd-oomd.service      # prevent other units from starting the service
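
To check whether the daemon really stayed off after these commands, something along these lines may help. It is only a sketch: the `oomd_report` helper is a made-up name, and querying `systemd-oomd.socket` assumes a systemd release that ships that socket unit, which you should verify on your own system.

```shell
# Print the active state of each oomd-related unit.
# systemd-oomd.socket is an assumption: it may not exist on every systemd release.
oomd_report() {
    for unit in systemd-oomd.service systemd-oomd.socket; do
        if command -v systemctl >/dev/null 2>&1; then
            # is-active exits non-zero for inactive units, so keep only its text
            state=$(systemctl is-active "$unit" 2>/dev/null) || true
            [ -n "$state" ] || state="unknown"
        else
            state="unknown"
        fi
        printf '%s: %s\n' "$unit" "$state"
    done
}

oomd_report
```

If the service shows up as active again, `systemctl list-dependencies --reverse systemd-oomd.service` may hint at what pulled it in, and `oomctl` should list the cgroups oomd is currently watching.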

I’m pretty sure that switching off the OOM killer is a terrible idea.

I’m even more sure that the important thing you need to try is to fix the configuration of your swap space.

By default, Fedora uses zram swap, which compresses stale pages within ram, rather than paging them out. I can see why, for a typical user with more ram than they really need, but leaving open a few too many applications that they aren’t actually using, zram can be a modest improvement to performance. But I think that is a narrow range of benefit and hard to really identify.

In general, but especially for “scientific computing”, zram is a terrible idea. You need real swap space.

After you figure out how much real swap space you need and configure it, the OOM killer will no longer kill anything, and your application will continue running.

Depending on the memory access pattern within that application, maybe the whole problem is then solved. Or maybe you just don’t have enough ram for that application to complete in reasonable time. Or maybe you have a memory leak in the application and adding swap space just delays the failure.

Another warning is that you may need to be patient when the system seems to be hung (when running a simulation with properly configured swap). You may be paging out key parts of the GUI when you stop using them for a few minutes while waiting for the simulation. Then when you try to use them again, everything seems frozen. No OS designer has put in the research needed to design getting the GUI system back from swap in reasonable time. The process that brings it back when you start using it again eventually works, but takes longer than would make any sense if you don’t understand what happens behind the scenes to make it complicated.

It is not actually counterproductive for your use case and even less so for ordinary users. If you have a memory leak in an application, you want that to bring down the application rather than lock up the whole system. If your application is simply using more memory than your ram+swap permits, you still would rather have the application fail than lock up the whole system (including that application).
You have described, and maybe even really have, a boundary case: The randomness you claim would imply you have just barely enough ram plus swap for what you’re trying to do. It is certainly possible in a narrow range of such boundary cases for the OOM killer to kill something when no such action was really needed. But even in that boundary case, the right answer is to increase the amount of swap so you aren’t in that boundary case.


I suspect that the fix, short of revising the software to be less memory intensive, would be to provide adequate physical swap space so the oomd does not see that it is running out of memory. Fedora uses zram virtual swap in RAM by default and it is up to the user to extend that if needed. You can add more RAM or add physical swap.

Physical swap can be a swap file created on the system or a swap partition on the drive. One guide I found that describes how to create swap is here:
https://www.techotopia.com/index.php/Adding_and_Managing_Fedora_Swap_Space
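
For orientation, the usual swap-file steps can be sketched as below. This is a hedged outline, not a tested recipe for any particular machine: `swapfile_plan` is a made-up helper that only prints the commands, the path `/var/swapfile` and the 64 GiB size are placeholders, and it assumes a filesystem other than btrfs. On btrfs, Fedora Workstation's default, a swap file needs extra steps (such as disabling copy-on-write) that are not covered here.

```shell
# Print (not run) the commands for creating and enabling a swap file.
# Path and size are hypothetical defaults; review the output before running it as root.
swapfile_plan() {
    size_gib="${1:-64}"          # swap size in GiB, e.g. equal to installed RAM
    path="${2:-/var/swapfile}"   # hypothetical location on a non-btrfs filesystem
    cat <<EOF
dd if=/dev/zero of=$path bs=1M count=$((size_gib * 1024)) status=progress
chmod 600 $path
mkswap $path
swapon $path
echo '$path none swap defaults 0 0' >> /etc/fstab
EOF
}

swapfile_plan 64 /var/swapfile
```

Printing the plan first makes it easy to adjust the size and location before anything touches the disk; `swapon --show` afterwards should list the new swap area.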


Yes, but make sure you totally ignore the section that says Recommended Swap Space for Fedora.
That only applies to people who have ram reasonably sized for their workload and have a typically structured workload. But most computer buyers don’t get ram reasonably sized for their workload. They usually get way too much because it is cheap, or they might get too little. But the actual need for ram rarely is considered when deciding how much to buy.

Assuming no correlation between the ram you bought and the ram you need, the usual positive correlation between the ram you bought and the swap you are advised to configure is backwards. If you bought more ram, you need less swap.

But once (like the OP) you know you have an issue, those guidelines are even more counterproductive. You need the amount of swap that you need for getting the task done and you already know that is more than a typical user with your amount of ram would have any use for.

Once I know that there is an issue of swap space being too small (as we do here), I wouldn’t even think of messing around with a swap size smaller than total ram size (those guidelines are for swap 1/4 to 1/2 of ram or even less). If the current state is “almost working” (as might be guessed from the original post) try swap space equal to total ram size. But I wouldn’t be surprised if you need more than that.

ram is cheap, but disk space (or even ssd space) is cheaper. My opinion is that if you are actually using 1/4 of your swap space you haven’t configured enough for safe operation.

Thanks for the detailed feedback, I really appreciate it. Actually, I have lots of swap configured in my system but somehow OOM seems to go on its killing spree well in advance of any swap filling. I guess this has to do with the default memory pressure setting.

The part I find extremely counterproductive is not that the simulator itself is killed… that is what I would expect if it tries to allocate more than is available.
The problem is that it kills the whole control group, which includes a ton of processes that have nothing to do with the simulation (the simulator GUI, the IDE, the friggin terminal window it was all started from, …). I have been working in this field on and off since 2007, and an out-of-memory state has never been treated like that. Frankly, I fail to see under which circumstances this would ever be a good idea.

I guess the best thing to do to remain consistent with the idea of OOM would be to somehow create a rule that puts the simulator in its own control group, but I haven’t figured out how to do that yet.
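
One possible way to do this without writing any rules is to start the simulator through `systemd-run`, which places it in its own transient scope (and therefore its own cgroup) inside the user session. The sketch below assumes a running systemd user session; `run_isolated`, the `NO_SCOPE` switch and `./simulator` are illustrative names, not part of any existing tool.

```shell
# Run a command in its own transient scope (its own cgroup) of the user session.
# Set NO_SCOPE=1 to bypass systemd-run, e.g. on machines without a user session.
run_isolated() {
    if [ -z "${NO_SCOPE:-}" ] && command -v systemd-run >/dev/null 2>&1; then
        systemd-run --user --scope --quiet -- "$@"
    else
        # Fallback: run the command directly, without any cgroup isolation.
        "$@"
    fi
}
```

Usage would look like `run_isolated ./simulator input.cfg`. While it runs, `cat /proc/<pid>/cgroup` should show a `run-….scope` path of its own instead of the terminal's scope, so a kill decision on that cgroup would no longer take the GUI, shell and tmux server with it.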

[EDIT] Turns out the setup does not have as much swap configured as I thought. I usually increase that from the default value when I set up a new system, but seems I forgot that this time. Still it is 8GiB in my case.

Franz, another thing you may want to try is adjusting the settings of OOMD to be less aggressive.

see: oomd.conf

SwapUsedLimit=
DefaultMemoryPressureLimit=
DefaultMemoryPressureDurationSec=

Maybe let oomd act later, or let oomd give the process(es) more time of high memory usage before killing processes.
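
As a rough illustration of what such an adjustment could look like, the sketch below writes a drop-in file with the three options named above. The percentages, the duration and the file name `90-relaxed.conf` are illustrative guesses, not recommendations, and `write_oomd_dropin` is a made-up helper. It writes to a scratch directory so nothing on the system changes.

```shell
# Write an oomd drop-in with relaxed limits into the given directory.
# The values are illustrative guesses, not recommendations.
write_oomd_dropin() {
    dir="${1:?usage: write_oomd_dropin DIR}"
    mkdir -p "$dir"
    cat > "$dir/90-relaxed.conf" <<'EOF'
[OOM]
SwapUsedLimit=95%
DefaultMemoryPressureLimit=80%
DefaultMemoryPressureDurationSec=60s
EOF
}

# Try it in a scratch directory first; the real location would be
# /etc/systemd/oomd.conf.d/ (as root), followed by restarting systemd-oomd.
write_oomd_dropin "${TMPDIR:-/tmp}/oomd.conf.d"
```

Keep in mind that a per-unit `ManagedOOMMemoryPressureLimit=` setting takes precedence over `DefaultMemoryPressureLimit=`, so if the distribution sets one on the user slice, changing only the default may have no visible effect.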


Thanks for the feedback. I have 64G of memory and 8G of swap. It seems, however, that the swap doesn’t even reach an appreciable degree of swap usage somehow. But this is a good point and I will investigate this a bit more.

Thanks, yes I have already looked at OOMD configuration. If it turns out that the OOMd is actually good for something (in my setup that is), I guess I’ll just set everything to maximum.


Have you tried just fixing that, and seeing if you still have a problem? For serious computing that needs 64G of ram, 8G of swap is absurdly low.

How have you measured that? I don’t know if any high water mark stat is available and if so whether that is what you used. If you looked before and after the OOM killer acted, it makes sense that there would be no significant swap usage at those moments, despite swap being 90% used at the moment some process using a lot of swap was killed.
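
Lacking a built-in high-water mark, one could sample swap usage while the simulation runs and keep the maximum. The following is a minimal sketch under that assumption; the helper names and the one-second interval are arbitrary choices.

```shell
# Current swap usage in KiB, computed from a meminfo-style file.
swap_used_kib() {
    awk '/^SwapTotal:/ {t=$2} /^SwapFree:/ {f=$2} END {print t - f}' "${1:-/proc/meminfo}"
}

# Sample once per second and remember the highest value seen.
# The number of samples is a hypothetical default; raise it for long runs.
watch_swap_peak() {
    samples="${1:-5}"
    peak=0
    i=0
    while [ "$i" -lt "$samples" ]; do
        used=$(swap_used_kib)
        if [ "$used" -gt "$peak" ]; then
            peak=$used
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "peak swap used: ${peak} KiB"
}
```

Running `watch_swap_peak 3600` in a second terminal would cover an hour of simulation. Since systemd-oomd also reacts to memory pressure and not only to swap usage, watching `/proc/pressure/memory` alongside may explain kills that happen while swap still looks mostly empty.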

As you noted, the process using a lot of ram tends not to be chosen to be killed. Its memory access pattern likely forces related processes (especially processes waiting for it) to use swap, rather than using swap itself.

Last time I was doing such memory intensive simulations in Windows, I found the flaw that Windows often reserves swap space for a process that won’t actually use it. In extreme cases you can exhaust memory with all swap space reserved and none actually used. The only way to get the job done is to have even more swap space, despite actually having enough ram. I don’t think Linux has any similar flaw. But I’m not really sure.

@franzschanovsky : I just want to note this here since I had similar issues on tmux and did manage to find a fix:

https://discussion.fedoraproject.org/t/how-would-one-create-new-tmux-servers-each-isolated-in-a-separate-slice-so-that-if-systemd-oomd-kills-one-the-other-tmux-servers-keep-living/65315

TLDR: since tmux runs in a terminal window, all the tmux sessions/panes are part of the same cgroup/scope: the one that was created for the gnome-terminal instance. So the workaround is to open each tmux pane in a new scope, since oom limits apply per cgroup/scope.


Thanks Ankur, I have found your ticket report very illuminating and have read it before writing my question. Your workaround is certainly a good idea, but I felt I needed something more radical. This default behavior of OOMD is causing a lot of unnecessary trouble and will throw off many people from the numerical simulation community. I just needed a quick solution that I can communicate to other people as a workaround irrespective of their terminal multiplexer or desktop environment setup.


In that case, I think disabling it is probably the best bet. I have looked into configuring it instead of disabling it, but I haven’t quite hit on a config that works yet. :(


Good luck with that!
(and kindly let us know when you have found it :) )
I mean it would already be enough to have a config for OOMD that limits its attention (and consequently its action) to the process itself instead of the whole control group. Or even to have it ignore processes that have been started by a certain user/group.
When googling for solutions I came across someone who claimed that Linux Mint even dropped OOMD entirely from their default setup due to a ton of complaints they got.
We will see how this plays out.


UPDATE: Even after issuing these commands OOMD somehow came back online.
This is just ridiculous. Almost like a bad joke, I then couldn’t stop it by hand at first because it was masked. Now I have changed the settings in /etc/systemd/oomd.conf, setting all limits to 100% and the “grace period” to 30 days. Let’s see whether that makes this horrible annoying memory watchdog with anger management issues stop interfering with my work.
I don’t know who came up with the line “The most stable system is the one that isn’t used.”, and I could not be bothered to google it right now, but this seems to be the design principle for this abomination.


Thanks John for your fast reply.

I understand the benefit of the OOM when the application is requiring more memory than the system can provide, but I am also using applications that run with only 80% of total (physical) memory and the OOM is killing them, so there is a problem there. I am also trying to install Tensorflow, for example, and as you said, maybe there is a failure because it is using all the RAM and all the swap (something that I cannot explain during the installation process) and the OOM is breaking the installation. I would like to know if there is a way to limit memory usage by application. It looks like I need to buy more memory. My system has only 4GB.

Thanks!
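
On limiting memory per application: one option worth trying is a transient scope with a `MemoryMax=` cap. The sketch below only builds and prints such a command line so it can be reviewed first. The `limited_cmd` helper, the 3G limit and the `pip install tensorflow` command are illustrative, and it assumes the memory controller is delegated to the user session.

```shell
# Build a systemd-run command line that caps memory for one program.
# "3G" and "pip install tensorflow" are illustrative values only.
limited_cmd() {
    limit="$1"
    shift
    printf 'systemd-run --user --scope -p MemoryMax=%s --' "$limit"
    printf ' %s' "$@"
    printf '\n'
}

limited_cmd 3G pip install tensorflow
```

A process that exceeds the cap is then stopped by the kernel inside its own cgroup rather than dragging the rest of the session down. On a 4GB machine the cap does not create memory, though, so additional swap is probably still needed.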

If your system only has 4GB RAM and you do not have additional physical swap, then that could easily explain the OOM issue. Default swap with Fedora is in RAM, so you would only have a total of 4GB including swap. The quickest fix may be to designate a swapfile so the system has physical swap to use in addition to the zram swap space. Easy to do.

The link I posted above tells how to do so and all you need to do is adjust the size to whatever you choose.

Thanks, Jeff. I will try to adjust the size of the swap. I hope to solve all the nuisances with the OOM.

Thanks

Make sure you remember the difference between zram swap and the other kind. That is all discussed earlier in this thread. But it is a long thread so you might forget or overlook it.

Fedora by default configures just zram swap. Increasing the size of your zram swap is very unlikely to help. You need either a swap file or a swap partition (see the link in Jeff’s first post of this thread).

You could have that in addition to the zram swap. I think it is better to have that instead of the zram swap.
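
To see which kind of swap is actually active, a small helper like the one below can classify the entries of `/proc/swaps`. It is a sketch with a made-up function name, and it simply treats every device named `/dev/zram*` as zram and everything else as disk-backed.

```shell
# Classify each active swap area as zram (compressed RAM) or disk-backed.
# Reads a /proc/swaps-style file; the first line is a header and is skipped.
swap_kinds() {
    awk 'NR > 1 { kind = ($1 ~ /^\/dev\/zram/) ? "zram" : "disk"; print $1, kind, $3 " KiB" }' "${1:-/proc/swaps}"
}

if [ -r /proc/swaps ]; then
    swap_kinds
fi
```

If only a zram line appears, no swap file or partition is in use yet. On Fedora the zram device is set up by zram-generator, whose documentation describes how to resize or disable it.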