Weird memory usage on kernel 6.8, Fedora Server 40

Problem

Server memory usage is at 49.1 GiB of 62 GiB, but no process allocating a large amount of RAM shows up in btop, top, etc.

Cause

Not yet known.

Related Issues

Not yet found.

Workarounds

That’s what we are trying to find. According to a Reddit thread,
transparent_hugepage=madvise would probably help, but it did not in our case…
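
One way to confirm that the transparent_hugepage=madvise boot parameter actually took effect is to read /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is shown in square brackets. A minimal Python sketch; the helper name active_thp_mode is made up for illustration:

```python
import re
from pathlib import Path

# sysfs file that lists the THP modes, with the active one in brackets,
# e.g. "always [madvise] never"
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode from the sysfs file contents."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no active mode found in {text!r}")
    return match.group(1)


if __name__ == "__main__":
    # Only read the file if this kernel exposes it.
    if THP_ENABLED.exists():
        print(active_thp_mode(THP_ENABLED.read_text()))
```

This only reports the mode for transparent huge pages; it says nothing about the statically reserved HugePages_* pool that appears in /proc/meminfo.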

cat /proc/meminfo

MemTotal:       65020456 kB
MemFree:         1705116 kB
MemAvailable:   13536144 kB
Buffers:          254124 kB
Cached:         12016364 kB
SwapCached:           24 kB
Active:          9959956 kB
Inactive:       10182768 kB
Active(anon):    6032964 kB
Inactive(anon):  1884948 kB
Active(file):    3926992 kB
Inactive(file):  8297820 kB
Unevictable:        5516 kB
Mlocked:            5516 kB
SwapTotal:      65019900 kB
SwapFree:       65017596 kB
Zswap:                 0 kB
Zswapped:              0 kB
Dirty:               116 kB
Writeback:             0 kB
AnonPages:       7873928 kB
Mapped:          1648188 kB
Shmem:             40628 kB
KReclaimable:     331364 kB
Slab:             738092 kB
SReclaimable:     331364 kB
SUnreclaim:       406728 kB
KernelStack:       59968 kB
PageTables:       132292 kB
SecPageTables:         0 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    77050128 kB
Committed_AS:   38003816 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      466532 kB
VmallocChunk:          0 kB
Percpu:            50048 kB
HardwareCorrupted:     0 kB
AnonHugePages:     16384 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:     12288 kB
FilePmdMapped:     12288 kB
CmaTotal:              0 kB
CmaFree:               0 kB
Unaccepted:            0 kB
HugePages_Total:   20000
HugePages_Free:    14802
HugePages_Rsvd:      185
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        40960000 kB
DirectMap4k:     1221476 kB
DirectMap2M:    24172544 kB
DirectMap1G:    41943040 kB

cat /proc/cmdline

BOOT_IMAGE=(hd0,gpt2)/vmlinuz-6.8.9-300.fc40.x86_64 root=/dev/mapper/fedora-root ro rd.driver.blacklist=nouveau modprobe.blacklist=nouveau rd.lvm.lv=fedora/root selinux=0 audit=0 pcie_port_pm=off systemd.unified_cgroup_hierarchy=0 cgroup_enable=devices cgroup_enable=freezer transparent_hugepage=madvise

From Proposed Common Issues to Ask Fedora

What does free -h report? No screen shot please.

Why do you think this usage is a problem?

free -h

               total        used        free      shared  buff/cache   available
Mem:            62Gi        50Gi       2.0Gi        42Mi        10Gi        11Gi
Swap:           62Gi       347Mi        61Gi

Free memory on this server should be around 20 GB, but it is now around 10 GB, and the memory usage cannot be traced to a process or service with ps, even with sudo.

There was a slab/slub leak fixed in 6.8.9 seen on rpi boards.
Maybe you are seeing something related?

You can check by comparing /proc/meminfo changes over time.

Also, /proc/meminfo will tell you where the memory you cannot track down is being used.

Hi, I am not on a rpi device.

Here is /proc/meminfo

MemTotal:       65020464 kB
MemFree:         9171948 kB
MemAvailable:   14833124 kB
Buffers:          348148 kB
Cached:          5812304 kB
SwapCached:            0 kB
Active:         11084656 kB
Inactive:        1911280 kB
Active(anon):    6881936 kB
Inactive(anon):        0 kB
Active(file):    4202720 kB
Inactive(file):  1911280 kB
Unevictable:        5608 kB
Mlocked:            5608 kB
SwapTotal:      65019900 kB
SwapFree:       65019900 kB
Zswap:                 0 kB
Zswapped:              0 kB
Dirty:              1520 kB
Writeback:             0 kB
AnonPages:       6840808 kB
Mapped:          1486052 kB
Shmem:             41320 kB
KReclaimable:     272324 kB
Slab:             644528 kB
SReclaimable:     272324 kB
SUnreclaim:       372204 kB
KernelStack:       51592 kB
PageTables:       120496 kB
SecPageTables:         0 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    77050132 kB
Committed_AS:   34187592 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      457700 kB
VmallocChunk:          0 kB
Percpu:            50304 kB
HardwareCorrupted:     0 kB
AnonHugePages:     10240 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:     14336 kB
FilePmdMapped:     14336 kB
CmaTotal:              0 kB
CmaFree:               0 kB
Unaccepted:            0 kB
HugePages_Total:   20000
HugePages_Free:     9968
HugePages_Rsvd:      123
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        40960000 kB
DirectMap4k:      795492 kB
DirectMap2M:    16209920 kB
DirectMap1G:    50331648 kB

Can’t really see what is going on here…

What I did to see the rpi memory leak was take two copies of /proc/meminfo a few hours apart.
Then use diff on the pair to look for what has changed.
From one copy I have no idea what has changed.
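
Plain diff works, but it flags every changed line without showing how large each change is. As a rough alternative, the Python sketch below parses two saved copies and lists only the fields that changed, largest change first. The function names and the file names in the usage line are hypothetical:

```python
import sys
from pathlib import Path


def parse_meminfo(text):
    """Map field name -> integer value (kB for most fields, a page count for HugePages_*)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank lines and pasted shell prompts
        try:
            fields[name.strip()] = int(parts[0])
        except ValueError:
            continue  # skip anything that is not "Name: <number> [kB]"
    return fields


def meminfo_delta(before_text, after_text):
    """Return (name, before, after, change) rows for changed fields, largest change first."""
    before = parse_meminfo(before_text)
    after = parse_meminfo(after_text)
    rows = [
        (name, before[name], after[name], after[name] - before[name])
        for name in before
        if name in after and before[name] != after[name]
    ]
    return sorted(rows, key=lambda row: abs(row[3]), reverse=True)


if __name__ == "__main__" and len(sys.argv) == 3:
    old_text = Path(sys.argv[1]).read_text()
    new_text = Path(sys.argv[2]).read_text()
    for name, old, new, change in meminfo_delta(old_text, new_text):
        print(f"{name:<16} {old:>12} {new:>12} {change:>+12}")
```

Usage would be something like `python3 meminfo_delta.py meminfo-1.txt meminfo-2.txt`, where both files are saved copies of /proc/meminfo.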

> cat /proc/meminfo
MemTotal:       65020464 kB
MemFree:         1605996 kB
MemAvailable:    3236872 kB
Buffers:          118200 kB
Cached:          2062368 kB
SwapCached:        14220 kB
Active:          3984332 kB
Inactive:        2466816 kB
Active(anon):    3301208 kB
Inactive(anon):  1038008 kB
Active(file):     683124 kB
Inactive(file):  1428808 kB
Unevictable:        5568 kB
Mlocked:            5568 kB
SwapTotal:      65019900 kB
SwapFree:       46488416 kB
Zswap:                 0 kB
Zswapped:              0 kB
Dirty:               316 kB
Writeback:           476 kB
AnonPages:       4272588 kB
Mapped:           871056 kB
Shmem:             63696 kB
KReclaimable:     255648 kB
Slab:            1010048 kB
SReclaimable:     255648 kB
SUnreclaim:       754400 kB
KernelStack:       66816 kB
PageTables:       206684 kB
SecPageTables:         0 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    77050132 kB
Committed_AS:   50948328 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      475456 kB
VmallocChunk:          0 kB
Percpu:            64512 kB
HardwareCorrupted:     0 kB
AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:     12288 kB
FilePmdMapped:      4096 kB
CmaTotal:              0 kB
CmaFree:               0 kB
Unaccepted:            0 kB
HugePages_Total:   20000
HugePages_Free:     9969
HugePages_Rsvd:      124
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        40960000 kB
DirectMap4k:     5127012 kB
DirectMap2M:    21315584 kB
DirectMap1G:    40894464 kB

4 hours later:

> cat /proc/meminfo
MemTotal:       65020464 kB
MemFree:          856468 kB
MemAvailable:    1696108 kB
Buffers:           36060 kB
Cached:          1420512 kB
SwapCached:        15172 kB
Active:          5444532 kB
Inactive:        2114868 kB
Active(anon):    4973268 kB
Inactive(anon):  1199152 kB
Active(file):     471264 kB
Inactive(file):   915716 kB
Unevictable:        5496 kB
Mlocked:            5496 kB
SwapTotal:      65019900 kB
SwapFree:       47765344 kB
Zswap:                 0 kB
Zswapped:              0 kB
Dirty:              1280 kB
Writeback:             0 kB
AnonPages:       6103456 kB
Mapped:           728196 kB
Shmem:             64592 kB
KReclaimable:     257584 kB
Slab:            1026596 kB
SReclaimable:     257584 kB
SUnreclaim:       769012 kB
KernelStack:       76288 kB
PageTables:       213164 kB
SecPageTables:         0 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    77050132 kB
Committed_AS:   52842352 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      484944 kB
VmallocChunk:          0 kB
Percpu:            64512 kB
HardwareCorrupted:     0 kB
AnonHugePages:      8192 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:     10240 kB
FilePmdMapped:      4096 kB
CmaTotal:              0 kB
CmaFree:               0 kB
Unaccepted:            0 kB
HugePages_Total:   20000
HugePages_Free:     9969
HugePages_Rsvd:      124
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:        40960000 kB
DirectMap4k:     5503844 kB
DirectMap2M:    20938752 kB
DirectMap1G:    40894464 kB

Could Hugetlb: 40960000 kB be the problem?
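
To put a number on that question: Hugetlb is simply HugePages_Total × Hugepagesize, that is 20000 × 2048 kB = 40960000 kB, or about 39 GiB of the 62 GiB installed. As far as I understand, that pool is reserved up front whether or not anything uses it, so it would not appear under any process in top or ps. The Python sketch below does the arithmetic on the values from the snapshots above; the function name hugetlb_pool is made up for illustration, and this alone does not prove the pool is the cause:

```python
def hugetlb_pool(fields):
    """Split the static huge page pool into in-use and idle parts (all in kB)."""
    page_kb = fields["Hugepagesize"]
    pool_kb = fields["HugePages_Total"] * page_kb
    idle_kb = fields["HugePages_Free"] * page_kb
    return {
        "pool_kb": pool_kb,
        "in_use_kb": pool_kb - idle_kb,
        "idle_kb": idle_kb,
        "share_of_ram": pool_kb / fields["MemTotal"],
    }


# Values copied from the /proc/meminfo snapshots in this thread.
snapshot = {
    "MemTotal": 65020464,
    "HugePages_Total": 20000,
    "HugePages_Free": 9969,
    "Hugepagesize": 2048,
}

pool = hugetlb_pool(snapshot)
for key in ("pool_kb", "in_use_kb", "idle_kb"):
    print(f"{key}: {pool[key]} kB = {pool[key] / 1024**2:.2f} GiB")
print(f"share_of_ram: {pool['share_of_ram']:.0%}")
```

With these numbers, roughly half of the pool (9969 of 20000 pages) is reserved but idle.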

Is the first sample from immediately after boot?
In that case it will not include the memory used by services as they warm up.

Does that increase keep going after a further 4 hours?

What I usually do in situations like this is sample every 10 minutes (or every hour) for 24 hours so that I have lots of samples to investigate.
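
A sampling routine like that could be sketched in Python as follows. The directory name, file naming scheme, and function names are assumptions for illustration; 144 samples at 600 seconds apart cover 24 hours:

```python
import time
from datetime import datetime
from pathlib import Path


def take_sample(out_dir, source="/proc/meminfo"):
    """Copy the current contents of `source` into a timestamped file in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = out_dir / f"meminfo-{stamp}.txt"
    target.write_text(Path(source).read_text())
    return target


def sample_loop(out_dir, interval_s=600, count=144):
    """Take `count` samples, sleeping `interval_s` seconds between them."""
    for i in range(count):
        take_sample(out_dir)
        if i < count - 1:
            time.sleep(interval_s)


# Example (runs for about 24 hours; "meminfo-samples" is a hypothetical directory):
# sample_loop("meminfo-samples")
```

Any two of the resulting files can then be compared with diff to see which fields keep growing.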

To be fair, this does not look like a normal server use case, as the screenshot shows a lot of workstation-specific tasks.

The filesystem or video drivers are typical suspects for memory leaks, but it is best to isolate the problem as much as you can before jumping to conclusions.

The memory usage looks way better after upgrading to kernel 6.8.10; available memory has now increased to around 25Gi.