Fedora LiveOS root system and available RAM

I’ve looked into this a little further and it does look like a genuine bug.

Booting a Fedora Linux 39 LiveCD and running the lsblk and losetup commands shows that it did try to create an overlay. But the overlay file appears to have been “deleted”.

[root@localhost-live ~]# lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
loop0         7:0    0  1.9G  1 loop 
loop1         7:1    0  7.6G  1 loop 
├─live-rw   253:0    0  7.6G  0 dm   /
└─live-base 253:1    0  7.6G  1 dm   
loop2         7:2    0   32G  0 loop 
└─live-rw   253:0    0  7.6G  0 dm   /
sda           8:0    0  149G  0 disk 
├─sda1        8:1    0    1G  0 part 
└─sda2        8:2    0  148G  0 part 
sdb           8:16   0  149G  0 disk 
├─sdb1        8:17   0    1G  0 part 
└─sdb2        8:18   0  148G  0 part 
sr0          11:0    1    2G  0 rom  /run/initramfs/live
sr1          11:1    1 1024M  0 rom  
zram0       252:0    0  3.7G  0 disk [SWAP]
[root@localhost-live ~]# losetup
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE            DIO LOG-SEC
/dev/loop1         0      0         0  1 /LiveOS/rootfs.img     0     512
/dev/loop2         0      0         0  0 /overlay (deleted)     0     512
/dev/loop0         0      0         0  1 /LiveOS/squashfs.img   0     512
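For anyone who wants to check their own system, here is a small sketch that picks out loop devices whose backing file has been unlinked. The function name find_deleted_backing is made up for illustration; it only filters text, so the demo below runs on lines copied from the listing above rather than on live losetup output.

```shell
# Print the loop devices whose backing file is marked "(deleted)".
# Expects lines of "NAME BACK-FILE" on stdin, for example from:
#   losetup --list --noheadings --output NAME,BACK-FILE
find_deleted_backing() {
    awk '/\(deleted\)[[:space:]]*$/ { print $1 }'
}

# Demo on text taken from the losetup listing (no root access needed).
printf '%s\n' \
    '/dev/loop1 /LiveOS/rootfs.img' \
    '/dev/loop2 /overlay (deleted)' \
    '/dev/loop0 /LiveOS/squashfs.img' | find_deleted_backing
```

On the listing above this prints /dev/loop2, the device backing the writable overlay.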

I have a hunch that /overlay was created in the initramfs and then lost when the “switch root” to /sysroot happened. There is, however, a workaround. If you look here, you can see that a different code path is followed if you add rd.live.overlay.thin to the kernel command line. The alternate code path stores the overlay on a tmpfs filesystem and yields the following results when booted.

[root@localhost-live ~]# cat /proc/cmdline 
BOOT_IMAGE=/images/pxeboot/vmlinuz root=live:CDLABEL=Fedora-WS-Live-39-1-5 rd.live.image quiet rhgb rd.live.overlay.thin
[root@localhost-live ~]# lsblk
NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
loop0                 7:0    0  1.9G  1 loop 
loop1                 7:1    0  7.6G  1 loop 
├─live-rw           253:1    0  7.6G  0 dm   /
└─live-base         253:2    0  7.6G  1 dm   
loop2                 7:2    0   32G  0 loop 
loop3                 7:3    0  3.2G  0 loop 
└─live-overlay-pool 253:0    0   32G  0 dm   
  └─live-rw         253:1    0  7.6G  0 dm   /
loop4                 7:4    0   32G  0 loop 
└─live-overlay-pool 253:0    0   32G  0 dm   
  └─live-rw         253:1    0  7.6G  0 dm   /
sda                   8:0    0  149G  0 disk 
├─sda1                8:1    0    1G  0 part 
└─sda2                8:2    0  148G  0 part 
sdb                   8:16   0  149G  0 disk 
├─sdb1                8:17   0    1G  0 part 
└─sdb2                8:18   0  148G  0 part 
sr0                  11:0    1    2G  0 rom  /run/initramfs/live
sr1                  11:1    1 1024M  0 rom  
zram0               252:0    0  3.7G  0 disk [SWAP]
[root@localhost-live ~]# losetup
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE                    DIO LOG-SEC
/dev/loop1         0      0         0  1 /LiveOS/rootfs.img             0     512
/dev/loop4         0      0         0  0 /initramfs/thin-overlay/data   0     512
/dev/loop2         0      0         0  0 /overlay (deleted)             0     512
/dev/loop0         0      0         0  1 /LiveOS/squashfs.img           0     512
/dev/loop3         0      0         0  0 /initramfs/thin-overlay/meta   0     512
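Before relying on the thin-overlay path, it may be worth confirming that the parameter really made it onto the kernel command line. A minimal sketch, assuming a plain whole-word match is enough (has_karg is a hypothetical helper name, and it will not match a form such as rd.live.overlay.thin=1):

```shell
# Succeed if the given kernel argument appears as a whole word
# in the given command line string.
has_karg() {
    karg=$1
    cmdline=$2
    case " $cmdline " in
        *" $karg "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Demo with the command line shown above.
cmdline='BOOT_IMAGE=/images/pxeboot/vmlinuz root=live:CDLABEL=Fedora-WS-Live-39-1-5 rd.live.image quiet rhgb rd.live.overlay.thin'
if has_karg rd.live.overlay.thin "$cmdline"; then
    echo "thin overlay requested"
fi
```

On a running system the second argument would be "$(cat /proc/cmdline)".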

We are not quite there yet, though, because the table for the live-rw device uses the smaller of the two underlying device sizes (i.e. the 7.6G size of the original rootfs.img from the DVD) instead of the full size of the writable overlay (this appears to be a second bug).

[root@localhost-live ~]# dmsetup table
live-base: 0 15876096 linear 7:1 0
live-overlay-pool: 0 67108864 thin-pool 7:3 7:4 1024 1024 0 
live-rw: 0 15876096 thin 253:0 0 7:1
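The lengths in these tables are counts of 512-byte sectors, which makes the mismatch easy to miss. A quick conversion sketch (sectors_to_gib is just an illustrative helper name):

```shell
# Convert a length in 512-byte sectors to GiB, with two decimals.
sectors_to_gib() {
    awk -v s="$1" 'BEGIN { printf "%.2f\n", s * 512 / (1024 * 1024 * 1024) }'
}

sectors_to_gib 15876096   # length of live-rw and live-base -> 7.57
sectors_to_gib 67108864   # length of live-overlay-pool     -> 32.00
```

So live-rw is sized like the 7.6G base image even though the pool behind it is 32G.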

If you are quick about it, you can get away with using dmsetup to suspend the live-rw device, fix its sector mappings to use the full allocation of live-overlay-pool and then resume the live-rw device.

Here is a script you should be able to copy and paste into a terminal session running in a LiveCD environment to accomplish just that (thanks to “Right To Your Own Devices” by Kapil Hari Paranjape).

Paste and execute all these lines at once:

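bash -e <<- 'END'
# let the last command of a pipeline (readarray) run in the current shell
shopt -s lastpipe
# split the current live-rw table into fields: start, length, "thin", pool, ...
dmsetup table live-rw | readarray -d ' ' -t table
# replace the length with the pool device's size in 512-byte sectors
table[1]=$(blockdev --getsz "$(udevadm info -rq name "/sys/dev/block/${table[3]}")")
# swap in the new table while the device is briefly suspended
dmsetup suspend live-rw
echo "${table[*]}" | dmsetup load live-rw
dmsetup resume live-rw
END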

Before dmsetup table fix:

# dmsetup table
live-base: 0 15876096 linear 7:1 0
live-overlay-pool: 0 67108864 thin-pool 7:3 7:4 1024 1024 0 
live-rw: 0 15876096 thin 253:0 0 7:1

After dmsetup table fix:

# dmsetup table
live-base: 0 15876096 linear 7:1 0
live-overlay-pool: 0 67108864 thin-pool 7:3 7:4 1024 1024 0 
live-rw: 0 67108864 thin 253:0 0 7:1
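If you would rather preview the rewritten table line before loading anything, the same substitution can be tried on plain text. This is only a sketch of the string manipulation, not a replacement for the dmsetup steps; rewrite_table_length is a hypothetical helper name.

```shell
# Replace the length (second field) of a device-mapper table line.
rewrite_table_length() {
    table_line=$1
    new_length=$2
    printf '%s\n' "$table_line" | awk -v len="$new_length" '{ $2 = len; print }'
}

# The "before" live-rw line and the pool length from the tables above.
rewrite_table_length '0 15876096 thin 253:0 0 7:1' 67108864
```

This prints 0 67108864 thin 253:0 0 7:1, matching the “after” table.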

And there is still one last thing to do – resize the filesystem to use the full size of the (now larger) underlying device.

[root@localhost-live ~]# lsblk /dev/mapper/live-rw
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
live-rw 253:1    0  32G  0 dm   /
[root@localhost-live ~]# df -h /
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/live-rw  7.4G  5.8G  1.6G  79% /
[root@localhost-live ~]# resize2fs /dev/mapper/live-rw
resize2fs 1.47.0 (5-Feb-2023)
Filesystem at /dev/mapper/live-rw is mounted on /; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 4
The filesystem on /dev/mapper/live-rw is now 8388608 (4k) blocks long.

[root@localhost-live ~]# df -h /
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/live-rw   32G  5.8G   26G  19% /
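As a sanity check, the block count reported by resize2fs can be compared against the device length from the dmsetup table. A small sketch of that arithmetic (fs_fills_device is an illustrative name; the numbers are the ones shown above):

```shell
# Succeed when a filesystem of <blocks> blocks of <block_size> bytes
# exactly fills a device of <sectors> 512-byte sectors.
fs_fills_device() {
    blocks=$1
    block_size=$2
    sectors=$3
    [ $((blocks * block_size)) -eq $((sectors * 512)) ]
}

# 8388608 blocks of 4k versus the 67108864-sector live-rw device.
fs_fills_device 8388608 4096 67108864 && echo "filesystem spans the whole device"
```

Both sides come to 34359738368 bytes, i.e. 32 GiB.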

Whew :sweat:, so many bugs. :confused:


You know that “deleted” means that only the file name is deleted, and the file itself is still intact as long as something is using it.

Yes, I know a little about inodes. But I do not know if the access to that memory can be maintained across the switch root process and the freeing of the initramfs resources that happens between when that file was created and when losetup reported it as “deleted”. In any case, I did try simply running resize2fs /dev/mapper/live-rw before I went to the more elaborate steps I outlined. It did not work.

Edit: Although, I just realized that I neglected to look at the device mapper table. It may be possible to skip the step of adding the rd.live.overlay.thin kernel parameter (I haven’t tried that yet). But I believe the rest of the steps would still be required.


Being curious, I googled this and found this interesting post on Stack Exchange:

So, for whatever reason the issue seems to be that if you mount anything on the initramfs other than at /new_root or anything else that was pre-mounted, the initramfs is not cleared from memory. No idea why that is.

So I looked at the scripts on the Arch Linux boot ISO and found that it’s basically doing something similar to my idea, but it’s creating all its temp mount directories under /run. Since /run is a tmpfs to begin with, this ends up saving a few steps anyway.

Adapting my scripts to do the same, I was ultimately able to create an ISO that boots in 128MB RAM and leaves 80MB of free RAM for applications. Not bad.

So the trick is to mount anything you need to mount under /run, and then switch_root should properly clear the root filesystem.

So yeah, the kernel is clever enough not to free the memory associated with that /overlay file in the initramfs across the switch-root process. But based on that Stack Exchange comment, it sounds like there are additional penalties for doing it that way. Apparently, no memory from the initramfs archive will be freed while that reference is held, and that may be wasting however much RAM the unpacked initramfs filesystem requires. Also, in retrospect, the kernel must be holding some memory somewhere or else you would not be able to write anything to the / filesystem of the LiveCD. IMHO, it would be far better for such things to be stored on tmpfs filesystems or under /sysroot so that the memory associated with the initramfs could be freed for other uses.

FYI, I’ve submitted a PR upstream to get this fixed: “with thin provisioning, use the overlay size, not the base image size” (pull request #2604 in dracutdevs/dracut on GitHub).


Wow, really great. Thank you so much for your work. Finally …

Consider using the OverlayFS-based overlays with rd.live.overlay.overlayfs. See Booting live images: “On non-vfat-formatted devices, a persistent OverlayFS overlay can extend the available root filesystem storage up to the capacity of the LiveOS disk device.” Alternatively, a non-persistent overlay can use available RAM in the /run tmpfs at /run/overlayfs. There is also the rd.live.ram option to move the base root filesystem into RAM.


Thanks all for your help !

@fgrose :

rd.live.ram is a nice alternative, but might have its own (new?) issues

rd.live.overlay.overlayfs : this command might also have issues

Besides, regarding rd.live.overlay.overlayfs: after some years, I still find the documentation (Booting live images) confusing and hard to follow across all the options (the short one-line descriptions elude me…).

How do I combine the docs above and cross-reference them with the Fedora Wiki LiveOS page?
Which options relate to temporary storage in RAM, and which to persistent storage on disk, with or without overlay storage space?

  1. “Temporary storage is prepared for the system in RAM”: is this achieved with the single parameter rd.live.overlay.size?

By default, a 32 GiB, in-memory, copy-on-write, system overlay in a sparse file is prepared (see File Systems below). The rd.live.overlay.size kernel command line option may be used to set a different, temporary, overlay size. Since the temporary overlay is a sparse file in a tmpfs, a large size, even larger than available memory, may be specified and only what is needed will be allocated as needed.

  • Is it rd.live.overlay.size alone, or rd.live.overlay.size=<size_MiB> when a size is specified?
  • What happens when the RAM actually available (I have 32 GB in total) is below the default 32 GiB overlay size and the parameter is not given a <size_MiB>?
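On the second point, the quoted documentation says the overlay is a sparse file, so its nominal size and the memory it actually consumes are different things. Here is a small sketch of that behaviour with an ordinary temporary file; the 64M size and the sparse_demo name are arbitrary demo choices, and it says nothing about what happens once real memory runs out.

```shell
# Create a sparse file and compare its apparent size with the space
# actually allocated for it.
sparse_demo() {
    file=$(mktemp) || return 1
    truncate -s 64M "$file"                      # sets the size, writes no data
    apparent_kib=$(( $(wc -c < "$file") / 1024 ))
    allocated_kib=$(du -k "$file" | awk '{ print $1 }')
    rm -f "$file"
    echo "apparent=${apparent_kib}KiB allocated=${allocated_kib}KiB"
}

sparse_demo
```

The apparent size is the full 64 MiB while the allocated size stays near zero until data is written, which is the same reason a 32 GiB overlay can be created on a machine with less free RAM than that.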

If so, this might be the solution to my initial question in this thread… but some things are still unclear:

  1. How is persistent storage managed with one or more of those options ?

Enables the use of the OverlayFS kernel module, if available, to provide a copy-on-write union directory for the root filesystem. OverlayFS overlays are directories of the files that have changed on the read-only base (lower) filesystem. The root filesystem is provided through a special overlay type mount that merges the lower and upper directories. If an OverlayFS upper directory is not present on the boot device, a tmpfs directory will be created at /run/overlayfs to provide temporary storage.

Exactly which command-line parameter is needed to get this temporary storage?

Persistent storage can be provided on vfat or msdos formatted devices by supplying the OverlayFS upper directory within an embedded filesystem that supports the creation of trusted.* extended attributes and provides a valid d_type in readdir responses, such as with ext4 and xfs. (I did not understand this.)
On non-vfat-formatted devices, a persistent OverlayFS overlay can extend the available root filesystem storage up to the capacity of the LiveOS disk device. How can one achieve this? Most importantly: is rd.live.overlay.overlayfs=1 temporary, in RAM (as stated just earlier), or persistent, as stated here?

If a persistent overlay is detected at the standard LiveOS path, the overlay & overlay type detected, whether OverlayFS or Device-mapper, will be used. How is the persistent overlay triggered? And how do we know whether it will be OverlayFS or Device-mapper?
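One way to at least see which overlay type ended up in use after boot is to look at the root mount. The sketch below only classifies two strings, so it does not explain how dracut makes the choice. It assumes an OverlayFS root shows up with filesystem type overlay and that the Device-mapper case uses /dev/mapper/live-rw as in the listings earlier in this thread; root_overlay_kind is a hypothetical helper name, and ext4 in the demo is an assumption about the image's filesystem.

```shell
# Classify the root mount from its filesystem type and source device.
# Typical input on a running system:
#   root_overlay_kind $(findmnt -n -o FSTYPE,SOURCE /)
root_overlay_kind() {
    fstype=$1
    source=$2
    if [ "$fstype" = "overlay" ]; then
        echo "OverlayFS"
    elif [ "$source" = "/dev/mapper/live-rw" ]; then
        echo "Device-mapper"
    else
        echo "unknown"
    fi
}

# Demo with the root device seen earlier in this thread.
root_overlay_kind ext4 /dev/mapper/live-rw
```

For the boots shown earlier in the thread this reports Device-mapper.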

  1. How is the OverlayFS-based overlay related to the following citation:

Another option for non-persistent storage is to use the rd.writable.fsimg

rd.writable.fsimg=1

Enables writable filesystem support. The system will boot with a fully writable (but non-persistent) filesystem without snapshots (see notes

  1. If snapshot/thin is a persistent storage option, what is its advantage ?

Enables the usage of thin snapshots instead of classic dm snapshots. The advantage of thin snapshots is that they support discards, and will free blocks that are not claimed by the filesystem. In this use case, this means that memory is given back to the kernel when the filesystem does not claim it anymore.
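To watch that effect, one could look at how many data blocks the thin pool has in use, for example before and after freeing space. A rough sketch, assuming the thin-pool status line carries used/total data blocks in its sixth field (as described in the kernel's thin-provisioning documentation) and that the line comes from dmsetup status with the pool name given; thin_pool_used_mib is a made-up helper, and the status line in the demo is hypothetical.

```shell
# Print the data space used in a thin pool, in MiB.
# Arguments: one line of `dmsetup status <pool>` output, and the pool's
# data block size in 512-byte sectors (1024 in the tables shown earlier).
thin_pool_used_mib() {
    status_line=$1
    block_sectors=$2
    printf '%s\n' "$status_line" | awk -v bs="$block_sectors" '{
        split($6, data, "/")      # field 6 is used/total data blocks
        printf "%d\n", data[1] * bs * 512 / (1024 * 1024)
    }'
}

# Hypothetical status line; real values will differ.
thin_pool_used_mib '0 67108864 thin-pool 1 40/4096 11840/65536 - rw discard_passdown queue_if_no_space -' 1024
```

With these made-up numbers the result is 5920 MiB. On a live system the first argument would be "$(dmsetup status live-overlay-pool)".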

What about rd.live.overlay.cowfs=[btrfs|ext4|xfs] ?

Indeed help/clarification is welcome !

@fgrose Let me describe my scenario: on Ubuntu live, or whatever other live system, I start the live system, install different software, and test whether it works on a specific computer and how I like that software, etc.

On Fedora: I start the live system, install some software, and then Fedora crashes because it runs out of memory. I am doing that on a computer with 32 GB of RAM, but it doesn’t matter. That is my problem. How can that be solved easily? It has been like this in Fedora for ages. Really annoying.

I think the other distros’ live images/installers are built with that purpose in mind, while Fedora’s Live Image/Installer never was. Meaning it (likely) wasn’t considered as important, since the “everything installer” provides such installation flexibility for personal customization. The Live Image, meanwhile, was always viewed as a “try before you buy” way to get to know Fedora Workstation, which is Gnome-centric.

Your use case @qfghnrtd is very similar to mine.

According to the documentation (Booting live images), it should be as easy as adding rd.live.overlay.overlayfs to the kernel command line and thus getting up to 32 GiB of filesystem space in RAM to install and test whatever you like.

But I get the above-mentioned issues:

If the patch I submitted gets accepted, adding rd.live.overlay.thin on the kernel command line and then running resize2fs /dev/mapper/live-rw once the Live image has booted should be sufficient to make it work. Maybe those could even be made the default for future Fedora Live images.
