Demo: Boot Fedora CoreOS in a GitHub Action from a Butane config

I managed to boot Fedora CoreOS with vboxmanage (a VirtualBox command) from a Butane config in a GitHub Action. It’s more of a proof-of-concept right now. If anyone is interested: https://github.com/eriksjolund/fedora-coreos-vm/commit/052fb7bcbcea9831601e994cef9e6fc2ea12ea8e
As I understand it the VM is running hardware accelerated.

I’ll try to clean up the code during the coming weeks. The original code came from

Here is a sketch of the architecture:

  1. Download metadata for the stream
  2. Extract the iso URL
  3. Download the iso (possibly from the GitHub cache)
  4. Boot the iso (with the vboxmanage command from VirtualBox)
  5. Use optical character recognition (pytesseract) to find the text "Press Tab for full configuration" on the terminal
  6. Press tab (with vboxmanage)
  7. Add kernel arguments (with vboxmanage)
  8. Press enter (with vboxmanage)
  9. The guest boots and downloads /first.ign
  10. The file first.ign specifies that another Ignition config should be merged: user.ign
  11. Download /user.ign
  12. first.ign specifies that a systemd service should run (type=oneshot)
    after sshd.service. The service will execute
    curl http://10.0.2.2:8082/notify-host-that-sshd-is-ready
  13. After receiving the request /notify-host-that-sshd-is-ready the host
    executes a few commands on the guest via ssh.
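
Steps 6 to 8 above could be driven roughly as in the following sketch. It only builds the vboxmanage command lines and does not run them. The VM name fcos-demo is hypothetical, and serving /first.ign on the same 10.0.2.2:8082 address as the notify endpoint is an assumption. The scancodes are the standard PC set 1 make/break codes for Tab and Enter; keyboardputstring assumes a US keyboard layout in the guest.

```python
import shlex

VBOXMANAGE = "VBoxManage"

# PC scancode set 1: a key press is a "make" code followed by a "break" code.
TAB = ["0f", "8f"]
ENTER = ["1c", "9c"]


def press(vm, scancodes):
    """Command line that sends raw scancodes to the guest keyboard."""
    return [VBOXMANAGE, "controlvm", vm, "keyboardputscancode", *scancodes]


def type_text(vm, text):
    """Command line that types a string on the guest keyboard."""
    return [VBOXMANAGE, "controlvm", vm, "keyboardputstring", text]


def boot_menu_commands(vm, kargs):
    """Tab to edit the boot entry, append kernel arguments, then Enter."""
    return [
        press(vm, TAB),
        type_text(vm, " " + " ".join(kargs)),  # leading space separates from existing args
        press(vm, ENTER),
    ]


if __name__ == "__main__":
    # Hypothetical VM name and Ignition URL, for illustration only.
    kargs = ["ignition.config.url=http://10.0.2.2:8082/first.ign"]
    for argv in boot_menu_commands("fcos-demo", kargs):
        print(shlex.join(argv))
```

Each list can be passed to subprocess.run once the OCR step has confirmed that the boot menu is on screen.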

Nice! You can use coreos-installer iso kargs modify to inject the kernel arguments directly into the ISO image, which should allow you to skip the OCR and scancode injection.

If you’re going to be using coreos-installer anyway, you could also use coreos-installer download -f iso rather than hand-parsing the stream metadata. That will also handle GPG verification for you.
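
For comparison, hand-parsing the stream metadata amounts to something like the sketch below. It assumes the published stream JSON layout (architectures, then the architecture name, then artifacts, metal, formats, iso, disk). The SAMPLE document is made up for illustration and its URL and checksum are placeholders. Note that checking the SHA-256 value is weaker than the GPG verification that coreos-installer performs.

```python
import hashlib
import json


def iso_artifact(stream, arch="x86_64"):
    """Return (location, sha256) of the live ISO from parsed stream metadata."""
    disk = stream["architectures"][arch]["artifacts"]["metal"]["formats"]["iso"]["disk"]
    return disk["location"], disk["sha256"]


def sha256_matches(path, expected_hex):
    """Hash a downloaded file in 1 MiB blocks and compare with the expected digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest() == expected_hex


# Made-up sample in the assumed layout; the URL and checksum are placeholders.
SAMPLE = json.loads("""
{
  "stream": "stable",
  "architectures": {
    "x86_64": {
      "artifacts": {
        "metal": {
          "formats": {
            "iso": {
              "disk": {
                "location": "https://example.invalid/fedora-coreos-live.x86_64.iso",
                "sha256": "0000"
              }
            }
          }
        }
      }
    }
  }
}
""")

if __name__ == "__main__":
    location, checksum = iso_artifact(SAMPLE)
    print(location, checksum)
```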

Thanks for the tip. Yes, using coreos-installer iso kargs modify would be an option. What complicates things somewhat is that I need to use the GitHub-hosted runner macos-10.15 to be able to run VMs with hardware acceleration. Hardware acceleration is not supported on the other GitHub-hosted runners. Unfortunately, coreos-installer does not support macOS 10.15 (clarified in a GitHub issue). I see some possible workarounds for how to use coreos-installer with GitHub Actions.

Do you have any advice on how to design this GitHub Action so that it is resource efficient?

The GitHub-hosted runner macos-10.15 has limited resources

  • 3-core CPU
  • 14 GB of RAM
  • 14 GB of SSD disk space

I’m speculating that these are some possible design alternatives:

a. boot a live ISO directly
b. boot a live ISO, install and reboot
c. boot ipxe ISO and then boot Fedora CoreOS via PXE.
d. boot ipxe ISO and then boot Fedora CoreOS via PXE, install and reboot

It seems the PXE functionality in VirtualBox is not open source:
https://www.virtualbox.org/manual/ch01.html#intro-installing
(I hope that booting an IPXE iso from ipxe.org could be an alternative to get PXE functionality in a GitHub Action)

I’ve started to redesign the demo so that it can start multiple Fedora CoreOS VMs. It’s not quite working yet but hopefully soon.

Makes sense. Booting via PXE would avoid both coreos-installer and OCR, so I’d try the iPXE ISO. From some quick Googling, it seems you can put boot files in ~/Library/VirtualBox/TFTP and VirtualBox will set DHCP next-server and provide a TFTP server. It’s possible that that part will still work even without the Intel boot ROM. tftp URLs work with the ignition.config.url kernel argument, so you could put the Ignition config there too.

If the goal is just to run some CI inside Fedora CoreOS, consider saving some I/O by skipping the OS install and running from the live system. (Note, though, that PXE boot takes ~700 MB more RAM than ISO boot.) You can still use Ignition to create a persistent /var partition if your use case needs a lot of storage.

Thanks for the advice. OK, supporting both ISO and PXE seems to be the way to go if they have different performance characteristics. (I haven’t tried out PXE yet, but I plan to.)

I’m thinking that this GitHub Action could be useful for CI, demos and bug reports. It would be nice to be able to create a demo for some software and at the same time be able to use the demo for automatic tests in continuous integration (CI) and potentially catch new bugs. It would be easy to provide instructions for how to reproduce such bugs when writing bug reports.

I added some new functionality so that it is now possible to configure how many VM instances each Butane config should have.

Today I ran a test where one VM ran an Nginx container with Podman and two other VMs ran curl to fetch the default web page from the Nginx web server. It worked!

The networking part of the code needs improvement. I would like to get DHCP working. Currently the IP address is provided via a dracut kernel command line option ip=... (see man dracut.cmdline).
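
As a rough illustration of the static setup, the sketch below generates one ip= argument per VM using the field order documented in man dracut.cmdline (client IP, peer, gateway, netmask, hostname, interface, autoconf method). The network 192.168.56.0/24, the gateway, the interface name enp0s3 and the vmN hostnames are all hypothetical values chosen for the example, not the ones used in the demo.

```python
import ipaddress


def static_ip_kargs(count, network="192.168.56.0/24", first_host=10,
                    gateway="192.168.56.1", interface="enp0s3"):
    """Build one dracut ip= kernel argument per VM.

    Format: ip=<client-IP>:[<peer>]:<gateway-IP>:<netmask>:<hostname>:<interface>:none
    The peer field is left empty; "none" disables autoconfiguration.
    """
    net = ipaddress.ip_network(network)
    kargs = []
    for i in range(count):
        addr = net.network_address + first_host + i
        if addr not in net:
            raise ValueError(f"address {addr} is outside {net}")
        kargs.append(f"ip={addr}::{gateway}:{net.netmask}:vm{i}:{interface}:none")
    return kargs


if __name__ == "__main__":
    for karg in static_ip_kargs(3):
        print(karg)
```

With DHCP working, each of these arguments would collapse to ip=dhcp or could be dropped entirely.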

I would like to highlight one design aspect that is quite fortunate: Both GitHub Action workflows and Butane use YAML. There is no need to wrap Butane configurations in some other format. Instead they can be provided as subtrees in the input to the GitHub Action. It is possible to copy-paste the Butane config from somewhere else, and then just add some extra indentation to use it in the GitHub Action workflow.

An interesting design question is whether the Ignition files need to be ready before the VMs are started. If they are not ready, one could hope that they become ready before any of the HTTP requests time out.

I found some timeout configuration parameters in the Butane file format specifications (for instance Fedora CoreOS Specification v1.4.0) but I couldn’t find any timeout configuration for the ignition.config.url kernel parameter.

If anyone is curious, currently the GitHub Action takes about 1 minute and 45-55 seconds to run. Allocating the GitHub-hosted runner might also take some time (often about 15 seconds).

One could hope that PXE could speed up booting. Besides having no dependencies on coreos-installer and OCR, the VM could start booting as soon as the kernel and initramfs files have been downloaded. I guess the big rootfs file does not need to be ready from the very beginning of the boot process.

Regarding the speed of the GitHub cache:

When I tested half a year ago I noticed that it was slightly faster to download the Fedora CoreOS ISO from the official URL than it was to download a previously cached Fedora CoreOS ISO from the GitHub cache. Currently the GitHub Action uses the GitHub cache (@actions/tool-cache). I will do some testing to see which way is the fastest.

About the status of the code

The code is still experimental and in flux. I’ll need to clean it up.

Awesome!

If the HTTP server is not available, Ignition will retry indefinitely. If the server is available but returns 404, Ignition will fail immediately.

Historically, the Fedora CoreOS initramfs did not retry fetching the rootfs. Current versions retry indefinitely, but the same caveat applies: if the server is running but returns 404, the initramfs will give up.
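
Given that behaviour, one way for the host to avoid an early 404 is to hold a request open until the file has been written. The following is a minimal sketch using only the Python standard library, not the demo's actual server. The 5 second wait is a hypothetical value, and how a given client reacts to a slow response is an assumption that would need checking against its own timeouts.

```python
import functools
import http.server
import os
import threading
import time


class WaitingHandler(http.server.SimpleHTTPRequestHandler):
    """Serve files from a directory, waiting briefly for files that do not exist yet."""

    wait_seconds = 5.0  # hypothetical limit; tune against the client's timeouts

    def do_GET(self):
        path = self.translate_path(self.path)
        deadline = time.monotonic() + self.wait_seconds
        while not os.path.exists(path) and time.monotonic() < deadline:
            time.sleep(0.1)
        # Normal handling from here: 200 if the file appeared, 404 otherwise.
        super().do_GET()


def publish(directory, name, data):
    """Write to a temporary name and rename, so a partial file is never served."""
    tmp = os.path.join(directory, name + ".tmp")
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, os.path.join(directory, name))


def serve(directory, host="127.0.0.1", port=8082):
    """Start the server in a background thread and return it."""
    handler = functools.partial(WaitingHandler, directory=directory)
    server = http.server.ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The VMs can then be started while the Ignition files are still being generated, as long as each file is published before the wait runs out.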

I’m curious as to why you want to use CoreOS to run your GitHub Actions rather than a container?

Booting up a VM could be useful if you need some new kernel feature that is not available on the GitHub-hosted runners.

There is for instance no GitHub-hosted runner that runs cgroupsv2. (ubuntu-20.04 is running cgroupsv1). That’s why the software project Kind (Kubernetes IN Docker) boots up a VM with Vagrant to be able to test its software with cgroupsv2 in a GitHub Action workflow.

Another use case is to demonstrate network filesystems. I would like to write an example with two Butane configs (one for an NFS client and one for an NFS server) and launch them in this GitHub Action. Another example could be to demonstrate a Slurm cluster.

One cool thing is that it would be possible to write such an example as a single file because the Butane configs can be given inline.

Ideas for other GitHub Actions

I think it would be useful to also create GitHub Actions for

  • building a Fedora CoreOS image
  • building kernel RPMs

to be able to combine all of the GitHub Actions to demonstrate new Linux kernel developments (e.g. Bcachefs).

Just a short update on some new findings:

Running

brew install -qf tesseract butane
pip3 install -q pytesseract

takes about 35-40 seconds

Running

brew install -qf butane

takes about 15-20 seconds.

I would guess it should be possible to speed up the ISO boot method by about 20 seconds by removing the OCR functionality (i.e. removing the software packages tesseract and pytesseract). Maybe it would be possible to boot by just pressing the keys at the right moment, for instance by waiting a certain number of seconds before pressing the keys. Another possibility could be to repeatedly take screenshots with

VBoxManage controlvm <vm name> screenshotpng <filename>.png

and compare the screenshots to a previously stored “expected” screenshot. The comparison would then use some simpler method than OCR. An idea could be to remove all metadata from the PNGs and just do a

/usr/bin/diff screenshot.png expected.png

(I know too little about PNGs to know whether that would work)
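
A minimal sketch of the metadata-stripping idea, using only the Python standard library: it walks the PNG chunk structure, keeps the IHDR header and the decompressed IDAT image data, and ignores every other chunk (text, timestamps and so on). This is an illustration under assumptions, not tested against VirtualBox screenshots. It compares filtered scanlines rather than decoded pixels, so it assumes both files come from the same encoder, and any on-screen difference such as a blinking cursor will still make the images differ.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_image_data(png_bytes):
    """Return (IHDR payload, decompressed IDAT stream), ignoring other chunks."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    header = None
    compressed = bytearray()
    while pos + 8 <= len(png_bytes):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length
        if chunk_type == b"IHDR":
            header = data
        elif chunk_type == b"IDAT":
            compressed += data  # IDAT chunks form one zlib stream
        elif chunk_type == b"IEND":
            break
    return header, zlib.decompress(bytes(compressed))


def same_picture(png_a, png_b):
    """True if two PNG byte strings carry the same header and image data."""
    return png_image_data(png_a) == png_image_data(png_b)


def files_match(path_a, path_b):
    """Convenience wrapper for comparing two files on disk."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        return same_picture(a.read(), b.read())
```

A polling loop could call files_match("screenshot.png", "expected.png") after each screenshotpng invocation and send the key presses once it returns True.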