Afterburn (on vSphere) not functioning?

I occasionally install OKD clusters on vSphere (UPI) in environments without DHCP and therefore need a static IP configuration. So far I have done this by
a) setting the NetworkManager config file and the hosts/hostname files through Ignition and
b) modifying the virtual disks of the VMs before booting by creating ignition.firstboot files with the initial IP configuration, so that merging remote Ignition files works.

I have now read that FCOS also supports Afterburn, and I thought I could replace b) by configuring Afterburn to simplify the setup process a bit.

Unfortunately, I cannot get this to work. The documentation looks simple: it seems I only have to set “guestinfo.afterburn.initrd.network-kargs” to a suitable “ip=…” configuration. But when I do so, it simply seems to be ignored. When the OKD bootstrap node boots, it cannot merge the remote Ignition file and complains about “network unreachable”.

What could I have missed? Is there something needed in addition that is not yet documented?
Did anyone have success using afterburn for such a use case?

I assume you’re talking about the docs here and here?

I’m not sure why that wouldn’t be working. Offhand you might try adding rd.neednet=1 to the configuration you pass to see if that unsticks things, but it shouldn’t be required.

The full logs from the system boot would probably give us a little more insight.

Yes, these were exactly the docs I was using.
I have already tried “rd.neednet=1” but it did not make a difference.

How do I retrieve “full logs from the system boot”?
You mean these are stored in the VM disk? Where?
Or do I need to capture console output somehow? How?
(I already tried attaching a virtual serial console but that did not work either.)

I see someone wrote this in Ask Fedora already in September but never got an answer:

Sounds exactly like my problem.

Does it drop you to an emergency shell? You could run journalctl and get them that way.

Serial console should work too, did you attach to it during boot?

Another option would be to modify the disk image with the ignition.firstboot file like you were doing in the past, but also provide the args via guestinfo.afterburn.initrd.network-kargs. Once the system is up we should at least be able to see some log messages about what Afterburn was trying to do.

For what it’s worth, the following works for me with FCOS 33.20201201.3.0 on vSphere 6.7. It appears that the FCOS documentation is incorrect with regard to the interface name. The interface name is “ens192”, not “ens9”. Hope this helps.

VM_NAME='fcos-staticip-test'
IFACE='ens192'
IPCFG="ip=10.103.2.95::10.103.0.1:255.255.255.0:${VM_NAME}:${IFACE}:off"

govc vm.change -vm "${VM_NAME}" -e "guestinfo.afterburn.initrd.network-kargs=${IPCFG}"
govc vm.info -e "${VM_NAME}"
govc vm.power -on "${VM_NAME}"

I’ve updated the value in a fork of the docs and generated a pull request to incorporate the fix.
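
For readers unfamiliar with the value passed above, here is a minimal sketch of how the static “ip=” string is put together. It assumes the usual dracut field order (address, peer, gateway, netmask, hostname, interface, autoconf); the function name build_ip_karg is hypothetical and not part of govc, FCOS or Afterburn.

```shell
# Hypothetical helper: assemble a dracut-style static "ip=" argument.
# Assumed field order:
#   ip=<address>:<peer>:<gateway>:<netmask>:<hostname>:<interface>:<autoconf>
# Arguments: $1 address, $2 gateway, $3 netmask, $4 hostname, $5 interface
build_ip_karg() {
    # The peer field stays empty and autoconf is "off" (no DHCP),
    # matching the value used in the post above.
    printf 'ip=%s::%s:%s:%s:%s:off\n' "$1" "$2" "$3" "$4" "$5"
}

build_ip_karg 10.103.2.95 10.103.0.1 255.255.255.0 fcos-staticip-test ens192
```

The result could then be passed to govc in the same way as IPCFG above, for example with -e "guestinfo.afterburn.initrd.network-kargs=$(build_ip_karg …)".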

I had noticed the ens9 vs. ens192 error and used ens192 but it did not work either.

BTW, I am trying this with FCOS 32.20201104 … will it work with that already, too?
I cannot use FCOS 33 because of another bug with it and OKD (/etc/resolv.conf problem).

I can try with ignition.firstboot also set, first…
So, I did it and retrieved the boot log with journalctl -k.

I can see the kernel arguments with networking parameters injected:

Dec 28 21:10:49 localhost kernel: Command line: BOOT_IMAGE=(hd0,gpt3)/ostree/fedora-coreos-a73d20d77e7ff7d9fb0d3c856edb9531f8eca8f8433bf61736da20befca2bbde/vmlinuz-5.8.17-200.fc32.x86_64 mitigations=auto,nosmt systemd.unified_cgroup_hierarchy=0 console=tty0 console=ttyS0,115200n8 ignition.firstboot rd.neednet=1 ip=193.149.36.208::193.149.36.254:255.255.252.0:bootstrap:ens192:none nameserver=193.149.36.118 ostree=/ostree/boot.1/fedora-coreos/a73d20d77e7ff7d9fb0d3c856edb9531f8eca8f8433bf61736da20befca2bbde/0 ignition.platform.id=vmware

However, these are the arguments from my ignition.firstboot file. (The ignition.firstboot file and the Afterburn arguments carry the same IP data, but the Afterburn string ends with “:off” while the one in ignition.firstboot ends with “:none nameserver=…”.)

So the Afterburn arguments are simply not there. Does FCOS32 not yet support Afterburn?

Just for testing things out it might be worth seeing if you can get it working against F33 like @jaimelm just to make sure we’re not dealing with another bug.

You’ll need to paste the whole journal somewhere. The arguments injected by Afterburn come during the boot process. They won’t show on those first lines (the ones you’ve pasted) because they are “injected” by Afterburn, which runs later.
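
As a rough sketch of what “the whole journal” means here: journalctl -k prints only kernel messages, while journalctl -b prints everything from the current boot, including userspace services that ran in the initrd. The filter below is a hypothetical helper (the name and the search terms are assumptions) that narrows such a journal down to lines likely related to initrd network setup.

```shell
# Hypothetical helper: keep only journal lines that are likely relevant
# to initrd network setup. Reads journal text on stdin.
filter_initrd_network_lines() {
    # Case-insensitive match; "|| true" keeps the exit status at 0
    # even when nothing matches.
    grep -iE 'afterburn|dracut-cmdline|Command line:' || true
}

# Usage on a booted node (all messages of the current boot):
#   journalctl -b --no-pager | filter_initrd_network_lines
```

If nothing mentioning Afterburn shows up in the filtered output, that by itself does not prove it never ran, so the full journal is still the better thing to share.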

FCOS 32 works generally. Here’s a quick test I did.

[me@lsa-linux-dev bin]$ cat testHostStaticIP.sh
#!/bin/bash

LIBRARY="Linux ISOs"
TEMPLATE_NAME="fedora-coreos-32.20201104.3.0-vmware.x86_64"
VM_NAME='fcos-staticip-test'
IFACE='ens192'
IPCFG="ip=10.103.2.95::10.103.0.1:255.255.255.0:${VM_NAME}:${IFACE}:off"
IP_ADDRESS="10.103.2.95"

# Check that the address is not already in use before deploying.
if ping -c 1 "${IP_ADDRESS}" > /dev/null; then
    echo "Host is up at the desired IP."
else
    echo "Host not available"
fi

# Deploy from the content library, set the Afterburn kargs, then power on.
govc library.deploy --folder "/vm/Linux/FCOS/" "${LIBRARY}/${TEMPLATE_NAME}" "${VM_NAME}"
govc vm.change -vm "${VM_NAME}" -e "guestinfo.afterburn.initrd.network-kargs=${IPCFG}"
govc vm.info -e "${VM_NAME}"
govc vm.power -on "${VM_NAME}"

# Poll until the VM answers at the static address.
while true; do
    echo "Testing…"
    if ping -c 1 "${IP_ADDRESS}" > /dev/null; then
        echo "Host is up at the desired IP."
        break
    else
        echo "Host not available"
        sleep 20
    fi
done

Running it…

[me@lsa-linux-dev bin]$ ./testHostStaticIP.sh
Host not available
[28-12-20 19:57:16] Deploying library item…
Name: fcos-staticip-test
Path: /vm/Linux/FCOS/fcos-staticip-test
UUID: 42238b66-fc6d-f313-2669-0537e2f27fae
Guest name: Red Hat Enterprise Linux 7 (64-bit)
Memory: 4096MB
CPU: 2 vCPU(s)
Power state: poweredOff
Boot time:
IP address:
Host: vcloud6
ExtraConfig:
nvram: fcos-staticip-test.nvram
pciBridge0.present: TRUE
svga.present: TRUE
pciBridge4.present: TRUE
pciBridge4.virtualDev: pcieRootPort
pciBridge4.functions: 8
pciBridge5.present: TRUE
pciBridge5.virtualDev: pcieRootPort
pciBridge5.functions: 8
pciBridge6.present: TRUE
pciBridge6.virtualDev: pcieRootPort
pciBridge6.functions: 8
pciBridge7.present: TRUE
pciBridge7.virtualDev: pcieRootPort
pciBridge7.functions: 8
hpet0.present: TRUE
vmware.tools.internalversion: 0
vmware.tools.requiredversion: 10272
migrate.hostLogState: none
migrate.migrationId: 0
migrate.hostLog: fcos-staticip-test-7d277a8b.hlog
guestinfo.afterburn.initrd.network-kargs: ip=10.103.2.95::10.103.0.1:255.255.255.0:fcos-staticip-test:ens192:off
Powering on VirtualMachine:vm-14735… OK
Testing…
Host not available
Testing…
Host is up at the desired IP.
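
To confirm that the property really is set on a VM, the ExtraConfig listing above can be checked mechanically. This is only a sketch: extra_config_value is a hypothetical helper, and it assumes the “key: value” line layout shown in the output above.

```shell
# Hypothetical helper: print the value of one ExtraConfig key from
# `govc vm.info -e` output read on stdin. Prints nothing if the key is absent.
# Argument: $1 the key name, e.g. guestinfo.afterburn.initrd.network-kargs
extra_config_value() {
    awk -v key="$1" '{
        line = $0
        sub(/^[ \t]+/, "", line)            # drop leading indentation
        prefix = key ":"
        if (index(line, prefix) == 1) {     # literal match, no regex on the key
            val = substr(line, length(prefix) + 1)
            sub(/^[ \t]+/, "", val)         # drop blanks after the colon
            print val
        }
    }'
}

# Usage:
#   govc vm.info -e "${VM_NAME}" | extra_config_value guestinfo.afterburn.initrd.network-kargs
```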

Here is the complete boot log from which I had cited the kernel args line:
http://www.ars.de/file/bootlog.txt
I have looked through it but nothing caught my eye that I would associate with Afterburn.

BTW: I also tried the same with RHCOS 4.5.8, where the docs now also say that it supports Afterburn, but had the same result: it did not work. So I may be making some systematic mistake … but which?

I’ve saved the logs of successful deployments of FCOS 32 and 33 on vSphere with static IPs for comparison.

http://www.jaime4a2.org/random/fcos-32-static-ip-boot.txt
http://www.jaime4a2.org/random/fcos-33-static-ip-boot.txt

Maybe I’m missing something, but the dracut network configuration insertion doesn’t seem to actually happen in the deployment of your VM. For example, I have the following in my logs…

Dec 30 04:14:03 localhost dracut-cmdline[260]: dracut-32.20201104.3.0 (CoreOS) dracut-050-61.git20200529.fc32
Dec 30 04:14:03 localhost dracut-cmdline[260]: Using kernel command line parameters: rd.driver.pre=btrfs ip=10.103.2.95::10.103.0.1:255.255.255.0:::off BOOT_IMAGE=(hd0,gpt3)/ostree/fedora-coreos-a73d20d77e7ff7d9fb0d3c856edb9531f8eca8f8433bf61736da20befca2bbde/vmlinuz-5.8.17-200.fc32.x86_64 mitigations=auto,nosmt systemd.unified_cgroup_hierarchy=0 console=tty0 console=ttyS0,115200n8 ignition.firstboot ostree=/ostree/boot.1/fedora-coreos/a73d20d77e7ff7d9fb0d3c856edb9531f8eca8f8433bf61736da20befca2bbde/0 ignition.platform.id=vmware

In fact, I’m not seeing dracut-cmdline run at all in your logs. It’s started by systemd, but never actually processes your input.

Question: Are you just adding the Afterburn network config info or are you also still doing “a” (making modifications to the vm via ignition)? Have you tried deploying a VM without any ignition modifications other than adding the core user? Can we see your ignition file?

Hope this is helpful.

The use case is a cluster node for an OKD cluster.

My understanding of Afterburn is that it only provides IP configuration for the initial boot/ignition process and that I still need to perform a static IP configuration via NetworkManager configuration file for the VM for the time after the initial boot/ignition.
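
For illustration, a NetworkManager configuration file of the kind mentioned here could look like the output of the sketch below. The helper name make_nm_keyfile and all addresses (taken from the 192.0.2.0/24 documentation range) are assumptions, not the values used in this thread; only the interface name ens192 comes from the discussion above.

```shell
# Hypothetical helper: print a NetworkManager keyfile for a static IPv4 setup.
# Arguments: $1 interface, $2 address in CIDR form, $3 gateway, $4 DNS server
# All example values below are placeholders, not taken from the thread.
make_nm_keyfile() {
    cat <<EOF
[connection]
id=$1
type=ethernet
interface-name=$1

[ipv4]
method=manual
address1=$2,$3
dns=$4;
EOF
}

make_nm_keyfile ens192 192.0.2.10/24 192.0.2.1 192.0.2.53
```

Such a file would normally be written by Ignition to /etc/NetworkManager/system-connections/ with a .nmconnection suffix and mode 0600, so that it applies after the first boot.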

Therefore I use two ignition files:

  1. the initial ignition file that is passed to the VM via vSphere guestinfo; this also contains the NetworkManager configuration file and /etc/hosts and /etc/hostname files and it “merges” then
  2. the OKD ignition file which was generated by openshift-install and this is merged remotely via HTTP.

Merging the OKD ignition file remotely via HTTP requires that the initial boot/ignition already has a functioning network configuration. So far I had provided this by modifying the VM’s virtual disk (creating an ignition.firstboot file) in situations where no DHCP was available and I needed a static IP configuration. When I read about Afterburn I thought it would be a better alternative for this step, and that is what I am now trying to implement.
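
The merge step described here can be sketched as a small “pointer” config. This shows only the merge part of the initial ignition file, not the NetworkManager or hosts/hostname files; the helper name, the URL and the spec version 3.1.0 are assumptions to be adapted to the release in use.

```shell
# Hypothetical helper: print a minimal Ignition config that merges a remote one.
# Arguments: $1 URL of the remote config, $2 optional spec version.
# The default spec version is an assumption; use the one your release supports.
make_pointer_ignition() {
    printf '{"ignition":{"version":"%s","config":{"merge":[{"source":"%s"}]}}}\n' \
        "${2:-3.1.0}" "$1"
}

# Placeholder URL for the helper HTTP server:
make_pointer_ignition "http://192.0.2.20:8080/bootstrap.ign"
```

Because the source is fetched over HTTP during the initrd stage, this is exactly the point at which a working “ip=” configuration has to be in place.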

I think that the use of ignition should not preclude Afterburn from working?
Because I will need both …

From your logs, you are already passing an ip=<...> parameter via bootloader kernel arguments (that’s the entry ending with :none). Custom kargs override Afterburn logic, so the guestinfo property is not used at all.

I’m not sure how you ended up with an OVA image first-booting with hard-coded custom kargs, but if you start from a fresh FCOS vmware image the Afterburn logic should work, as shown in @jaimelm’s case.
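
A quick way to check for this situation is to look for an existing “ip=” argument on the kernel command line. The sketch below is a hypothetical helper, not part of Afterburn; it only tests whether such an argument is present.

```shell
# Hypothetical helper: succeed if a kernel command line already contains
# an ip= argument. Argument: $1 the full command line as one string.
# The body runs in a subshell so that "set -f" does not leak to the caller.
has_ip_karg() (
    set -f                      # no globbing while splitting into words
    for arg in $1; do           # intentional word splitting on whitespace
        case "$arg" in
            ip=*) exit 0 ;;
        esac
    done
    exit 1
)

# Usage on a running node:
#   has_ip_karg "$(cat /proc/cmdline)" && echo "ip= is already set via kargs"
```

With the command line quoted earlier in this thread (the one ending in “:none nameserver=…”), the check succeeds, which matches the explanation that the guestinfo property was never consulted.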

I was using additional IP configuration via ignition.firstboot file (modifying the VM disk before booting it) on advice from @dustymabe in order to be able to access the VM after boot at all.

So I’m in a Catch-22 now. With ip= from ignition.firstboot, Afterburn does not kick in. But without ip= from ignition.firstboot the VM does not have any IP config and I have no way to access it to find out what went wrong (why Afterburn still did not work). Any other idea how to solve this?

I understand that you intend to use ignition for further modifications, but was trying to narrow down why afterburn was not working for you at all. Stripping away all other modifications seems like a good way to do that.

I’ve always done my OKD installs in vSphere utilizing DHCP. A quick search for documentation shows that, at least as of a few months ago, static IPs are achieved through modifications of the image, not through multiple ignition merges. Is there perhaps a technical reason for this?

I’m not doing “multiple ignition merges” - just one. Like it was documented in OKD docs until recently. All I’m doing is adding the static IP config files to the root ignition files.
I would also rather like to use DHCP (reservations) in all cases.
But for most of our customers, static IP assignment is a non-negotiable requirement for servers and k8s clusters in their on-premises networks.
But let’s not digress too much.

Just to repeat what I wrote above for that static IP case:
I boot VM disks where:

  • I made ONE modification, to ignition.firstboot, to add the static IP
  • then the NetworkManager configuration file, /etc/hostname and /etc/hosts are configured with the initial ignition file
  • the OKD ignition file is “merged” via HTTP from the helper http server (therefore I need working IP network in initrd stage).

That’s all. And I would just like to replace the first step with Afterburn because it sounds easier and “better”. But for that it needs to function properly. :)

I’ll do a few tests with bare FCOS image deployments first, leaving out the OKD deployment. Nevertheless, it will need to work in the case described above, too, or Afterburn is not usable for me and I need to stay with my previous method (modifying ignition.firstboot in the VM disk images before the first boot).

So I did a quick boot with a plain FCOS33 image with this simple sample ignition file:
http://www.ars.de/file/fcos33-ignition.yaml
With no DHCP server in the network, the system ended up with no networking.
I then started a DHCP server and then it got an IP and I was able to ssh into it.
I got this journalctl -k output:
http://www.ars.de/file/fcos33-bootlog.txt

I cannot see anything in there from Afterburn … :(

The Afterburn config set in vSphere was:

What am I doing wrong? I just don’t see the mistake …