Greetings folks!
For my bare-metal servers without a PXE environment, I wasn’t satisfied with the intricate re-provisioning via USB boot media. It also takes quite some time to copy the entire OS to the system’s drive, even though re-creating only /etc and /var could be quite efficient with an appropriate persistence strategy.
I have therefore – inspired by the factory reset discussion on GitHub – attempted to build a quick re-provisioning mechanism for already installed nodes. I have only tested it with simple configs but would like to get some feedback about my implementation and the limitations of such an approach.
At its heart is a service ordered after ostree-remount.service and before local-fs.target. It runs only if a /var/REPROVISION.ign file exists and reboots the system if it succeeds. Before executing the main script, it renames the Ignition file to REPROVISION-FAILED.ign to prevent boot loops.
/etc/systemd/system/quick-reprovision.service
[Unit]
Description=Quick Reprovision
DefaultDependencies=no
Requires=quick-reprovision-cleanup.service
After=ostree-remount.service
Before=local-fs.target
ConditionPathExists=/sysroot/ostree/deploy/fedora-coreos/var/REPROVISION.ign
SuccessAction=reboot
[Service]
Type=oneshot
ExecStartPre=/usr/bin/mv /sysroot/ostree/deploy/fedora-coreos/var/REPROVISION.ign /sysroot/ostree/deploy/fedora-coreos/var/REPROVISION-FAILED.ign
ExecStart=/usr/bin/bash /usr/local/lib/quick-reprovision.sh
TimeoutStartSec=0
[Install]
WantedBy=local-fs.target
The service’s main script extracts the relevant boot options from the current deployment and then creates a new OSTree deployment with a fresh /etc from the same commit. It then installs the new config as /boot/ignition/config.ign and creates /boot/ignition.firstboot so that Ignition runs on the next boot. Since some files in /var are likely to be open already at this point, everything in /var is moved to a “DELETE” sub-directory for now.
/usr/local/lib/quick-reprovision.sh
set -euo pipefail
export PATH="/usr/bin"
# get current deployment info
ostree_status="$(ostree admin status -J)"
current_check="$(jq -r '.deployments[] | select(.booted) | .checksum' <<< "$ostree_status")"
current_serial="$(jq -r '.deployments[] | select(.booted) | .serial' <<< "$ostree_status")"
# find current boot entry
for current_boot in /boot/loader/entries/*.conf; do
    ostree="$(sed -En 's/^options.* ostree=([^ ]+).*$/\1/p' "$current_boot")"
    depl="$(basename "$(readlink "$ostree")")"
    [[ "$depl" != "${current_check}.${current_serial}" ]] || break
done
# extract options to take over
new_options="$(sed -En 's/^options ?(.*) ostree=[^ ]+ ?(.*) root=.*$/\1 \2/p' "$current_boot")"
# create new deployment
args=()
for opt in $new_options; do args+=("--karg=$opt"); done
ostree admin deploy --retain --no-merge "${args[@]}" "$current_check"
# configure ignition execution
mount -o remount,rw /boot
install -m 0600 -D /sysroot/ostree/deploy/fedora-coreos/var/REPROVISION-FAILED.ign /boot/ignition/config.ign
touch /boot/ignition.firstboot
mount -o remount,ro /boot
# clear state
mkdir -p /sysroot/ostree/deploy/fedora-coreos/var/DELETE
find /sysroot/ostree/deploy/fedora-coreos/var -mindepth 1 -maxdepth 1 \( '!' -name DELETE \) \
-execdir mv -t /sysroot/ostree/deploy/fedora-coreos/var/DELETE/ -- '{}' + || true
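The kernel-argument hand-over in the script above can be tried in isolation. The following is a minimal sketch that runs the same sed expression against a made-up boot entry; the entry contents (UUIDs, checksum, option names) are assumptions for illustration only, not taken from a real node.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample entry; real ones live in /boot/loader/entries/*.conf.
entry="$(mktemp)"
cat > "$entry" <<'EOF'
title Fedora CoreOS (sample)
linux /ostree/fedora-coreos-0123abcd/vmlinuz
options mitigations=auto,nosmt ostree=/ostree/boot.1/fedora-coreos/0123abcd/0 console=tty0 root=UUID=1111-2222 rw rootflags=prjquota boot=UUID=3333-4444
EOF

# Same expression as in quick-reprovision.sh: keep what precedes ostree= and
# what sits between ostree= and root=; everything from root= onwards is dropped.
new_options="$(sed -En 's/^options ?(.*) ostree=[^ ]+ ?(.*) root=.*$/\1 \2/p' "$entry")"
rm -f "$entry"

# Deliberate word splitting: one --karg per whitespace-separated option.
args=()
for opt in $new_options; do args+=("--karg=$opt"); done

printf '%s\n' "${args[@]}"
```

With this sample entry, only the options before `ostree=` and those between `ostree=` and `root=` end up as `--karg` arguments; anything placed after `root=` in the entry is not carried over by this expression.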
A clean-up service will remove this /var/DELETE directory during the next boot before /var is mounted to its final location.
/etc/systemd/system/quick-reprovision-cleanup.service
[Unit]
Description=Quick Reprovision Clean-up
DefaultDependencies=no
After=sysroot-ostree-deploy-fedora\x2dcoreos-var.mount
Before=var.mount
ConditionPathExists=/sysroot/ostree/deploy/fedora-coreos/var/DELETE
[Service]
Type=oneshot
ExecStart=/usr/bin/rm -rf /sysroot/ostree/deploy/fedora-coreos/var/DELETE
TimeoutStartSec=0
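The stage-then-delete handling of /var can be rehearsed on a scratch directory before trusting it on a real node. This is only a sketch: the temporary directory stands in for /sysroot/ostree/deploy/fedora-coreos/var, the file names are made up, and GNU find and mv are assumed.

```bash
#!/usr/bin/env bash
set -euo pipefail
export PATH="/usr/bin:/bin"   # -execdir refuses to run if PATH has relative entries

# Scratch stand-in for the real /var (hypothetical content).
var="$(mktemp -d)"
mkdir -p "$var/lib/app" "$var/log"
touch "$var/lib/app/state.db" "$var/log/messages" "$var/.hidden"

# Step 1 (quick-reprovision.sh): move rather than delete, since parts of /var
# may still be in use until the reboot.
mkdir -p "$var/DELETE"
find "$var" -mindepth 1 -maxdepth 1 \( '!' -name DELETE \) \
    -execdir mv -t "$var/DELETE/" -- '{}' + || true

remaining="$(ls -A "$var")"
staged_count="$(find "$var/DELETE" -mindepth 1 -maxdepth 1 | wc -l)"
echo "top level after staging: $remaining"
echo "entries staged for deletion: $staged_count"

# Step 2 (quick-reprovision-cleanup.service): drop the staged tree on the next boot.
rm -rf "$var/DELETE"
left="$(ls -A "$var")"
echo "top level after clean-up: '${left}'"
rmdir "$var"
```

Hidden entries such as `.hidden` are staged too, because find matches on depth rather than on a shell glob.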
A convenience script validates a new config’s JSON format and places it at /var/REPROVISION.ign.
/usr/local/sbin/quick-reprovision
#!/usr/bin/bash
set -euo pipefail
# parse args until positional
args="$(getopt -l now,help -o nh -- "$@")"
eval set -- "$args"
now=false
while [[ "$1" != '--' ]]; do
    case "$1" in
        -n | --now ) now=true;;
        -h | --help ) echo 'usage: quick-reprovision [-n|--now] [-h|--help] [IGNITION-FILE]'; exit;;
    esac; shift
done; shift
# only continue if root
(( EUID == 0 )) || { echo >&2 '[error] root privs required'; exit 1; }
# handle positional args
(( $# < 2 )) || { echo >&2 '[error] excess positional params'; exit 1; }
path="${1:-"-"}"
[[ "$path" != '-' ]] || path="/dev/stdin"
# read config and make sure it is valid JSON object
config="$(< "$path")"
jq -er 'type == "object"' <<< "$config" >/dev/null 2>&1 || { echo >&2 '[error] invalid JSON'; exit 1; }
# write configuration
echo "$config" > /var/REPROVISION.ign
# reboot if requested
! $now || systemctl reboot
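To see what the JSON check in the script above accepts and rejects, here is a small sketch. `is_json_object` is a hypothetical helper name introduced for illustration, and the sample inputs are made up; the jq filter is the one from the script. Note that this only checks that the input parses as a JSON object. It does not validate the config against the Ignition spec, for which a dedicated tool such as ignition-validate would be needed.

```bash
#!/usr/bin/env bash
# Requires jq, as the original script does.

# Hypothetical helper wrapping the check used in quick-reprovision:
# succeeds only if the argument parses as JSON and is an object.
is_json_object() {
    jq -er 'type == "object"' <<< "$1" >/dev/null 2>&1
}

for candidate in \
    '{"ignition": {"version": "3.4.0"}}' \
    '["not", "an", "object"]' \
    '{"truncated": '
do
    if is_json_object "$candidate"; then
        echo "accepted: $candidate"
    else
        echo "rejected: $candidate"
    fi
done
```

Only the first sample should be accepted: the second is valid JSON but not an object, and the third does not parse at all.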
I can now remotely apply a new Ignition file by running the following.
ssh core@coreos-machine 'sudo quick-reprovision --now' < config.ign
If the new config messes things up, I can boot into the old deployment to get back my old /etc. My /var will already be purged though.
Once it’s honed, I’m planning to publish it as a pyromaniac library that can be easily imported into one’s configs.
What are your thoughts?