Zincati Version Control

Good Afternoon!

We’re currently in the process of migrating all of our nodes from CoreOS to Fedora CoreOS (FCOS), and one challenge that we’re facing is determining how to properly version control our updates.

Given that we have around 20 CoreOS clusters spread across Dev, QA, and Prod, it can take us 2-4 weeks to complete an upgrade across all of them, mainly due to strict change management controls, limited maintenance window availability, and a staggered rollout approach.

With that said, if we start an upgrade on a particular FCOS version, we want to ensure that the same version is deployed to all of our clusters. Since Zincati polls the FCOS Cincinnati servers for the latest available release, we're concerned that new FCOS versions might be introduced midway through our internal upgrade cycle.

For CoreOS (Container Linux), we were able to work around this by implementing something similar to the following gist, where we'd download the desired CoreOS version's update.gz file locally, create the necessary Omaha XML, and then point each CoreOS node's update-engine service at these files:

Does anyone know if a similar approach could be used to version control FCOS? If nothing exists, then I suppose the next best option is to look into deploying our own local FCOS Cincinnati server by reverse engineering something like:

Please let me know if you need any additional information.

Thanks so much!

This is a complex topic, and one that has also come up for OpenShift (which also uses Cincinnati), where users want to replicate the same upgrade graph internally. I think the current plan there is to ship the graph data as part of the release image.

Are you using https://github.com/coreos/airlock today? How are you managing the rollout?

(Ultimately, you can always disable Zincati and use whatever tooling you want to run rpm-ostree deploy $version followed by systemctl reboot.)
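A minimal sketch of that manual flow; note the version string below is only a placeholder example, to be replaced with whatever release you've qualified internally:

```shell
# Stop Zincati so it doesn't race the manual deploy with its own update.
sudo systemctl disable --now zincati.service

# Pin the node to an explicit FCOS version
# (the version string here is just an example).
sudo rpm-ostree deploy 31.20200323.3.2

# Boot into the newly staged deployment.
sudo systemctl reboot
```

This keeps the node on the deployed version until you run another rpm-ostree deploy, since Zincati is no longer polling Cincinnati for updates.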


This topic is also related to Fcos offline update server

Thanks for the quick replies @walters!

For now, we were planning to keep the upgrade architecture as simple as possible (avoiding Airlock). The idea is to enable Zincati's automatic update (and immediate reboot) feature on one node at a time as maintenance windows become available for the cluster, then serially work through the remaining intra-cluster nodes in this fashion until all nodes are updated to the desired FCOS version.
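For reference, toggling Zincati auto-updates per node can be done with a small TOML drop-in; the directory and key below follow Zincati's config layout, though the file name itself is arbitrary:

```toml
# /etc/zincati/config.d/55-updates-enabled.toml
# Set `enabled` to false to hold a node until its maintenance window opens,
# then flip it back to true (and restart zincati.service) to let it update.
[updates]
enabled = true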

We were trying to avoid using rpm-ostree directly, as I thought the community had recommended against that, but it could be another potential option for us. Thanks!

I’ve just confirmed that the rpm-ostree deploy $version solution appears to work, and it resolves our FCOS version control concern for now. Thanks again!

I think this fits into the larger topic of “auto-updates via a mirrored/proxied backend”, which we are tracking at https://github.com/coreos/fedora-coreos-tracker/issues/240.

In your case it looks like you want to use our backend for the ostree repository and updates metadata, but you’d want to have your own Cincinnati instance with custom policies.
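If you do eventually run your own Cincinnati instance, Zincati can be pointed at it with a drop-in like the following (the URL is a placeholder for a hypothetical internal server):

```toml
# /etc/zincati/config.d/50-local-cincinnati.toml
# Point Zincati at an internal Cincinnati server instead of the Fedora one.
[cincinnati]
base_url = "https://updates.example.internal"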