OKD on Silverblue by default?


#1

Hello,

So yesterday and today, I have been participating in a small Ansible hackfest in Paris (hosted by Scaleway.com, sponsored by Red Hat), and I decided to do as much as possible with my Silverblue laptop, rather than take the easy road of using my regular work laptop.

So I had plenty of time to wonder about pet containers, workflows, etc. during the morning presentations, and I continued to ponder today, when I had an interesting idea while showering, an idea that I want to share with people here.

My main concern with the pet container workflow is how to keep the containers updated. For example, if I start using a Dockerfile to create them, I would need to download a new version of the base image, rebuild manually, make sure I remember the tag I used, etc. Then I wondered about running cron jobs inside them, as I have a rather complicated mail setup for offline mail reading that I want to move to containers and would like to automate (offlineimap + dovecot + ssh over tor, but explaining it and the why is for another post).
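To make the chore concrete, here is a minimal sketch of what such a pet container could look like; the base image tag and package list are illustrative, not my actual setup:

```dockerfile
# Hypothetical pet container for the mail setup; tag and packages are illustrative.
FROM registry.fedoraproject.org/fedora:29

# Everything baked in here goes stale unless I rebuild by hand.
RUN dnf install -y offlineimap dovecot tor openssh-clients && \
    dnf clean all
```

Every time the base image updates, I would have to remember to pull it and rebuild with the same tag, which is exactly the manual bookkeeping I would like to automate away.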

And then I remembered that part of those problems is already solved by Openshift/OKD (I will say OKD, since that's the name of the upstream project now). As I guess non-sysadmins are not familiar with OKD (previously named Openshift Origin), let me quickly explain what the software does.

So OKD is a version of Kubernetes that adds a few features on top of the existing system; most of those extra features eventually find their way into Kubernetes in some form. Kubernetes, in turn, is a system to manage container deployments, built around the idea of declarative configuration for the state of your cluster.

For example, you declare how to get from a git repository to a container, and then OKD/Kubernetes does the build and the deployment (replacing the containers, dealing with the network, routing, etc.). It also handles automated restarts in case of crashes and linking related containers together, but that's more for production hosting than what we need here. This is mostly used for hosting server applications, bringing automation, best practices, and a common language for portability across different providers.
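As a rough sketch of what such a declaration looks like in OKD (the repository URL and names here are hypothetical):

```yaml
# Hypothetical OKD BuildConfig: the git URL and names are illustrative.
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: my-app
spec:
  source:
    git:
      uri: https://example.com/my-app.git   # placeholder repository
  strategy:
    sourceStrategy:                          # s2i build
      from:
        kind: ImageStreamTag
        name: python:latest
  output:
    to:
      kind: ImageStreamTag
      name: my-app:latest
  triggers:
    - type: ImageChange    # rebuild when the builder image updates
    - type: ConfigChange
```

Once this object exists, OKD takes care of the build and of pushing the result; nobody has to remember to rerun anything by hand.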

And a system to rebuild and deploy containers without human intervention is, IMHO, exactly what we would need. So what about having OKD installed by default, with a few base images, and using oc rsh to connect to the pet containers?

This would bring several benefits.

First, it would solve some of the issues from Openshift Origin on Silverblue, story of one thousand cuts. If the system is preinstalled, people can use it right away.

Second, it solves the pet container rebuild issue. We can tell people to use a build to create their container, and if the ImageStreams are configured right, everything should rebuild automatically and be usable with oc rsh. It also guides people toward the "right" approach, by using a file to describe the containers rather than doing it manually.
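For reference, this is roughly what a correctly configured ImageStream looks like: with a scheduled import policy, OKD periodically re-imports the tag, and ImageChange triggers then rebuild anything based on it (names are illustrative):

```yaml
# Hypothetical ImageStream tracking an upstream base image.
apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: fedora
spec:
  tags:
    - name: "29"
      from:
        kind: DockerImage
        name: registry.fedoraproject.org/fedora:29
      importPolicy:
        scheduled: true   # re-import periodically; dependent builds get retriggered
```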

Third, it solves the problem of persisting state. OKD/Kubernetes manages volumes, since container content is removed on restart, and in the pet container use case we would have to solve that somehow too. So this would give us a framework that is available and ready to use, especially since it integrates with SELinux, and so deals with the same problems we would have to solve anyway.
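A minimal sketch of the volume side, assuming a hypothetical claim name; a pod mounting this claim keeps its data across container restarts:

```yaml
# Hypothetical PersistentVolumeClaim to keep pet container state across restarts.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mail-state
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```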

Fourth, it solves the issue of running cron jobs in an image, since that's also a feature of OKD (now moved into Kubernetes). That is my exact use case of "I need to run stuff in a container with cron". Of course, I could also run cron inside the container, but then we need a process manager inside it, and I am not sure systemd supports that. I could run cron outside, but that would require layering crond (and I want to avoid layering). Finally, I could use systemd timers, but systemd needs special configuration (lingering, enabled with loginctl enable-linger) to run a timer-scheduled service when no session is open.
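For comparison, this is roughly what the cron use case looks like as a Kubernetes CronJob (batch/v1beta1 at the time of writing; the image name and schedule are illustrative):

```yaml
# Hypothetical CronJob fetching mail every 15 minutes; image is a placeholder.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: fetch-mail
spec:
  schedule: "*/15 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: offlineimap
              image: mail-pet:latest     # placeholder image built from the pet container
              command: ["offlineimap", "-o"]
          restartPolicy: OnFailure
```

No process manager inside the container, no layered crond, no lingering sessions: the cluster starts the job on schedule.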

Fifth, there would be an opportunity for a compelling story regarding deployment. For example, Fedora could provide three containers for Rust: one for the SDK (the interactive pet container), one to build (the s2i container, in OKD lingo), and one to deploy in production. Even better, since the system is automated, a user could customize the SDK pet container just by creating a new container that depends on the first, and so get the benefit of having the SDK maintained while still adding software on top. And then the user story could be to use the SDK and push the built container directly to a registry, into production.

And as one of the targets of Fedora Workstation is developers ( https://fedoraproject.org/wiki/Workstation/Workstation_PRD ), and since it seems that container-based deployment is where the industry is moving (based on the attendance at various meetups and events I have been to over the past four years), I think that offering easy access to such workflows would be aligned with the goals of Fedora Workstation. And since Kubernetes seems to be the big winner in that space, I think that should be the system used.

Now, Kubernetes for now does not seem to offer a workflow to build containers, even if I have heard that this is supposed to happen. But that's one of the features OKD provides, and the one we would need, hence the proposal to use it.

The biggest issue I see with this approach is that OKD is not laptop friendly. I did deploy it, and it took 10% CPU just because the various servers were chatting with each other to check that they were ok. While I am all in favor of self care and being social, even for Go software, this is a no-go as far as my laptop is concerned.

Another issue is that, for now, preinstalled means "shipped as an RPM", and that would require a lot of work. Or we would need a way to start it from a container, and that's also some work.

But in the end, I think we can solve multiple problems without reinventing the wheel by reusing industry standards.

So, what do people think of the idea?


#2

I honestly think this might be a good long-term goal, but it's going to take a lot of work to get there, and people might not want all of OpenShift Origin (OKD) just for this. An intermediate step I can think of is having your host (managed by rpm-ostree) and then one single pet container (managed by dnf). So when you upgrade the host, you just dnf update -y in the container too, as you would on a classic system. This intermediate step might be a crutch until we get to the longer-term goal?
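A minimal sketch of that intermediate step, assuming podman and an arbitrary container name:

```shell
# Create one long-lived pet container (the name "pet" is arbitrary).
podman run -dt --name pet registry.fedoraproject.org/fedora:29 sleep infinity

# After each host upgrade, update the pet container the classic way.
podman exec pet dnf update -y

# Drop into it for day-to-day work.
podman exec -it pet bash
```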


#3

Then, if the majority of my work is done in a container that I have to use like a regular Fedora system, I might as well install a regular Fedora system, so that I do not have to deal with the sandbox limitations and have twice the amount of system administration.


#4

Then if the majority of my work is done in a container that I will have to use like
a regular Fedora system, I can as well install a regular Fedora system,

One of the attractions of using a container on Silverblue is that you can tinker as much as you want inside the container without worrying about breaking your host OS. I think that’s a pretty big sell. Currently, on a regular Workstation, if one messes around inside /usr and /etc, then there’s no guarantee that the machine will still continue to work in a meaningful way.

It’s also easy to throw away a “dirty” development environment and rebuild it from scratch. Months and years of messing around with a mix of software built from source on top of OS binaries can lead to weird ABI mismatches and what not. It’s sometimes necessary to be able to “reset” all that and start afresh.

(There’s also the possibility that there might not be a “regular” Workstation once Silverblue hits the mainstream, but let’s ignore that.)

so I do not
have to deal with the sandbox limitation and have twice the amount of system administration.

Umm… forgive my ignorance. What’s the sandbox limitation?


#5

Every time I've looked at OpenShift, either "Origin" or the Online PaaS, I've come away overwhelmed by the magnitude of the learning curve and the minimal benefits. That's true of Platform-as-a-Service in general; I don't need yet another way to deploy a self-hosted WordPress blog or a Rails or Django app.

Back when all my friends were excited about Heroku and later Cloud Foundry, I thought, “Hey, this looks cool!” Then again, they’ve all moved on to serverless. :wink:


#6

IMHO it is quite overkill to have a full OKD running just for pet containers, and it raises the bar for hardware requirements too. Can you imagine doing an oc cluster up on a 6-year-old laptop with 4 GB of RAM? :sweat_smile:


#7

One of the attractions of using a container on Silverblue is that you can tinker as much as you want inside the container without worrying about breaking your host OS.

Just last week, being a Fedora noob, I managed to screw up my system by re-linking some shared libraries. It wasn't hard to repair once I figured out what was broken, but I'd much rather not have those kinds of worries.
Btw, I just installed Silverblue on my laptop. I first tried to do that with livecd-iso-to-disk and it didn't boot; no idea what was up with that. Anyway, I used the old and trusty disk destroyer (dd) and it's working fine.


#8

True, but if I do nothing on the main system except start containers, then a breakage in the pet container where I do my work will impact me about as much as having my main system broken in a regular setup. Granted, I would be in a better position to fix it, as I would have a browser, plenty of tools to look at it, and likely a way to salvage files. But if my pet container is broken because I have been careless, I can't work, just as I couldn't if my system were broken. So I kind of feel we are just pushing the problem one layer further.

Yup, but that goal also goes against the idea of a pet container, since people will likely customize it in a way that makes reinstallation painful for them. That's what I am trying to do, and if it is annoying for me to throw away after 2 or 3 hours of tinkering, I can see how people will be annoyed after 2 or 3 months.

In the end, that's already what we could do with our desktops, and I think the problem was never "initial installation is annoying", but rather the post-installation part where we have to recustomize everything. At least, that's my perception.

I think I was unclear on this, my bad. Let me explain in more detail.

So, first example. I was doing some CTF (Capture The Flag, a security challenge where you try to exploit vulnerabilities for fun) around Bluetooth ( http://www.hackgnar.com/2018/06/learning-bluetooth-hackery-with-ble-ctf.html ). I decided to make a quick and dirty pet container for that, mostly various "make && make install" to play with things, and I hit the first wall. It turns out that Bluetooth uses a specific socket type blocked by docker/podman, so I had to use the right options for that. And that's what I mean by sandbox limitations, i.e., me being limited by the podman/docker sandboxing.

Another example: I code in Python for $work. So I want to deploy my Flask prototype, and this requires deploying PostgreSQL. While I could use SQLite for some projects (and do, because it is easier), SQLite does not support the PostgreSQL features I use (stored procedures, for a start), so I can't use it. So I go into my container, do pip install, etc. And then I have to figure out how to start PostgreSQL. In my container, I can't use systemd (because it doesn't play nice), and there isn't much built-in support for starting two processes in a clean way. Ideally, I should use two containers. But then I get blocked because the two containers live in separate worlds with respect to the filesystem (so no socket sharing) and the network. So I again have to poke holes in the sandbox created by the namespaces used by podman/docker, while I wouldn't on a regular distribution.
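For what it's worth, newer podman versions have a pod concept that pokes exactly this hole in a slightly cleaner way, by letting several containers share one network namespace. A sketch, assuming hypothetical image names:

```shell
# Create a pod; containers inside it share a network namespace.
podman pod create --name devpod -p 8080:8080

# PostgreSQL and the app can now reach each other on localhost.
podman run -d --pod devpod -e POSTGRES_PASSWORD=dev postgres:10
podman run -d --pod devpod my-flask-image   # placeholder image for the Flask app
```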

And that's not just when I am dealing with my own code. I was also trying to do some WordPress customization (again for a personal project), and WordPress requires MySQL. So, same issue, and I did an oc cluster up + a deployment script to get it running (that's the story behind Openshift Origin on Silverblue, story of one thousand cuts).

A third example is my post on Uplink: My adventure of doing a flatpak package of Uplink

There is the part about bwrap needing --privileged (which is the "bypass the sandboxing" super flag), and the part about FUSE and --disable-rofiles-fuse. Both are issues created by the sandboxing, namespaces, and so on.

And that's just me, off the top of my head, and my experience doing containers on Silverblue: one for playing with some low-level stuff very much in the hobbyist tinkerer market (I could also tell how flashing an Arduino firmware requires --privileged), one for doing packaging work (so kind of a contributor workflow), and one for regular Python/PHP development.

And then there are cases where the sandbox (this time flatpak's) causes issues with specific features of software, for example with Inkscape: Inkscape on Silverblue, a few caveat to keep in mind. People also report issues with IDEs: Developing applications using Flatpak-packaged editors/IDEs

Again, this is kind of due to the sandboxing, and we constantly have to find ways around it. I know this is a hard problem, much like the ones SELinux and firewalls faced in the past.


#9

So, while I do agree that resources are something to look at, I looked at my laptop, and 4.5 to 5 GB of RAM is being used at the moment. It is running one instance of Firefox Nightly with 5 tabs (granted, one is a training video on JavaScript, so I suspect the video is somewhere in memory) and one regular Firefox with almost no tabs open (since I restarted the laptop). So I would postulate that 4 GB of RAM is already below what a typical workflow requires.

But this does raise the question of what resources we are aiming for. Because while the resource issue is valid, we do not really question the existing usage. I am sure we could shave a few megabytes here and there. For example, libvirtd could be started on demand, and the same goes for firewalld and cups. But in the end, we can't do miracles either if the memory is taken by GNOME Shell and Firefox.

So, in order to know how much we are discussing, I tried an "oc cluster up" installation again (spoiler: it failed with new error messages). From around 5.1 GB of memory used, it went to 6.0 GB. Out of all of this, I suspect a few components could be removed if the purpose is just "automated rebuild of containers", such as not starting the web console by default, not running a registry (or starting it on demand), etc. Each could shave off 100 or 200 MB.


#10

if I do nothing on the main system but start containers, then a breakage
in the pet container where I do work will likely impact me as much as
having my main system broken in a regular system. Granted, i would be
in a better position to fix it, as I would have a browser, plenty of tools to
look at it, and likely a way to salvage files.

I guess the specific outcome depends on the exact failure mode, and how we define work.

Like you said, one would still have a browser. Or in other words, if someone messes around with their libc.so inside the container, they can still expect their host OS to boot and log them into a graphical shell. So last night’s hacking session wouldn’t prevent a user from logging in next morning to check email.

However, if someone changed LD_LIBRARY_PATH in their .bash_profile to point to a broken prefix that’s visible from the host, then it might break their graphical log-in. But he/she can still boot and log in as root through a virtual console and unbreak it.

Since the home directory from the host is bind mounted into the pet container, recovering its contents is trivial. One could also spin up a second container and carry on until he/she has time to figure out what went wrong with the other one.

It’s also easy to throw away a “dirty” development environment
and rebuild it from scratch.

Yup, but that goal is also going against the idea of a pet container,
since people will likely customize in a way that make reinstallation
painful to them. That’s what I am trying to do, and if that’s annoying
for me to throw away after 2 or 3 hours of tinkering, I can see how
people will be annoyed after 2 or 3 months.

I understand that it can be useful to be able to automate how the containers are set up. I agree on that. :slight_smile:

I was trying to say that it's possible to have multiple pet containers co-exist side by side, e.g., one for Fedora 28 and one for Fedora 29 on a Fedora 27 host; or to throw away an old container that has gotten messed up with strange packages from third-party repositories and fails to update. It's possible to do the same today by setting up different prefixes, but one would still need some tool to switch between them, because doing it purely manually with .bash_profile would be too onerous.
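A sketch of that side-by-side setup with podman (container names are arbitrary):

```shell
# Two throwaway pet containers from different Fedora releases, side by side.
podman run -dt --name f28 registry.fedoraproject.org/fedora:28 sleep infinity
podman run -dt --name f29 registry.fedoraproject.org/fedora:29 sleep infinity

# If one gets messed up, discard and recreate it without touching the other.
podman rm -f f28
```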