I’m trying to set up a CoreOS install to run some rootless podman containers and I can’t for the life of me get it working. As far as I can tell I’ve done everything correctly but I keep getting mysterious permissions issues which makes me think I’m missing something somehow.
I’ve set up 2 users using butane/ignition - the default core user and a second non-wheel user for running the containers. Linger is enabled for the second user. I’ve enabled podman-auto-update by linking it to the user’s .config/systemd/user/default.target.wants directory.
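For anyone wanting to sanity-check the linger part of a setup like this, here is a minimal sketch. It assumes the usual systemd-logind behaviour of keeping one marker file per lingering user under /var/lib/systemd/linger. The helper name linger_enabled and the LINGER_DIR override are made up for illustration, not part of any tool.

```shell
# Hypothetical helper: succeeds if a linger marker exists for the given user.
# LINGER_DIR is an illustrative override so the check can be pointed at a
# scratch directory; the default is where systemd-logind keeps the markers.
linger_enabled() {
  [ -e "${LINGER_DIR:-/var/lib/systemd/linger}/$1" ]
}
```

Usage would be something like linger_enabled sleeper && echo "linger is on", where sleeper is the account name that comes up later in this thread.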
If I try to run podman ps as the container user I get:
Failed to obtain podman configuration: mkdir /run/user/0/libpod: permission denied
Running the command as root shows no containers running; running it as core shows the same result but with a slew of warnings about cgroupv2 settings.
Attempting to troubleshoot with any systemctl --user command gives:
Failed to connect to user scope bus via local transport: Operation not permitted (consider using --machine=<user>@.host --user to connect to bus of other user)
Or, if running as core:
Failed to connect to user scope bus via local transport: $DBUS_SESSION_BUS_ADDRESS and $XDG_RUNTIME_DIR not defined (consider using --machine=<user>@.host --user to connect to bus of other user)
What is going on here? I thought the entire point of CoreOS was that podman and quadlets should just work but it seems even systemd isn’t working properly…
Strictly speaking no, I’m moving from Docker, but I don’t think the containers are the issue. For testing purposes I removed all the containers and set up the example hello world one from the docs (Running Containers :: Fedora Docs - the quadlet example; I tried both multi-user.target and default.target, and neither works). I don’t think the issue is just with quadlet though, because I can’t even run user mode systemd commands. The errors about the runtime environment and permissions seem to be related to the runtime variables not being set correctly, but I have no idea why that would be the case when it’s a pre-canned immutable OS that’s set up declaratively, nor what the correct way to fix it is (particularly for the core user, since that’s built in)…
Would you mind sharing your entire Butane configuration for hello.service that uses Podman Quadlet, so we can try to reproduce the issues in a virtual machine?
So it turns out the problem was how I was interacting with the system; I’m a bit embarrassed I didn’t think of this before. The butane file I’m using is fairly big because it also holds a fair bit of config for another service (which was working), so I wanted to trim it down to focus on the issue. While doing that I realised the issue was that I was using su to access the linger user. I’d assumed, wrongly, that this would work, as the docs don’t mention setting access credentials for the user in the example here: Launching a user-level systemd unit on boot :: Fedora Docs. Running systemd commands from root or core doesn’t show anything either way, because they don’t list other users’ services. Adding an ssh key directly to the other user and logging in showed it was actually running hello world successfully.
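The earlier /run/user/0/libpod error is consistent with su carrying root's XDG_RUNTIME_DIR over into the other user's shell, though that is an inference from the path in the message, not something confirmed here. A rough check, assuming the standard systemd-logind layout of /run/user/<uid> (the function name check_runtime_dir is made up for illustration):

```shell
# Hypothetical helper: reports whether XDG_RUNTIME_DIR points at the runtime
# directory systemd-logind would create for the *current* user.
check_runtime_dir() {
  expected="/run/user/$(id -u)"
  if [ "${XDG_RUNTIME_DIR:-}" = "$expected" ]; then
    echo "ok: XDG_RUNTIME_DIR=$expected"
  else
    echo "mismatch: XDG_RUNTIME_DIR='${XDG_RUNTIME_DIR:-}', expected '$expected'"
    return 1
  fi
}
```

Running check_runtime_dir in a shell reached via su versus one reached via a direct ssh login should make the difference visible, if this guess is right.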
What didn’t change though was I still get an $XDG_RUNTIME_DIR not set error - I can get around this by setting it manually but shouldn’t this be set automatically? The snippet of the butane file defining the user is just:
su is a legacy tool that does not properly set up the environment when changing users. You should at least use sudo -u <user> foo to run basic commands, or use machinectl, or get a proper login shell.
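To make those alternatives concrete, here is a small sketch of two wrappers, meant to be run from an admin account such as core. The function names are invented for illustration. The --machine=<user>@.host form is the one the systemctl error message itself suggests, and the machinectl invocation is the one used elsewhere in this thread. Whether sudo is needed depends on the local polkit setup, so treat that as an assumption.

```shell
# Hypothetical wrappers; names are illustrative only.

# Run a `systemctl --user` command against another user's service manager.
user_systemctl() {
  user=$1
  shift
  sudo systemctl --user --machine="${user}@.host" "$@"
}

# Open a proper session shell for another user via machinectl.
user_shell() {
  sudo machinectl shell --uid="$1"
}
```

For example, user_systemctl sleeper status hello.service would query the hello.service unit in the sleeper user's manager, assuming both names match your setup.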
When trying to use systemctl --user commands: Failed to connect to user scope bus via local transport: $DBUS_SESSION_BUS_ADDRESS and $XDG_RUNTIME_DIR not defined
Playing around some more, this specifically happens if I ssh into the second user account; if I ssh into core and then switch over with sudo machinectl shell --uid=sleeper, I don’t get the error. I can also manually set the runtime directory and it’ll work after that. At this point I can get everything I need working; I’m mostly just a bit confused why the runtime directory isn’t being set up properly for sleeper over ssh even though it works for core.
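For reference, the manual workaround mentioned above could look roughly like this. It is a sketch, not a fix for the underlying cause: it assumes the runtime directory already exists at /run/user/<uid> (which linger should ensure) and that the user bus socket lives at the usual $XDG_RUNTIME_DIR/bus path. The function name ensure_user_runtime_env is made up.

```shell
# Hypothetical helper: fill in the two variables systemctl --user complains
# about, but only when the login session did not already provide them.
ensure_user_runtime_env() {
  if [ -z "${XDG_RUNTIME_DIR:-}" ]; then
    export XDG_RUNTIME_DIR="/run/user/$(id -u)"
  fi
  if [ -z "${DBUS_SESSION_BUS_ADDRESS:-}" ]; then
    export DBUS_SESSION_BUS_ADDRESS="unix:path=${XDG_RUNTIME_DIR}/bus"
  fi
}
```

Calling ensure_user_runtime_env in the ssh session before systemctl --user or podman ps leaves existing values alone, so it should be harmless in sessions where the variables are already set correctly.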