Wow. Interesting, tedious, fiddly details ahead, but I now kind of know what’s happening.
First of all: nothing’s broken. There’s no disk or filesystem problem. It’s purely configuration choices interacting in timing-dependent ways.
On boot, systemd-fstab-generator creates various *.mount unit files corresponding to /etc/fstab entries and places them in /run/systemd/generator. However, you can also copy, say, /run/systemd/generator/boot.mount to /etc/systemd/system/boot.mount, and systemd will use this static copy in preference to the generated one. Then you can customize it by adding environment variables that mount uses to boost its logging. I added the “Environment=” line and the [Install] section below to my /etc/systemd/system/boot.mount file:
...
[Mount]
Environment=LIBMOUNT_DEBUG=all
Where=/boot
What=/dev/sda1
Type=ext4
Options=defaults,rw
[Install]
WantedBy=multi-user.target
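For anyone wanting to repeat this, the copy step can be sketched in shell roughly as follows. The function name override_unit and the GEN_DIR/SYS_DIR variables are my own inventions for illustration; they default to the two directories named above and are overridable only so the function can be tried against scratch directories first.

```shell
#!/bin/sh
# Copy a generated unit into the static unit directory so it can be edited.
# GEN_DIR and SYS_DIR are hypothetical knobs, defaulting to the real paths.
override_unit() {
    unit="$1"
    gen_dir="${GEN_DIR:-/run/systemd/generator}"
    sys_dir="${SYS_DIR:-/etc/systemd/system}"
    cp -- "$gen_dir/$unit" "$sys_dir/$unit"
}

# Typical use, as root:
#   override_unit boot.mount
#   (edit /etc/systemd/system/boot.mount and add the Environment= line)
#   systemctl daemon-reload
#   journalctl -b -u boot.mount    # after the next boot, read the debug output
```

The daemon-reload and journalctl steps are left as comments because they only make sense on the live system.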
With no other changes, this reproduces the problem but adds significant debugging information from the mount process to the system journal. Crucially, it shows the actual error from the initial attempt to mount /dev/sda1 read-write. It isn’t that the device is write-protected as the later message claims, but rather that the device is busy. Something is already using that filesystem, and that something is LUKS. Or rather systemd-cryptsetup@.service, or maybe run-systemd-cryptsetup-keydev*.mount. That part isn’t exactly clear, and it’s so far down in the weeds I’m not sure I care. But here’s what’s going on.
Except for the /boot partition, all the others on this machine are encrypted. There are three LVM volumes in a LUKS-encrypted partition on /dev/sda, and one more on /dev/sdb. There are several ways for LUKS to get a password, and if they fail, the universal fallback is to prompt for it at the console. But suppose the machine is not readily reachable via a console, like in a lights-out server room, attic, or basement across town. It’s possible to configure your /etc/crypttab so that it will first attempt to read the password from a specific file in /boot (the only unencrypted partition available). You create the file when you need to, then reboot remotely. The system boots into its initrd, finds the password, decrypts the other partition(s), and completes booting; then you log in and delete that file. If you reboot and that file doesn’t exist, the fallback of prompting at a console is back in play. (You can also configure a special ssh session with keys that lets you remote into the partially booted system and enter a LUKS password that way, but that’s out of scope for our purposes.)
The catch is, while system services are still getting the various LUKS partitions unlocked and checking the filesystems therein, the device/partition containing that password is in use, and you can’t go messing with that mount. Basically, it’s already been mounted by the time boot.mount is invoked. You could do a -o remount,rw, but a straight-up mount will fail. If you add a few silly services like I did up-thread, then you change the timing of things such that LUKS and its systemd friends are done (maybe), control of /dev/sda1 (maybe) is relinquished by the time boot.mount runs, and the mount (sometimes) works.
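If you want to see for yourself whether the partition is already mounted somewhere, a rough check along these lines should do. The mounts_of helper is something I made up for this post; it just filters a mount table by device name, and it assumes the device shows up in the table as /dev/sda1 rather than under some other alias. The stock tool for the same job is findmnt --source /dev/sda1.

```shell
#!/bin/sh
# List the mount points where a device is currently mounted.
# $1: device path, e.g. /dev/sda1
# $2: mount table to read (defaults to the kernel's live table)
mounts_of() {
    awk -v dev="$1" '$1 == dev { print $2 }' "${2:-/proc/self/mounts}"
}

# Typical use, as root:
#   mounts_of /dev/sda1          # any output means it is already mounted
#   mount -o remount,rw /boot    # change flags on the existing /boot mount
```

The second argument exists only so the function can be pointed at a saved copy of the table instead of the live one.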
Maybe if you’re going so far down the rabbit hole as to have your encrypted system sometimes able to reboot with a temporary LUKS password in a temporary file in /boot, then adding a silly “Remount /boot as read-write” service isn’t so much farther down the rabbit hole after all.
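For what it’s worth, here is a minimal sketch of what such a remount service could look like, written out by a small shell function. Everything specific here is an assumption on my part rather than what I actually run: the unit name boot-remount-rw.service, the /usr/bin/mount path, and the After=boot.mount ordering, which presumes boot.mount has already left /boot mounted read-only.

```shell
#!/bin/sh
# Write a one-shot unit that remounts /boot read-write after boot.mount.
# $1: directory to write into (defaults to /etc/systemd/system)
# The unit name and the mount binary path are illustrative assumptions.
write_remount_unit() {
    dir="${1:-/etc/systemd/system}"
    cat > "$dir/boot-remount-rw.service" <<'EOF'
[Unit]
Description=Remount /boot as read-write
After=boot.mount

[Service]
Type=oneshot
ExecStart=/usr/bin/mount -o remount,rw /boot

[Install]
WantedBy=multi-user.target
EOF
}

# Typical use, as root:
#   write_remount_unit
#   systemctl daemon-reload
#   systemctl enable boot-remount-rw.service
```

Whether After=boot.mount is enough ordering on a given machine is exactly the kind of timing question this whole thread is about, so treat it as a starting point.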