Resolution
The issue is caused by slow access to the initramfs while the dracut-shutdown one-shot service restores it into /run/initramfs during shutdown.
Case where the issue happens at installation time
The root cause is poor bandwidth between the DVD image location and the hardware’s Management Console, which prevents dracut-shutdown from extracting the initramfs before it times out after 90 seconds.
Solution 1 - Use a jump host to mount the DVD image from a closer place
Ideally, DVD images should be mounted from a system that is close to the hardware’s Management Console, which speeds up the installation and prevents the issue from occurring during reboot.
Solution 2 - Let the service run for a longer time
If Solution 1 is not possible, edit the kickstart used to install the system to alter the behavior of the dracut-shutdown service during shutdown, allowing it to take as long as it needs.
This is done using a %post script running outside the chroot, as shown in the example below:
%post --nochroot
mkdir -p /etc/systemd/system/dracut-shutdown.service.d
cat > /etc/systemd/system/dracut-shutdown.service.d/stop-timeout.conf << EOF
[Service]
TimeoutStopSec=0
EOF
systemctl daemon-reload
%end
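To confirm what the drop-in written by the %post script above actually sets, a small parser can be used. This is a minimal sketch and not part of the original procedure: the helper name dropin_stop_timeout is hypothetical, and it only reads the file; it does not query systemd.

```shell
#!/bin/sh
# Hypothetical helper: print the TimeoutStopSec value set in a drop-in file.
# Prints nothing and returns 1 if the directive is absent or the file is missing.
dropin_stop_timeout() {
    file=$1
    # Keep the last assignment, since later entries override earlier ones.
    value=$(sed -n 's/^TimeoutStopSec=//p' "$file" 2>/dev/null | tail -n 1)
    [ -n "$value" ] || return 1
    printf '%s\n' "$value"
}

# Example (path taken from the kickstart snippet above):
#   dropin_stop_timeout /etc/systemd/system/dracut-shutdown.service.d/stop-timeout.conf
```

On a running system, systemctl show -p TimeoutStopUSec dracut-shutdown.service reports the effective value after all drop-ins have been merged.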
Case where the issue happens during a regular reboot
Ensure that the dracut package is at version dracut-049-209.git20220815.el8 or later.
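The version requirement above can be checked from a shell. The following is a rough sketch under stated assumptions: the helper name version_at_least is hypothetical, and it relies on GNU sort -V, whose ordering approximates but does not exactly match RPM's own version comparison.

```shell
#!/bin/sh
# Return 0 when version-release $1 is at least $2.
# Assumption: GNU "sort -V" ordering is close enough to RPM's own version
# comparison for dracut builds of the same major release; it is not identical.
version_at_least() {
    installed=$1
    required=$2
    # The lower of the two versions sorts first.
    lowest=$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Example, using the minimum version named above:
#   installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' dracut)
#   version_at_least "$installed" 049-209.git20220815.el8 && echo "dracut is recent enough"
```

For an authoritative answer, compare the output of rpm -q dracut against the version listed above by hand.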
If the issue still happens with a recent dracut package, proceed as follows.
The root cause is likely a problem accessing the hard disk that hosts /boot, but so far only one customer has reported this.
At the time of writing, no further investigation had been done. If this happens, please open a case on the Customer Portal referring to this Solution.
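When gathering data for such a case, it may help to measure how long a plain read of the initramfs image takes and compare it with the 90-second limit. This is only a sketch: the helper name read_seconds is hypothetical, the image path in the example assumes the usual RHEL naming, and a file already held in the page cache will read much faster than it would during a real shutdown.

```shell
#!/bin/sh
# Hypothetical helper: print how many whole seconds it takes to read a file.
# Returns 1 if the file cannot be read.
read_seconds() {
    file=$1
    start=$(date +%s)
    cat "$file" > /dev/null || return 1
    end=$(date +%s)
    echo $((end - start))
}

# Example: time a read of the initramfs for the running kernel.
# The path is an assumption based on the usual RHEL naming.
#   read_seconds "/boot/initramfs-$(uname -r).img"
```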
The solution consists of making sure that no switch root to /run/initramfs happens if the dracut-shutdown service fails during execution.
This is done by implementing drop-ins and services, as shown below.
- Create a new dracut-shutdown-onfailure.service unit as /etc/systemd/system/dracut-shutdown-onfailure.service
[Unit]
Description=Service executing upon dracut-shutdown failure to perform cleanup
DefaultDependencies=no
[Service]
Type=oneshot
ExecStart=/bin/sh -c '/bin/rm /run/initramfs/shutdown 2>/dev/null || true'
This service is responsible for cleaning up the extracted initramfs when the dracut-shutdown service fails, by removing /run/initramfs/shutdown.
- Create a drop-in for the dracut-shutdown.service unit as /etc/systemd/system/dracut-shutdown.service.d/on-failure.conf
[Unit]
OnFailure=dracut-shutdown-onfailure.service
- Create a drop-in for the plymouth-switch-root-initramfs.service unit as /etc/systemd/system/plymouth-switch-root-initramfs.service.d/after.conf
[Unit]
After=dracut-shutdown-onfailure.service
ConditionPathExists=/run/initramfs/shutdown
- Reload systemd for the changes to take effect
# systemctl daemon-reload
For convenience, the following command block can be copied and pasted to create the new files listed above:
mkdir -p /etc/systemd/system/dracut-shutdown.service.d /etc/systemd/system/plymouth-switch-root-initramfs.service.d
cat > /etc/systemd/system/dracut-shutdown-onfailure.service << EOF
[Unit]
Description=Service executing upon dracut-shutdown failure to perform cleanup
DefaultDependencies=no
[Service]
Type=oneshot
ExecStart=/bin/sh -c '/bin/rm /run/initramfs/shutdown 2>/dev/null || true'
EOF
cat > /etc/systemd/system/dracut-shutdown.service.d/on-failure.conf << EOF
[Unit]
OnFailure=dracut-shutdown-onfailure.service
EOF
cat > /etc/systemd/system/plymouth-switch-root-initramfs.service.d/after.conf << EOF
[Unit]
After=dracut-shutdown-onfailure.service
ConditionPathExists=/run/initramfs/shutdown
EOF
systemctl daemon-reload
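After running the commands above, the result can be double-checked before the next reboot. The following is a minimal sketch, not part of the original procedure: the helper name check_shutdown_hardening and its root argument are hypothetical, and it only verifies that each file exists and contains one expected directive.

```shell
#!/bin/sh
# Hypothetical helper: check that the unit and both drop-ins exist under a
# given root (use "" for the live system) and contain the expected directives.
check_shutdown_hardening() {
    root=$1
    base="$root/etc/systemd/system"
    status=0
    # Each entry is "<file relative to $base>|<line that must be present>".
    for entry in \
        "dracut-shutdown-onfailure.service|Type=oneshot" \
        "dracut-shutdown.service.d/on-failure.conf|OnFailure=dracut-shutdown-onfailure.service" \
        "plymouth-switch-root-initramfs.service.d/after.conf|ConditionPathExists=/run/initramfs/shutdown"
    do
        file="$base/${entry%%|*}"
        line=${entry#*|}
        # -x matches the whole line, -F treats the directive as a fixed string.
        if grep -qxF "$line" "$file" 2>/dev/null; then
            echo "ok: $file"
        else
            echo "MISSING: $line in $file"
            status=1
        fi
    done
    return $status
}

# Example on the live system:
#   check_shutdown_hardening ""
```

To see how systemd itself merges the drop-ins, systemctl cat dracut-shutdown.service prints the unit together with every drop-in that applies to it.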
Root Cause
- During shutdown, the dracut-shutdown one-shot service executes to unpack the initramfs used during boot into /run/initramfs
- Then a switch root to /run/initramfs happens so that the remaining system mounts (typically the root file system) can be completely unmounted
- The dracut-shutdown service has up to 90 seconds to do this task, after which systemd kills it because of the timeout
- If, for some reason, the service takes more than 90 seconds to complete or a failure happens, the switch root lands in a broken root file system tree that lacks critical binaries such as /sbin/reboot, which is among the last items extracted from the initramfs.
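The broken-tree condition described above can be recognized by checking whether the extracted tree contains the entries the shutdown sequence depends on. This is a sketch under stated assumptions: the helper name initramfs_tree_complete is hypothetical, the two paths it checks are taken from this article rather than from an exhaustive list, and /run/initramfs is normally only populated while the system is shutting down, so the check is mostly useful from a debug or emergency shell.

```shell
#!/bin/sh
# Hypothetical helper: report whether an extracted initramfs tree contains
# the entries the shutdown sequence depends on.
initramfs_tree_complete() {
    tree=$1
    status=0
    for path in shutdown sbin/reboot; do
        # -L also accepts symlinks whose target only resolves after switch root.
        if [ -e "$tree/$path" ] || [ -L "$tree/$path" ]; then
            echo "present: $tree/$path"
        else
            echo "missing: $tree/$path"
            status=1
        fi
    done
    return $status
}

# Example:
#   initramfs_tree_complete /run/initramfs
```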
RFE 1924587 ("RFE: Harden the shutdown phase to avoid dropping into the emergency prompt") has been filed to harden the shutdown process.