Systemctl reboot doesn't work

A few upgrades ago the reboot function in Fedora seems to have stopped working. Instead, the system halts, and on restart I get a message along the lines of “no devices found”. This also happens on automatic updates: I have to power down and then restart to get the upgrade to start.

This problem now seems to have migrated to RHEL where it is a much bigger problem since the server is located remotely.

Any suggestions?

Thanks in advance.

Could it be something like BIOS Fast boot? Perhaps something like BIOS ACPI taking the reboot command and trying to “do something” with it (breaking the reboot from Linux’s side), and then on reboot the BIOS fails to find devices because the BIOS tried to re-use something that was supposed to come from the earlier “do something”.

Or if the missing device is with GRUB, it’s likely something with OpROMs, UEFI vs Legacy or CSM, and SATA/NVMe controller?


To answer the first part, both this workstation and the server use UEFI, not BIOS. Reboot used to work on this workstation with the current hardware; I am not so sure about the server, since I rarely want to reboot it.

The workstation uses all standard hardware with whatever came on the system controllers, so I can’t think of anything that might be peculiar. What would you do to diagnose it? On this workstation the disks are a mix of SSD and mechanical types (4 total, of which 3 are SSDs). The server used to be all SSDs but is slowly converting to hard disks because of the high failure rate of the WD SSDs (fast and unreliable isn’t an improvement over slow and dependable!).

Any further suggestions? This is all standard hardware with UEFI firmware using grub. It used to work until a few updates back (not sure, but I think the last F39 update stopped it working). At the start of an upgrade you have to manually do a cold boot to get the update running. Does anyone else have the problem or is this unique to my system?

I know it is possible to trigger that behavior by setting noacpi on the kernel command line. But short of that, I haven’t seen this problem.
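One way to rule this out is to look at the parameters the running kernel was actually booted with. The helper below is only a sketch and does not come from this thread: the name `acpi_params` is made up, and the list of patterns it looks for (`acpi=...`, `noacpi`, `pci=noacpi`, `reboot=...`) is an assumption about which parameters are worth a look.

```shell
# Print ACPI- or reboot-related parameters found on a kernel command line.
# With no argument, the running kernel's command line (/proc/cmdline) is used.
# acpi_params is a hypothetical helper name; the pattern list is an assumption.
acpi_params() (
    cmdline="${1:-$(cat /proc/cmdline)}"
    set -f                      # do not glob-expand words while splitting
    for word in $cmdline; do
        case "$word" in
            acpi=*|noacpi|pci=noacpi|reboot=*) printf '%s\n' "$word" ;;
        esac
    done
)
```

Calling `acpi_params` with no argument on the affected machine should print nothing if no such parameter is set.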

Apparently systemd’s halt command is supposed to work that way.

Excerpted from man reboot:

Note that on many SysV systems halt used to be synonymous to poweroff, i.e. both commands would equally result in powering the machine off. systemd is more accurate here, and halt results in halting the machine only (leaving power on), and poweroff is required to actually power it off.

Maybe if /etc/systemd/logind.conf were misconfigured somehow, the halt target could be run instead of the reboot target?

Another possibility might be some installed software leaving an inhibitor set? Does systemd-inhibit list anything that looks suspicious?
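Both checks can be done from a terminal. The snippet below is a rough sketch: `blocking_shutdown_inhibitors` is a made-up helper name, and it simply assumes that the last column of the listing is the mode and that a lock relevant to reboot has "shutdown" in its WHAT field. Because it matches on the whole line, a WHY text that happens to contain the word "shutdown" would also be kept.

```shell
# Filter `systemd-inhibit --list` output (read from stdin) down to locks that
# could actually hold up a reboot: last column "block" and a "shutdown" lock.
# blocking_shutdown_inhibitors is a hypothetical helper name.
blocking_shutdown_inhibitors() {
    awk '$NF == "block" && /(^|[ :])shutdown([ :]|$)/'
}

# Intended use on the affected machine:
#   systemd-inhibit --list --no-pager | blocking_shutdown_inhibitors
#   systemd-analyze cat-config systemd/logind.conf   # effective logind settings
```

An empty result from the filter would suggest that no inhibitor is blocking shutdown or reboot.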

That’s all I have for ideas off-hand.

Edit: Actually, now that I think about it, I think I might have seen this happen for a short while on one of my Lenovo PCs. But it was something that went away with another update not too long afterward, so I never thought much of it. I don’t remember exactly how long ago it was or what kernel version it might have been.

Thanks. I looked at all the /var/lib/dnf/yumdb/r/6aa71b2dff203a6341e240ea263afec6ddb81ccc-rpm-plugin-systemd-inhibit-4.14.2.1-3.fc29-x86_64/* files and found nothing odd, except that the “releasever” file shows “29” when I’m actually on Fedora 40. I know about the halt/poweroff change; I had a server that I had to power down a lot.

I haven’t changed anything to “noacpi” and I don’t know why I would have to do that for anything I am running. The same problem surfaced on the #2 server, which runs Rocky 9 (related to Fedora), but it is less of a nuisance there because that machine rarely gets rebooted. The workstation is where it bites, but the problem isn’t existential, just frustrating, so I’ll forget about it for now. This machine was originally installed with about Fedora 20 and has been through many upgrades since.

That doesn’t indicate anything. Run rpm -q rpm-plugin-systemd-inhibit to check which version is actually installed.

rpm -q rpm-plugin-systemd-inhibit
rpm-plugin-systemd-inhibit-4.19.1.1-1.fc40.x86_64

What does this really tell me? Is there something I should change to allow systemctl reboot to work properly? This is whatever was installed by default as I haven't consciously made any changes.

I did some research on the systemd-inhibit function and --list gives:

WHO            UID  USER PID  COMM           WHAT                                                     WHY                                       MODE
ModemManager   0    root 971  ModemManager   sleep                                                    ModemManager needs to reset devices       delay
NetworkManager 0    root 986  NetworkManager sleep                                                    NetworkManager needs to turn off networks delay
UPower         0    root 1407 upowerd        sleep                                                    Pause device polling                      delay
GNOME Shell    1000 John 2129 gnome-shell    sleep                                                    GNOME needs to lock the screen            delay
John           1000 John 2342 gsd-media-keys handle-power-key:handle-suspend-key:handle-hibernate-key GNOME handling keypresses                 block
John           1000 John 2342 gsd-media-keys sleep                                                    GNOME handling keypresses                 delay
John           1000 John 2348 gsd-power      sleep                                                    GNOME needs to lock the screen            delay

None of these seem to involve reboot and, except for the GNOME one, they only cause a delay, so I would expect the reboot to occur eventually. The GNOME inhibitor is a block, but it refers to key presses, which are already unavailable since the system does the shutdown prior to the reboot.

What am I missing here?

Your systemd-inhibit output looks similar to mine. That probably isn’t the cause.

My guess is it is something in the kernel that has changed. I’ve seen many indications that the kernel devs have been working on getting power management to be more aggressive.[1] The problem was probably introduced by a kernel update and (hopefully) it will go away with another kernel update in the not-too-distant future.

Here is a link to open bug reports related to power management on kernel.org: Bug List. You might try skimming those to see if any look similar to your situation.


  1. https://lwn.net/Archives/ConferenceIndex/#OS-Directed_Power-Management_Summit ↩︎


Thanks Gregory. They all seem to have a similar flavour, but my problem is that the reboot just doesn’t happen. “systemctl reboot” halts the system and then the BIOS process happens, but the actual boot doesn’t. Sometimes I get a message about “no devices found” but not always. To actually boot I need to press and hold the power switch; reset doesn’t do it. I waited quite a while to report this because I thought an update might fix it, but no luck.

What you are describing sounds very similar to this bug report in particular:

https://bugzilla.kernel.org/show_bug.cgi?id=216629

If it is in the kernel (which appears likely to me), your options are to either try to find a kernel (or module) parameter that will disable/bypass the problematic code or you could try to downgrade your kernel to an earlier one that did not exhibit the problem. (Or you could file another bug report on kernel.org and try to work with someone there to get the problem fixed.)
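For the kernel-parameter route, one candidate is the kernel’s `reboot=` parameter, which selects the method used to reset the machine. This is a suggestion that does not appear elsewhere in this thread, and there is no guarantee that any value helps on this hardware. The sketch assumes `grubby` is installed; `reboot_method_cmd` is a made-up helper that only prints the command, so it can be reviewed before being run as root.

```shell
# Print (do not run) the grubby command that would add reboot=<method> to
# every installed kernel entry. Values outside the short list are rejected.
# reboot_method_cmd is a hypothetical helper name.
reboot_method_cmd() (
    method="$1"
    case "$method" in
        acpi|efi|pci|kbd|bios|triple)
            printf 'grubby --update-kernel=ALL --args=reboot=%s\n' "$method" ;;
        *)
            printf 'unknown reboot method: %s\n' "$method" >&2
            exit 1 ;;
    esac
)

# Example: review the command for the EFI reset method.
reboot_method_cmd efi
```

If a trial value makes no difference, it should be possible to remove it again with `grubby --update-kernel=ALL --remove-args=reboot=efi` before trying the next one.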

Downgrading your kernel might be an easy way to at least verify that that is where the problem lies. You can install older kernels with commands along the lines of dnf downgrade --repo=fedora --releasever=39 kernel*. You will have to manually select the older kernel from the boot menu when you power on your computer. Use uname -r to verify that your system is running on the older kernel.
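The uname -r check can be scripted after booting the older entry. This is a minimal sketch for a POSIX shell; `running_kernel_matches` is a hypothetical name, and the version prefix passed to it is a placeholder for whichever older kernel actually got installed.

```shell
# Succeed when the running kernel release starts with the given prefix.
# running_kernel_matches is a hypothetical helper name.
running_kernel_matches() {
    case "$(uname -r)" in
        "$1"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Intended use (the prefix is a placeholder, not a recommendation):
#   rpm -q kernel-core                      # list the installed kernel packages
#   running_kernel_matches 6.5 && echo "running the older kernel"
```

If the reboot works on the older kernel and fails on the current one, that would point fairly strongly at a kernel regression.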

Thanks Gregory. Similar but not exactly. Power off works as expected (I do it every evening); reboot doesn’t work, and behaves more as if I had issued a “systemctl halt” command.

I’m not at all confident that I have the skill set to deal with downgrading the kernel as you suggest and the problem with my workstation can be bypassed by pushing a button so I’m going to avoid that and hope for the best. The real incentive to get this fixed is rather arcane - I am also running a Rocky (yes, not Fedora but related) server that has the same problem and is remote, so if it gets fixed in Fedora then probably in Rocky too and probably good for all RHEL 9 users.

Stephen: your answers:

  • no changes except regular automatic updates since Fedora 26 or so
  • Yes, 2 hard disks WD Blue 1 Gb
  • No RAID, uefi boot
  • Not sure of the difference; I made no changes to the firmware (deliberately at least)
  • shutdown appears normal and I usually watch the screen during shutdown both normally (daily poweroff) and for reboot for the automatic updates. There are no apparent error messages being issued.

There is nothing common between the machines. This one (Fedora 40) is a recycled server using an ASUS motherboard. It was new about 2016. The server is a brand new build using an MSI motherboard. This machine runs Fedora workstation O/S and has no RAID storage of any kind.

smbios-token-ctl (running as root) crashed with the following error:

Traceback (most recent call last):
  File "/usr/lib64/python3.12/site-packages/libsmbios_c/smbios_token.py", line 134, in __iter__
    raise StopIteration
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/sbin/smbios-token-ctl", line 475, in <module>
    sys.exit( main() )
              ^^^^^^
  File "/usr/sbin/smbios-token-ctl", line 380, in main
    dumpTokens(tokenTable, tokenXlator, options)
  File "/usr/sbin/smbios-token-ctl", line 214, in dumpTokens
    for token in tokenTable:
                 ^^^^^^^^^^
RuntimeError: generator raised StopIteration

I haven’t followed troubleshooting up until this, but if you need temporary access to that tool and expect it to work, I’d try from an older kernel, specifically openSUSE’s Leap 15.6 GNOME from a LiveUSB.


On late F40 and now on F41 I get an “Invalid call” error from the smbios tools that I’m pretty sure is caused by an update somewhere. openSUSE’s Leap 15.6 image is seemingly old enough not to be affected (while still being modern; it’s kernel 6.6 iirc).

I use smbios-thermal-ctl to set the performance fan profile. In lieu of further trying to figure out what Invalid call 17/19 means, I just boot that LiveUSB real quick to do it.

Thanks for trying Stephen. Given that this is really not a serious issue, more of a nuisance, I think I’ll leave it here as I seem to be wasting a lot of people’s time. Let’s hope the Fedora developers get around to fixing it in the future.


I think anybody finding a solution is very unlikely. After reading the thread, I don’t even know which Fedora version you’re using, let alone the specific kernel and systemd versions, how your machine is set up, or what the actual error message is.

IIUC, after a reboot, the new kernel doesn’t see devices. This sounds like failing hardware and/or some firmware or driver issue. It’s most likely very specific to your system.
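The details asked for above can be gathered in one go. This is a hedged sketch: `collect_reboot_report` is a made-up helper, and the choice of fields is only an assumption about what is useful in a report.

```shell
# Print the basics a bug report usually needs: distribution, kernel,
# systemd version, and whether the system was booted via UEFI.
# collect_reboot_report is a hypothetical helper name.
collect_reboot_report() {
    if [ -r /etc/os-release ]; then
        # Source os-release in a subshell so its variables do not leak out.
        ( . /etc/os-release; printf 'os: %s\n' "${PRETTY_NAME:-unknown}" )
    fi
    printf 'kernel: %s\n' "$(uname -r)"
    if command -v systemctl >/dev/null 2>&1; then
        printf 'systemd: %s\n' "$(systemctl --version | head -n 1)"
    fi
    # /sys/firmware/efi only exists when the kernel was booted through UEFI.
    if [ -d /sys/firmware/efi ]; then
        printf 'firmware: UEFI\n'
    else
        printf 'firmware: legacy BIOS or CSM\n'
    fi
}

collect_reboot_report
```

Alongside that, `journalctl -b -1 -e` shows the end of the previous boot’s log (provided the journal is persistent), which is where a hang during reboot would leave its last messages.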

I am using the current, fully updated version of Fedora 40 Workstation. The problem first appeared when I upgraded from F39. It is probably NOT the hardware, since a server running Rocky 9.4, which is closely related to Fedora, displays the same problem (systemctl reboot hangs and you have to power off and back on again to get it to boot). The server uses completely different hardware from the workstation (server: MSI Pro550; workstation: ASUS). On my workstation the reboot fails with the same message on systemctl reboot AND on the reboot triggered by an automatic system upgrade, and the same workaround is required: power down and up again. A cold boot works fine. Both have UEFI firmware.