Postfix fails to start at boot if configured to speak to the network

I installed Postfix and configured it to be a network server rather than listening only on the loopback interface (“localhost”):

$ git diff main.cf
diff --git a/postfix/main.cf b/postfix/main.cf
index bd0aea8..fb28e96 100644
--- a/postfix/main.cf
+++ b/postfix/main.cf
@@ -131,8 +131,8 @@ mail_owner = postfix
 #
 #inet_interfaces = all
 #inet_interfaces = $myhostname
-#inet_interfaces = $myhostname, localhost
-inet_interfaces = localhost
+inet_interfaces = $myhostname, localhost
+#inet_interfaces = localhost

With that change in place, Postfix crashes at startup.

Bug reports about this have been opened and not fixed since 2014.

Reported 2014-07-06
https://bugzilla.redhat.com/show_bug.cgi?id=1116538

Reported 2017-11-06
https://bugzilla.redhat.com/show_bug.cgi?id=1510158

Reported 2021-12-10
https://bugzilla.redhat.com/show_bug.cgi?id=2031189

Ultimately this issue all comes down to this:

# systemd unit file comes from the postfix RPM.
$ rpm -qf /usr/lib/systemd/system/postfix.service
postfix-3.6.4-1.fc36.x86_64
# File has not been modified.
$ rpm -qV postfix-3.6.4-1.fc36.x86_64 | grep -c /usr/lib/systemd/system/postfix.service
0
# Section of the unit file with the issue
$ grep After= /usr/lib/systemd/system/postfix.service
After=syslog.target network.target

Because of that, Postfix is started as soon as network.target is reached. However, network.target is reached before the LAN/WAN network interface is ready, so Postfix crashes because there is no LAN/WAN interface available to listen on.

The workaround is to create a systemd override that changes After= to a target reached only after the LAN/WAN network interface has been brought up. The package maintainer has claimed in the bug reports that doing so will damage the system, but many people resort to it and do not have damaged systems.
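
For reference, here is a minimal sketch of such an override written as a drop-in, assuming the standard systemd layout of `<unit dir>/postfix.service.d/*.conf`. The helper name `write_postfix_dropin` and the file name `override.conf` are my own choices and do not come from the bug reports. Note that `After=` in a drop-in is added to the packaged `After=` line rather than replacing it, and `Wants=` is included because ordering alone does not pull network-online.target into the boot.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in that orders postfix.service after
# network-online.target. $1 is the systemd unit directory (normally
# /etc/systemd/system); it is a parameter so the result can be inspected
# in a scratch directory first.
write_postfix_dropin() {
    unit_dir=$1
    dropin_dir="$unit_dir/postfix.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}

# Typical use, as root (not run here):
#   write_postfix_dropin /etc/systemd/system
#   systemctl daemon-reload
```

Whether this actually delays Postfix long enough still depends on the network manager providing a wait-online service, so treat it as a starting point rather than a guaranteed fix.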

I don’t understand the package maintainer process. Postfix is not a trivial package: it was created at IBM, and IBM owns Red Hat. If you google for the most popular MTA you will see it. Yet this Postfix systemd configuration issue has been reported since 2014, about 8 years ago. Is it that the package maintainer does not actually use the software they are maintaining?

One more thing, Debian gets it right.

root@debian:~# grep After= /lib/systemd/system/postfix@.service
After=network-online.target nss-lookup.target

I would think that justifies a new bug report, with the suggested fix being what you posted from the Debian system. The current ordering clearly would cause a problem on systems where interface activation is delayed beyond the start of Postfix, and that change would prevent Postfix from being started too soon.

root@debian:~# grep After= /lib/systemd/system/postfix@.service
After=network-online.target nss-lookup.target

Some systems may bring the interface fully up soon enough, but others may not when that line reads network.target instead of network-online.target. (I’m thinking specifically, but not exclusively, of the differences between ethernet and wifi.) Having both of those targets reached verifies the network is fully activated and that the internet is available.

I haven’t tested the Debian version on Fedora yet. I am in the process of updating a KVM/Qemu/Libvirt Fedora 36 VM and adding a “before-postfix” snapshot to it. When that is done I can install and configure Postfix.

One more thing: duplicate bug reports get closed as “duplicate”, so creating a duplicate (“new”) bug report is not the right thing to do.

Is this Fedora Workstation or Silverblue? The approach would likely differ depending on which one it is.

Also, the VM would be limited to ethernet with no wifi (unless you had a dedicated wifi adapter you could pass through). Timing of network activation on the VM would likely differ from timing directly on the hardware.

I know it’s Workstation; it may also be on Server. I don’t have time to test on WiFi. Could you please do that?

In order to test on Wifi I would have to:

  1. boot up my mini-server which runs Ubuntu (it’s my only Ubuntu on bare metal)
  2. provision or re-use a Fedora VM
  3. get the wifi to pass-through to the VM

In general I don’t use wifi because it’s too slow for me.

Postfix started correctly at boot with

$ grep After= /usr/lib/systemd/system/postfix.service
After=network-online.target nss-lookup.target

I think by the time nss-lookup.target is reached the WAN/LAN network interface has to be up regardless of WiFi.

Possibly, but it does not require it.

# grep -i after= /usr/lib/systemd/system/nss-lookup.target 
[root@eagle ~]#

But the network-online.target is after the network.target so it definitely is after activating the interfaces.

# grep -i after= /usr/lib/systemd/system/network-online.target 
After=network.target

The man page for systemd.special (identified within /usr/lib/systemd/system/network-online.target) says this:

       network-online.target
           Units that strictly require a configured network connection should pull in network-online.target (via a Wants= type dependency) and
           order themselves after it. This target unit is intended to pull in a service that delays further execution until the network is
           sufficiently set up. What precisely this requires is left to the implementation of the network managing service.

           Note the distinction between this unit and network.target. This unit is an active unit (i.e. pulled in by the consumer rather than the
           provider of this functionality) and pulls in a service which possibly adds substantial delays to further execution. In contrast,
           network.target is a passive unit (i.e. pulled in by the provider of the functionality, rather than the consumer) that usually does not
           delay execution much. Usually, network.target is part of the boot of most systems, while network-online.target is not, except when at
           least one unit requires it. Also see Running Services After the Network is up[1] for more information.

which explains why this target needs to be reached before units that require the network are started.
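
A rough way to check a unit file against that paragraph is to look for network-online.target on both a `Wants=` line and an `After=` line. The function name `unit_waits_for_network` is made up for this post, and it only reads the one file it is given as plain text, so it ignores drop-ins and anything systemd merges in at runtime.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the unit file $1 names
# network-online.target on both a Wants= line and an After= line.
# The file is read as plain text; drop-ins are not considered.
unit_waits_for_network() {
    unit_file=$1
    grep -Eq '^Wants=(.*[[:space:]])?network-online\.target([[:space:]]|$)' "$unit_file" &&
    grep -Eq '^After=(.*[[:space:]])?network-online\.target([[:space:]]|$)' "$unit_file"
}

# Example (path taken from this thread):
#   unit_waits_for_network /usr/lib/systemd/system/postfix.service && echo ok
```

A unit that only has `After=network-online.target` fails this check, which matches the man page's point that the consumer is expected to pull the target in as well as order itself after it.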

If you do not wish to file a new bug, then consider posting your findings and this text from man:systemd.special on the currently open bug.

It would appear that the originator of that bug has already posted a comment about your fix there today. If you were to add your findings, it would help encourage the needed change, especially with the text from man systemd.special.

I do suspect though that the line should read either
After=syslog.target network-online.target
or
After=syslog.target network-online.target nss-lookup.target
The current default line in the postfix.service file is After=syslog.target network.target