Ethernet disconnects after MQTT call on network

My desktop's Ethernet connection to the network stops working after MQTT is used on the network. Wi-Fi still works.

We have a system to control our window blinds using MQTT, where a wall switch is connected to an Arduino, which sends an MQTT message to our Mosquitto server that runs under Home Assistant. Home Assistant sends back which position the shutter should be in and the Arduino activates the shutter.

When this happens, my Ethernet connection still says it’s connected, but I can’t do anything network related anymore. I used nmcli m to monitor what happens, but there is no entry when this happens. If I turn off the Ethernet and turn it back on, I get the following:
[screenshot of the resulting messages]
These messages repeat every minute or so. Sometimes the Ethernet just works after a certain time, sometimes I can get these messages for hours on end. Rebooting fixes the issue and I can use Ethernet again.

No other device on my network seems to have this problem, so I have no idea how to fix this.

My PC has a Gigabyte X570 Aorus Master motherboard with Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 01)

Here is the MQTT traffic that happens when a shutter is opened using the switch:

Could be a problem with your router.
Is its firmware up to date?
Try rebooting the router.

When posting text information, please post it as pre-formatted text using the </> button, not as screenshots.

Thank you for the tip about formatting.
The router is on the latest firmware and has been rebooted several times since the problem started, so I'm not convinced it is the cause.

I doubt MQTT is the cause. I suspect it's just a trigger; maybe it sends big packets?

Check the journal for kernel reports of problems with the network.
I wonder if the MTU size is misconfigured for example.
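
One way to test the MTU theory is to ping with fragmentation forbidden and the largest payload that still fits in a single frame. A minimal sketch, assuming an IPv4 path and the Linux iputils `ping` (its `-M do` option sets the don't-fragment flag). The target `192.168.1.1` is a placeholder, not an address taken from this thread, and the helper names are made up for illustration:

```python
import shlex

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_icmp_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(target: str, mtu: int, count: int = 3) -> str:
    """Build an iputils ping command that sends full-size, don't-fragment packets."""
    args = ["ping", "-M", "do", "-s", str(max_icmp_payload(mtu)),
            "-c", str(count), target]
    return shlex.join(args)


if __name__ == "__main__":
    # Hypothetical router address; substitute the real gateway.
    print(ping_command("192.168.1.1", 1500))
```

If full-size pings get through while the connection is healthy but smaller ones are the only ones that work once it misbehaves, that would point towards a packet-size problem.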

The issue happened again just after booting my PC. This is the log, filtered for the word 'network', from boot until the problem occurred. I don't see any obvious errors.

This is the configured MTU size for my PC:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
2: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
3: wlp6s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000

Log filtered for 'network' (newest entries first):
08:56:09 NetworkManager: <info>  [1726988169.6166] device (wlp6s0): conflict detected for IP address 192.168.1.92 with host MAC
08:54:06 kernel: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:54:06 systemd: NetworkManager-dispatcher.service: Deactivated successfully.
08:53:56 kernel: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:53:56 systemd: Started NetworkManager-dispatcher.service - Network Manager Script Dispatcher Service.
08:53:56 NetworkManager: <info>  [1726988036.7507] policy: set  (wlp6s0) as default for IPv4 routing and DNS
08:53:56 NetworkManager: <info>  [1726988036.7507] policy: set  (wlp6s0) as default for IPv4 routing and DNS
08:53:56 NetworkManager: <info>  [1726988036.7506] manager: NetworkManager state is now CONNECTED_SITE
08:53:56 NetworkManager: <info>  [1726988036.7505] manager: NetworkManager state is now CONNECTED_LOCAL
08:53:56 NetworkManager: <info>  [1726988036.7503] manager: NetworkManager state is now CONNECTED_SITE
08:52:46 dleyna-renderer: [Network filtering settings]
08:48:48 kernel: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:48:48 systemd: NetworkManager-dispatcher.service: Deactivated successfully.
08:48:47 NetworkManager: <info>  [1726987727.3698] agent-manager: agent[cd4d605cc59e40b5,:1.87/org.gnome.Shell.NetworkAgent/1000]: agent registered
08:48:38 systemd: Reached target network-online.target - Network is Online.
08:48:38 kernel: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-wait-online comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:48:38 systemd: Finished NetworkManager-wait-online.service - Network Manager Wait Online.
08:48:38 NetworkManager: <info>  [1726987718.5081] manager: startup complete
08:48:34 cupsd: cupsdCreateProfile(job_id=0, allow_networking=1) = NULL
08:48:34 NetworkManager: <info>  [1726987714.4786] device (p2p-dev-wlp6s0): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
08:48:34 systemd: Reached target network.target - Network.
08:48:34 NetworkManager: <info>  [1726987714.4175] device (lo): Activation: successful, device activated.
08:48:34 kernel: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:48:34 systemd: Started NetworkManager.service - Network Manager.
08:48:34 NetworkManager: <info>  [1726987714.4153] device (lo): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
08:48:33 kernel: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:48:33 systemd: Started NetworkManager-dispatcher.service - Network Manager Script Dispatcher Service.
08:48:33 NetworkManager: <info>  [1726987713.8400] device (enp7s0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
08:48:33 systemd: Starting NetworkManager.service - Network Manager...
08:48:33 avahi-daemon: Network interface enumeration completed.
08:48:33 systemd: Listening on virtnetworkd.socket - libvirt network daemon socket.
08:48:31 kernel: SELinux:  policy capability always_check_network=0
08:48:26 kernel: drop_monitor: Initializing network drop monitor service

Look for DHCP messages.
Also try disabling each interface so you have only one enabled at a time.
Does the problem follow an interface?

The problem only happens on enp7s0, the wired connection. In the past I only had that one connected, so when this problem happened I just wouldn’t have internet. Later I added the antennas for wireless and now I switch to that when the wired connection fails. If I disable the wireless connection, the problem still happens.

I've included logs filtered for DHCP and for NetworkManager:

09:11:20 NetworkManager: <info>  [1726989080.5814] dhcp4 (enp7s0): state changed no lease
09:11:20 NetworkManager: <info>  [1726989080.5814] dhcp4 (enp7s0): state changed no lease
09:11:20 NetworkManager: <info>  [1726989080.5814] dhcp4 (enp7s0): activation: beginning transaction (timeout in 45 seconds)
09:11:20 NetworkManager: <info>  [1726989080.5813] dhcp4 (enp7s0): canceled DHCP transaction
08:48:47 blueman-applet: blueman-applet 08.48.47 WARNING  PluginManager:150 __load_plugin: Not loading DhcpClient because its conflict has higher priority
08:48:38 NetworkManager: <info>  [1726987718.4846] dhcp4 (wlp6s0): state changed new lease, address=192.168.1.92
08:48:38 NetworkManager: <info>  [1726987718.4846] dhcp4 (wlp6s0): state changed new lease, address=192.168.1.92
08:48:38 NetworkManager: <info>  [1726987718.3461] dhcp4 (wlp6s0): state changed new lease, address=192.168.1.92, acd pending
08:48:38 NetworkManager: <info>  [1726987718.2975] dhcp4 (wlp6s0): activation: beginning transaction (timeout in 45 seconds)
08:48:36 NetworkManager: <info>  [1726987716.6286] dhcp4 (enp7s0): state changed new lease, address=192.168.1.27
08:48:36 NetworkManager: <info>  [1726987716.5016] dhcp4 (enp7s0): state changed new lease, address=192.168.1.27, acd pending
08:48:36 NetworkManager: <info>  [1726987716.4990] dhcp4 (enp7s0): activation: beginning transaction (timeout in 45 seconds)
08:48:33 NetworkManager: <info>  [1726987713.8379] dhcp: init: Using DHCP client 'internal'

09:11:20 NetworkManager: <info>  [1726989080.5564] audit: op="device-disconnect" interface="enp7s0" ifindex=2 pid=2897 uid=1000 result="success"
08:54:06 kernel: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:54:06 systemd: NetworkManager-dispatcher.service: Deactivated successfully.
08:53:56 kernel: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:53:56 systemd: Started NetworkManager-dispatcher.service - Network Manager Script Dispatcher Service.
08:53:56 NetworkManager: <info>  [1726988036.7507] policy: set 'VDWL51_5G 1' (wlp6s0) as default for IPv4 routing and DNS
08:48:48 kernel: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:48:48 systemd: NetworkManager-dispatcher.service: Deactivated successfully.
08:48:47 NetworkManager: <info>  [1726987727.3698] agent-manager: agent[cd4d605cc59e40b5,:1.87/org.gnome.Shell.NetworkAgent/1000]: agent registered
08:48:38 kernel: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-wait-online comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:48:38 systemd: Finished NetworkManager-wait-online.service - Network Manager Wait Online.
08:48:38 NetworkManager: <info>  [1726987718.5081] manager: startup complete
08:48:34 systemd: Starting NetworkManager-wait-online.service - Network Manager Wait Online...
08:48:34 NetworkManager: <info>  [1726987714.4175] device (lo): Activation: successful, device activated.
08:48:34 kernel: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:48:34 systemd: Started NetworkManager.service - Network Manager.
08:48:34 NetworkManager: <info>  [1726987714.4153] device (lo): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
08:48:33 kernel: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
08:48:33 systemd: Started NetworkManager-dispatcher.service - Network Manager Script Dispatcher Service.
08:48:33 NetworkManager: <info>  [1726987713.8400] device (enp7s0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
08:48:33 systemd: Starting NetworkManager.service - Network Manager...

Odd that it loses the lease on the IP address.
I'd expect NetworkManager to renew the lease.
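
Since the pasted log is newest-first, it can help to put the `dhcp4` entries for one interface in chronological order. A rough sketch, assuming log lines in the format posted above. The helper name `dhcp_events` and the `SAMPLE` list are only for illustration, and the sample lines are copied from the log in this thread:

```python
import re

# Matches e.g. "[1726989080.5814] dhcp4 (enp7s0): state changed no lease"
DHCP_LINE = re.compile(
    r"\[(?P<ts>\d+\.\d+)\]\s+dhcp4 \((?P<iface>[^)]+)\): (?P<msg>.*)$"
)

SAMPLE = [
    "09:11:20 NetworkManager: <info>  [1726989080.5814] dhcp4 (enp7s0): state changed no lease",
    "09:11:20 NetworkManager: <info>  [1726989080.5814] dhcp4 (enp7s0): activation: beginning transaction (timeout in 45 seconds)",
    "09:11:20 NetworkManager: <info>  [1726989080.5813] dhcp4 (enp7s0): canceled DHCP transaction",
    "08:48:38 NetworkManager: <info>  [1726987718.4846] dhcp4 (wlp6s0): state changed new lease, address=192.168.1.92",
    "08:48:36 NetworkManager: <info>  [1726987716.6286] dhcp4 (enp7s0): state changed new lease, address=192.168.1.27",
    "08:48:36 NetworkManager: <info>  [1726987716.4990] dhcp4 (enp7s0): activation: beginning transaction (timeout in 45 seconds)",
]


def dhcp_events(lines, iface):
    """Return unique (epoch_seconds, message) pairs for one interface, oldest first."""
    events = set()
    for line in lines:
        match = DHCP_LINE.search(line)
        if match and match.group("iface") == iface:
            events.add((float(match.group("ts")), match.group("msg").strip()))
    return sorted(events)


if __name__ == "__main__":
    for ts, msg in dhcp_events(SAMPLE, "enp7s0"):
        print(f"{ts:.4f}  {msg}")
```

In the posted log, the wired lease is obtained at 08:48:36 and the transaction is canceled about 23 minutes later, in the same second as the `device-disconnect` audit entry for enp7s0. So that particular "no lease" may just reflect the manual off/on toggle rather than a failed renewal.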

Yeah, I have no clue why this keeps happening. It happened less often last week, but now it keeps disconnecting again although nothing has changed in our network. The DHCP lease time on my router is set to the default of one day.
Do you have any idea what else I could try to fix this?

Make sure that the date and time are correct on the router and your systems.
Also check that you have network time set up to keep them correct.
Maybe the lease is dropped because of an issue with timekeeping?

Date and time are correct on both the system and the router. The router uses pool.ntp.org as its NTP server, and my system uses 2.fedora.pool.ntp.org with chronyd. When comparing their clocks, I can see a difference of about 5 seconds. Would it be worth changing them to the same NTP pool?
I also looked into the logs of my router and there’s nothing unusual happening around the time my connection drops.

Are you sure that you have the units right for the time?
This is what I see as an example:

$ chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^- ntp1.as200552.net             2  10   377   397  -5079us[-5079us] +/-   55ms
^- y.ns.gin.ntt.net              2   9   377   275  +1528us[+1528us] +/-   99ms
^* 183.ip-51-89-151.eu           2  10   377   617  -4182us[-4691us] +/-   14ms
^+ devrandom.pl                  2  10   377   484  +1644us[+1644us] +/-   17ms

That is a difference of 5 ms, not 5 s.

If you confirm 5 seconds then something is wrong with the NTP sync.

But that is not enough to break DHCP leases.
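
To avoid mixing up units when reading the `Last sample` column, the offsets can be converted to seconds. A small sketch, assuming values formatted like the ones in the `chronyc sources` output above (an optional sign, a number, and a unit of ns, us, ms, or s). The function name `offset_seconds` is made up for illustration:

```python
import re

UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}
OFFSET = re.compile(r"^([+-]?\d+(?:\.\d+)?)(ns|us|ms|s)$")


def offset_seconds(text: str) -> float:
    """Convert a chronyc-style offset such as '-5079us' to seconds."""
    match = OFFSET.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised offset: {text!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SECONDS[unit]


if __name__ == "__main__":
    # Values taken from the chronyc output quoted above.
    for sample in ("-5079us", "+1528us", "-4182us", "+1644us"):
        print(f"{sample:>8} = {offset_seconds(sample):+.6f} s")
```

All four sources in that output are within a few milliseconds of the local clock, which is several orders of magnitude smaller than a 5 second error.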

This may be a similar problem and thus should be reported upstream:

For my PC I can see this one:

chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^- mail.rondie.nl                2   9   377   452    +77us[  +77us] +/-  914ms
^* leontp3.office.panq.nl        1   9   377   454   -386us[ -502us] +/- 3833us
^+ nts1.time.nl                  1  10   377   802   +233us[ +128us] +/- 5643us
^- ntp1.mediamatic.nl            2  10   377   598   -239us[ -350us] +/-   31ms

For my router, I can only go by the clock shown on its web page.