Undid NIC bond/team, now can't ping/ssh out of F34 server

I followed this tutorial to attempt to enable NIC teaming/bonding, running Fedora 34. After running nmcli con up bond0, the server lost its Internet connection and could not ping out (neither by IP nor by hostname). So I set about undoing it: I used the nmcli commands to delete the bond as well as the slaves, then simply configured one of the two 1 GbE NICs with its usual IP address. Still no ping; it replies with Destination Host Unreachable, and ssh gets No route to host.
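
For reference, the undo sequence looked roughly like this. This is only a sketch: the profile names bond0 and bond0-slave-eno3, the /24 prefix, and the 150.x.x.x addressing are placeholders, not my exact values:

nmcli con show                        # list profiles to find the bond and its slaves
nmcli con delete bond0                # remove the bond profile
nmcli con delete bond0-slave-eno3     # remove each slave profile
nmcli con add type ethernet ifname eno3 con-name eno3 \
  ipv4.method manual ipv4.addresses 150.x.x.x/24 \
  ipv4.gateway 150.x.x.y
nmcli con up eno3                     # bring the plain profile back up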

arp -n shows the HWaddress for 150.x.x.x as (incomplete), as well as for the gateway 150.x.x.y

ip neigh
150.x.x.x dev eno3 FAILED
150.x.x.y dev eno3 FAILED
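
For anyone reading that output: FAILED means the kernel's ARP probes for those addresses went unanswered. A quick way to re-test after each change, sketched with the interface and obfuscated addresses above:

sudo ip neigh flush dev eno3    # drop the stale FAILED entries
ping -c 3 150.x.x.y             # forces a fresh ARP resolution of the gateway
ip neigh show dev eno3          # REACHABLE on success, FAILED again otherwise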

Based on some Red Hat troubleshooting suggestions, I captured ARP traffic:
I only see the outgoing ARP Requests but no incoming ARP Replies.

tcpdump -n -i eno3 arp and host 150.x.x.x or 150.x.x.y
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eno3, link-type EN10MB (Ethernet), snapshot length 262144 bytes
14:45:46.140849 ARP, Request who has 150.x.x.x (Broadcast) tell 150.y.y.y, length 28
14:45:53.962633 ARP, Request who has 150.x.x.y (Broadcast) tell 150.y.y.y, length 28

I deleted the profiles in the NM GUI and recreated them, configured as usual. Nothing. Rebooted. Deleted them from nmcli, rebooted, recreated: still no ping. Traceroute fails too, timing out all the way to its 30-hop max.

Here’s an obfuscated screenshot of the NM GUI.

What else can I try? I’m a bit desperate here…

I even tested switching from NM to systemd-networkd, and networkctl -n 0 status shows the state as routable, but still no ping nor ssh out by IP address. Rebooted again too.
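
The systemd-networkd test config was something like this (the file name and /24 prefix are assumptions; addresses obfuscated as above):

# /etc/systemd/network/10-eno3.network
[Match]
Name=eno3

[Network]
Address=150.x.x.x/24
Gateway=150.x.x.y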

And based on another suggestion, I tried:

tcpdump -n -i eno3 'host 150.x.x.x and icmp'
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eno3, link-type EN10MB (Ethernet), snapshot length 262144 bytes

And when I ran ping to that IP, nothing shows up in the capture. firewalld and fail2ban are off for now, and I ran iptables -F.
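
Worth noting: iptables -F only flushes chains in the filter table, so to rule out netfilter entirely one could also check, for example:

sudo iptables -S          # any remaining filter-table rules
sudo iptables -t nat -S   # the nat table can also rewrite or drop traffic
sudo nft list ruleset     # plus any nftables rules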

  1. Disable all networking, teaming, bonding, firewall, virtualization, iptables, nftables, fail2ban, etc. services:
sudo systemctl disable \
NetworkManager.service \
systemd-networkd.service \
libvirtd.service \
firewalld.service \
fail2ban.service \
...
  2. Reboot the host and the router it is connected to.
  3. Verify you have only the loopback and physical interfaces: no other virtual interfaces, no extra IPs, no neighbors, no routes, only essential routing rules, and empty iptables, nftables, and IP sets:
# ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 52:54:00:c3:72:45 brd ff:ff:ff:ff:ff:ff

# ip neigh show

# ip route show table all
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1 
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1 
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1 
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1 
::1 dev lo proto kernel metric 256 pref medium
local ::1 dev lo table local proto kernel metric 0 pref medium

# ip rule show
0:	from all lookup local
32766:	from all lookup main
32767:	from all lookup default

# ip -6 rule show
0:	from all lookup local
32766:	from all lookup main

# sudo iptables-save

# sudo ip6tables-save

# sudo nft list ruleset

# sudo ipset list
  4. Set up your network interface statically with IPv4 and test IPv4 connectivity:
sudo ip link set enp1s0 up
sudo ip address add 192.168.122.123/24 dev enp1s0
sudo ip route add default via 192.168.122.1

ping 192.168.122.1
ping 8.8.8.8
tracepath 1.1.1.1
  5. If the above works, you can proceed with enabling and troubleshooting NetworkManager; a sketch of that step follows below.
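
A sketch of that final step, reusing the example addresses from step 4 (the connection name enp1s0 is an assumption):

sudo systemctl enable --now NetworkManager.service
nmcli device status                   # enp1s0 should appear as a managed ethernet device
nmcli con add type ethernet ifname enp1s0 con-name enp1s0 \
  ipv4.method manual ipv4.addresses 192.168.122.123/24 \
  ipv4.gateway 192.168.122.1
nmcli con up enp1s0
ping -c 3 192.168.122.1               # then repeat the connectivity tests above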

Thanks so much for the suggestions!

Update: I called Dell support, and they had me boot to their Support Live Image (SLI), which is based on CentOS. The problem remained. So the result of the arp command was a clue: something must be happening on the switch when I enable NIC bonding. I had chosen:

“4 Dynamic Link Aggregation: aggregated NICs act as one NIC, which results in a higher throughput, but also provides failover in the case that a NIC fails. Dynamic Link Aggregation requires a switch that supports IEEE 802.3ad.”

Is it possible the switch the server connects to doesn’t support 802.3ad?
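
For anyone checking the same thing from the host side, the 802.3ad negotiation state is visible while the bond is up (assuming the bond device is named bond0):

cat /proc/net/bonding/bond0
# "Bonding Mode: IEEE 802.3ad Dynamic link aggregation" confirms mode 4;
# under "802.3ad info", an Active Aggregator Partner Mac Address of
# 00:00:00:00:00:00 suggests the switch is not negotiating LACP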

Dell suggested draining flea power. I did that, but it didn’t help. Then I switched ports on the switch, drained flea power again, and boom, all was well. I then configured bonding again, only to cause the same problem. Switching the port on the switch fixed it, plus draining flea power for good measure.

I’m scheduling some downtime to confirm for certain whether it was LACP specifically or just changing the bonding/teaming that causes the issue. Perhaps this helps someone else down the line.
