Problem statement:
I need help debugging what I think may be a kernel issue with the igb module. On 3 different interfaces across two different NICs (both Intel), I get the same issue: the negotiated speed drops to 100 Mbps, and resetting the interface brings it back to 1 Gbps.
Fedora: 33 and 34 prerelease (both affected).
A bit of background, because I know everyone will focus on a bad cat6 cable: this is the 4th cable I’m using, the 2nd switch, and the 2nd NIC. The only things that stay the same are Fedora, the motherboard, and the CPU/memory.
Here’s an example from dmesg of what I see:
[ 71.030408] igb 0000:06:00.0 enp6s0f0: igb: enp6s0f0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
[ 141.767864] igb 0000:06:00.0 enp6s0f0: igb: enp6s0f0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 233.180428] igb 0000:06:00.0 enp6s0f0: igb: enp6s0f0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
[ 1452.904227] igb 0000:06:00.0 enp6s0f0: igb: enp6s0f0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 1593.918465] igb 0000:06:00.0 enp6s0f0: igb: enp6s0f0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
It tends to settle on 100 Mbps, but it can change, particularly under load, which of course is not optimal.
What I see in ethtool is that the “Advertised link modes” list is reduced and no longer includes 1 Gbps:
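To put numbers on the flapping, the dmesg lines above can be summarized with a small script. This is only a rough sketch: the regular expression is written against the exact message format shown in this post, the function names are made up for illustration, and it assumes a single interface appears in the log.

```python
import re

# Matches igb link-up lines in the format shown in the dmesg excerpt above.
LINK_UP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+igb\s+\S+\s+(?P<iface>\S+):.*?"
    r"NIC Link is Up (?P<speed>\d+) Mbps"
)


def parse_link_events(dmesg_text):
    """Return (timestamp_seconds, interface, speed_mbps) per link-up line."""
    events = []
    for line in dmesg_text.splitlines():
        m = LINK_UP.search(line)
        if m:
            events.append((float(m["ts"]), m["iface"], int(m["speed"])))
    return events


def time_at_each_speed(events):
    """Sum the seconds between consecutive link-up events, keyed by speed.

    Assumes all events belong to one interface; the time after the last
    event is not counted because its end is unknown.
    """
    totals = {}
    for (start, _, speed), (end, _, _) in zip(events, events[1:]):
        totals[speed] = totals.get(speed, 0.0) + (end - start)
    return totals
```

Feeding it the text of `dmesg` (for example read from a saved file) gives one tuple per renegotiation, which makes it easier to line the speed changes up against load or other events on the machine.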
[peter@boss ~]$ sudo ethtool enp6s0f0
Settings for enp6s0f0:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Supported pause frame use: Symmetric
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Speed: 100Mb/s
    Duplex: Full
    Auto-negotiation: on
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    MDI-X: off (auto)
    Supports Wake-on: pumbg
    Wake-on: d
    Current message level: 0x00007fff (32767)
                           drv probe link timer ifdown ifup rx_err tx_err tx_queued intr tx_done rx_status pktdata hw wol
    Link detected: yes
But after executing “nmcli c up ” I get these advertised link modes:
    Advertised link modes:  1000baseT/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
This stays put for a while, then the link drops to 100 Mbps and ethtool shows the first output again. Using ethtool -r does not have this effect; only nmcli c up seems to have a chance of changing the negotiated link.
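The comparison between the supported and advertised lists can be automated, so the state can be logged each time the link changes. The following is a minimal sketch that works on the text printed by ethtool (as pasted above); the function names are hypothetical and the parsing assumes the field labels shown in this post.

```python
import re

# A link mode token such as 10baseT/Half or 1000baseT/Full.
MODE = re.compile(r"^\d+base\S+$")


def link_modes(ethtool_text, label):
    """Collect the modes listed under a field such as 'Supported link modes'.

    The list may wrap onto continuation lines, which carry no 'key:' prefix,
    so collection stops at the next line that looks like 'key: value'.
    """
    modes, collecting = [], False
    for line in ethtool_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(label + ":"):
            collecting = True
            stripped = stripped[len(label) + 1:]
        elif collecting and ":" in stripped:
            break
        if collecting:
            modes += [tok for tok in stripped.split() if MODE.match(tok)]
    return modes


def withheld_modes(ethtool_text):
    """Return supported modes that are missing from the advertised list."""
    supported = link_modes(ethtool_text, "Supported link modes")
    advertised = set(link_modes(ethtool_text, "Advertised link modes"))
    return [mode for mode in supported if mode not in advertised]
```

Run against the first ethtool output above, `withheld_modes` should report that 1000baseT/Full is supported but not advertised, which is the state that goes along with the 100 Mbps link.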
Hardware wise (from lspci -k):
06:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    Subsystem: Intel Corporation Device a02f
    Kernel driver in use: igb
    Kernel modules: igb
I tried to add a modprobe option for igb, “debug=16”, but it never seems to take effect. In /etc/modprobe.d/net-igb.conf I have:
options igb debug=16
According to modinfo, this should turn on full debugging. However, the module never appears to be loaded with parameters:
ls /sys/module/igb
coresize drivers holders initsize initstate notes refcnt sections taint uevent
(note the missing “parameters” directory).
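The same check can be scripted so it is easy to repeat after a reboot or a module reload. This is a sketch under assumptions: the helper names are invented, and it only reads the standard /sys/module and /etc/modprobe.d locations. One caveat worth verifying against the igb source: as far as I understand, a module parameter only appears under /sys/module/<name>/parameters when the driver registers it with a non-zero sysfs permission, so a missing directory may not by itself prove the option was ignored.

```python
from pathlib import Path


def module_parameters(module, sysfs_root="/sys/module"):
    """Return {name: value} for parameters the module exposes in sysfs.

    Returns None when the module directory is missing (not loaded, not
    built in) and {} when it exists but has no 'parameters' directory.
    """
    mod_dir = Path(sysfs_root) / module
    if not mod_dir.is_dir():
        return None
    params = {}
    param_dir = mod_dir / "parameters"
    if param_dir.is_dir():
        for entry in sorted(param_dir.iterdir()):
            try:
                params[entry.name] = entry.read_text().strip()
            except OSError:
                # Some parameters are write-only or root-only.
                params[entry.name] = "<unreadable>"
    return params


def modprobe_options(module, conf_dir="/etc/modprobe.d"):
    """Return (file name, option string) for each 'options <module>' line."""
    found = []
    for conf in sorted(Path(conf_dir).glob("*.conf")):
        for line in conf.read_text().splitlines():
            fields = line.split()
            if fields[:2] == ["options", module]:
                found.append((conf.name, " ".join(fields[2:])))
    return found
```

On this machine, `module_parameters("igb")` would return an empty dict given the listing above, while `modprobe_options("igb")` should show the debug=16 line from net-igb.conf. If the second is present but the first stays empty, the question becomes whether the option was dropped or simply is not visible in sysfs.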
So I’m running out of ideas on how to debug this. How can I tell what causes the negotiation to change? I’ve used two different NICs plus the one built into the motherboard (also igb), and they all give the same result. Different cat6 cables, different switches, same result. I may have an old e1000 card somewhere; that’s about my last resort, just avoiding igb.