Raspberry Pi and Enabling WiFi

Hi,

Disclaimer: entirely new coreos user with a lot of reading to do :slight_smile:

I’m currently experimenting with coreos on the raspberry pi 4 and 400 but am struggling a little bit with the wifi side of things.

In terms of what I’ve done:

  1. disk provisioned with coreos-installer from a pi 3 and a basic ignition that added a user with ssh-key into the sudo group
  2. booted on ethernet and sudo rpm-ostree install NetworkManager-wifi NetworkManager-wwan wpa_supplicant wireless-regdb per layering-examples
  3. nmtui used to configure static IPv4 addresses for eth0 and wlan0, disable IPv6, and Automatically connect and Available to all users.

nmcli device shows both interfaces up and connected:

DEVICE         TYPE      STATE         CONNECTION 
eth0           ethernet  connected     eth0       
wlan0          wifi      connected     wlan0      
p2p-dev-wlan0  wifi-p2p  disconnected  --         
lo             loopback  unmanaged     --

I can see these connections on the router.

However, if I systemctl reboot and remove the ethernet cable, or simply remove the ethernet cable without rebooting, I lose all connectivity and the device disappears from the router. If I then plug the ethernet cable back in, it comes back online after a couple of minutes.

Can anyone suggest next steps to help me debug this?

Thanks!

Welcome Adam,

What IP address does the wifi connection have? Are you sure you’re connected to the machine over the wifi connection and not the ethernet connection when you do the “simply remove the ethernet cable without rebooting” test?

Thanks for the reply.

That’s an interesting thing actually.

I have the ip’s defined as

eth0 192.168.1.10
wlan0 192.168.2.20

When looking in the router with the ethernet plugged in though I only see 192.168.2.20 connected, so I think you’re right although this has me a bit more confused!

Thanks

Hmm. So did you figure out how to make a connection over the wifi connection only?

As long as the ethernet is plugged in I can connect via either the ethernet ip or the wifi ip, as soon as I unplug the ethernet I can’t connect by either. Plug it back in and both start responding :frowning:

Righto, re-provisioned the device this evening just in case I’d fiddled something. Butane config just added an ssh key, and I did sudo rpm-ostree install NetworkManager-wifi.

I’ve noticed that while eth0 retains its mac address, wlan0 changes each time it is brought up (ip link set dev wlan0 up) and when it is activated and deactivated via nmtui. I’m not sure if this is relevant, but I wasn’t expecting it.

Even when I manually bring wlan0 up and activate it, confirm the ip can be pinged/ssh’d from another device, as soon as I unplug the ethernet it all drops again.

I’m not sure what else to check here, it is very strange. There is some stuff on the rpi forums about ifplugd but that isn’t present as far as I can tell. The only other info I could find was about wifi disconnecting if it was being switched into power-saving mode, but I can’t see how I could be doing that either.

Any ideas greatly appreciated.

Noting that you have wifi (wlan0) on 192.168.2.20 and ethernet (eth0) on 192.168.1.10 I have to wonder what the router is seeing for networks and routing. It seems quite possible that replies are only being received by the ethernet connection, even if sent out by wifi.

Those are 2 different subnets if your router is using a /24 (or /23) netmask, and only in the same subnet if your router is using a /22 netmask (or a shorter prefix such as /16).
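The netmask arithmetic can be checked with a few lines of shell. This is only a minimal sketch: the helper names ip_to_int and same_subnet are made up for illustration, and it handles plain IPv4 dotted-quad addresses only.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    echo "$1" | {
        IFS=. read -r a b c d
        echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
    }
}

# Print "same" if both addresses fall in the same network for the given
# prefix length, otherwise "different".
# Usage: same_subnet ADDR1 ADDR2 PREFIX
same_subnet() {
    local mask net1 net2
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    net1=$(( $(ip_to_int "$1") & mask ))
    net2=$(( $(ip_to_int "$2") & mask ))
    if [ "$net1" -eq "$net2" ]; then
        echo same
    else
        echo different
    fi
}

# The two addresses from this thread, tried against several prefix lengths.
for prefix in 24 23 22 16; do
    echo "/$prefix: $(same_subnet 192.168.1.10 192.168.2.20 "$prefix")"
done
```

With a 255.255.0.0 (/16) mask on the router both addresses count as one network, while a host configured with /24 on each interface treats them as two, which is the kind of mismatch that can send replies out of a different interface than the one the request arrived on.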

In the past I have seen similar issues even when the host had 2 interfaces on the same subnet because of routing.

I’ve tweaked it yet again, trying to get this as simple as possible.

The router is now using a subnet of 255.255.255.0 (previously it was 255.255.0.0).

I’ve configured eth0 to have a static address of 192.168.1.10/24 and wlan0 to have a static address of 192.168.1.20/24.

I have also run nmcli connection modify wlan0 802-11-wireless.mac-address-randomization 1 to stop the randomising of the mac whilst testing, and likewise nmcli connection modify wlan0 802-11-wireless.powersave 2 to disable any powersaving. Hopefully I can revert these two settings once I bottom out this remaining issue.
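For reference, a small sketch of those two settings together with a way to revert them later. The run wrapper and the function names are illustrative only; by default the script just prints the nmcli commands, and DRY_RUN=0 is needed to actually apply them. The profile name wlan0 is taken from this thread.

```shell
#!/bin/bash
# Print each command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pin the MAC address and disable powersave while testing.
#   mac-address-randomization: 0 = default, 1 = never, 2 = always
#   powersave:                 0 = default, 1 = ignore, 2 = disable, 3 = enable
pin_wifi_settings() {
    run nmcli connection modify "$1" 802-11-wireless.mac-address-randomization 1
    run nmcli connection modify "$1" 802-11-wireless.powersave 2
}

# Put both settings back to their defaults once testing is finished.
revert_wifi_settings() {
    run nmcli connection modify "$1" 802-11-wireless.mac-address-randomization 0
    run nmcli connection modify "$1" 802-11-wireless.powersave 0
}

# Show the values currently stored in the profile.
show_wifi_settings() {
    run nmcli -f 802-11-wireless.mac-address-randomization,802-11-wireless.powersave connection show "$1"
}

pin_wifi_settings wlan0
```

Modified profile settings generally only take effect the next time the connection is activated, so a nmcli connection up wlan0 afterwards is probably needed before drawing conclusions from a test.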

I’ve added a /etc/NetworkManager/dispatcher.d script thus:

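#!/bin/bash

# Turn the wifi radio off while ethernet is connected, and on otherwise.
function enable_disable_wifi() {
    result=$(nmcli dev | grep "ethernet" | grep -w "connected")

    if [ -n "$result" ]; then
        nmcli radio wifi off
    else
        nmcli radio wifi on

        # Debug output: check outbound connectivity once wifi is on.
        ping -c 3 -4 -R 8.8.8.8 > /var/home/adam/ping.txt
        curl https://ifconfig.co/json > /var/home/adam/ifconfig.txt
    fi
}

# NetworkManager passes the interface name as $1 and the action as $2.
if [ "$2" = "up" ]; then
    enable_disable_wifi
fi

if [ "$2" = "down" ]; then
    enable_disable_wifi
fi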

Ref: Example 15 in nmcli-examples: NetworkManager Reference Manual

The idea here being if ethernet is plugged in wifi is turned off entirely, and if ethernet is removed wifi is turned on.
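The ethernet check could also be written against nmcli's terse output instead of grepping the human-readable table. This is a sketch only: the function name ethernet_connected is made up, and the sample input below is hand-written to mirror the nmcli device table earlier in the thread rather than captured from the Pi.

```shell
#!/bin/bash
# Read "TYPE:STATE" lines on stdin (the format printed by
# `nmcli -t -f TYPE,STATE device`) and succeed only if an ethernet
# device is in the exact state "connected".
ethernet_connected() {
    while IFS=: read -r type state; do
        if [ "$type" = "ethernet" ] && [ "$state" = "connected" ]; then
            return 0
        fi
    done
    return 1
}

# Hand-written sample mirroring the device table shown earlier.
sample='ethernet:connected
wifi:connected
wifi-p2p:disconnected
loopback:unmanaged'

if printf '%s\n' "$sample" | ethernet_connected; then
    echo "ethernet up: would run 'nmcli radio wifi off'"
else
    echo "ethernet down: would run 'nmcli radio wifi on'"
fi
```

Inside the dispatcher script the sample would be replaced by nmcli -t -f TYPE,STATE device | ethernet_connected. Comparing whole fields avoids accidental matches on connection names or on states that merely contain the word connected.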

With ethernet plugged in I can successfully ssh to 192.168.1.10. When I remove ethernet the dispatcher script runs:

In ping.txt I can see:

PING 8.8.8.8 (8.8.8.8) 56(124) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=119 time=21.5 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=119 time=365 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=119 time=599 ms

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 21.542/328.430/598.538/236.989 ms

And in ifconfig.txt:

{
  "ip": "<redact>",
  "ip_decimal": <redact>,
  "country": "United Kingdom",
  "country_iso": "GB",
  "country_eu": true,
  "latitude": <redact>,
  "longitude": <redact>,
  "time_zone": "Europe/London",
  "asn": "<redact>",
  "asn_org": "<redact>",
  "hostname": "<redact>",
  "user_agent": {
    "product": "curl",
    "version": "7.85.0",
    "raw_value": "curl/7.85.0"
  }
}

So it seems like it has definitely connected to the wifi, has internet access, and is correctly doing dns resolution.

However if I use Fing on my phone to scan the network I can’t see 192.168.1.20 connected to the router, nor can I see it when I log into the router directly (I presume this latter is just the router software not updating efficiently).

Similarly if I ping 192.168.1.20 from my mac I get:

% ping 192.168.1.20
PING 192.168.1.20 (192.168.1.20): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
^C
--- 192.168.1.20 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss

And equally a nil response if trying to ssh.

So I feel that I’m almost there with this; there is just this last little niggle of other devices not being able to pick up the change.

One last little niggle to check.
Is the wifi in the proper zone of your firewall? You can look at that with the gui firewall-config or from the command line with firewall-cmd.

You may also need to enable ports or services in the firewall.
To see what is open on the wifi you could use nmap 192.168.1.20 and for the ethernet nmap 192.168.1.10

The fact that you can ping 8.8.8.8 is encouraging and shows that at least the router sees your host and allows the reply back in.

This is strange, though it is possible that since the address is statically assigned on the host the router may not even know of it until the host makes an outbound connection request…

Is it possible those 2 addresses are within the dhcp address range the router assigns? If so then there could be conflict, so it is best to assign static addresses that are outside the dhcp assigned range.

For static addresses I use the dhcp reserved address function on the router so the router always assigns the same address to the same mac address on my network. That allows me to always configure my hosts for dhcp and still have static addresses.

Some interesting thoughts there.

I haven’t got firewall-cmd available (this is still a basic CoreOS install with the NetworkManager-wifi overlay). Looking at iptables -L gives

-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT

I’m not too familiar with iptables usage directly but as I understand it those rules allow all traffic in and out without any restriction?
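As far as I know that reading is right: -P lines are the default policy of each built-in chain (this is the format iptables -S prints), and with all three set to ACCEPT and no -A rules the table is not filtering anything. A small sketch of that check follows; the function name ruleset_verdict is invented for illustration, and it only looks at whatever text is piped in, so rules held elsewhere (other tables, or native nftables rules) would not be seen.

```shell
#!/bin/bash
# Read `iptables -S` style output on stdin.
# Prints "open" when every chain policy is ACCEPT and there are no appended
# rules, otherwise "review" (something may be filtering traffic).
ruleset_verdict() {
    local verdict=open line
    while read -r line; do
        case "$line" in
            "-P "*" ACCEPT") ;;                 # permissive default policy
            "-P "*|"-A "*) verdict=review ;;    # DROP/REJECT policy or an explicit rule
        esac
    done
    echo "$verdict"
}

# The three lines quoted in the post above.
printf '%s\n' '-P INPUT ACCEPT' '-P FORWARD ACCEPT' '-P OUTPUT ACCEPT' | ruleset_verdict
```

On the Pi itself this would be sudo iptables -S | ruleset_verdict. If the nft tool happens to be installed, sudo nft list ruleset is another place to look, though that is an assumption about the image rather than something confirmed here.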

I think that because I’ve logged a response with curl ifconfig.co/json the router would have “seen” the outbound request, as well as the inbound response?

The static range isn’t within the dhcp range, dhcp is only issuing 192.168.1.100 to 192.168.1.199.

So as of now where I’m at:

  1. all connectivity works over ethernet
  2. outbound connectivity works over wifi
  3. inbound connectivity (e.g. ping, ssh) isn’t working over wifi

I know this is going to be something totally trivial that’s going to have me face-palming when I get to the bottom of it :rofl:

Because all connections are made with the LAN connectors of your router, right? So they act as a switch and you see the devices at the MAC address level.
To see the WiFi devices too, you need to add the WiFi adapter (on the router) to a bridge that also contains the LAN ports.

What router are you using? Is this router connected directly to the providers router (wan port)?

That is because it is probably blocked by your router. Ping uses the ICMP protocol. You can use the command traceroute -T instead, which uses TCP.
If you want to test like ping with traceroute you can use traceroute -I to verify (ICMP).

I’ve added a picture of the OSI model to show how the transport of data packets works. It shows the 7 layers. As long as you work on the first two layers (counting from the bottom up) you use the physical address of your devices, which means the MAC address.

I do hope this helps a bit.

Can you connect a monitor to the Pi and view what’s going on on the console? That would probably make debugging this much easier.

Since everything else seems to work I would look at the router itself. You also noted that the wifi may be using a random MAC and it seems possible that could be the cause. With workstation on the Pi I do not have wifi set to randomize the mac and I never have seen a problem of any kind.

My thoughts are that the outgoing packets are seen by the router and the replies are returned to the same mac as the connection originated from. However, the router may possibly not store that and when a connection from another source is attempted the router does not know where to send it to.

That may be due to the router itself, or due to the way the random mac is instituted on the Pi with coreOS. Theoretically the random mac should be locked into the network config of the Pi until it is rebooted and is changed, (so the router can always find that mac when it attempts to connect) but I do not use core so I have no direct experience there.

This may boil down to needing to do a tcpdump on the interface (on the pi and on the other host that cannot connect) until you can actually trace where it breaks and identify the cause.

Edit:
I just remembered that on one of my Pi’s the default MAC for the wifi and the ethernet are the same. Having both attached at the same time to the same router could then result in 2 different IPs for the same mac on different interfaces. That could definitely cause problems in communication. ip a would show the actual mac for the interfaces.

I would suggest that you do not connect both at the same time, and/or never connect the ethernet first.
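A quick way to compare the two MAC addresses without reading through the whole ip a output is to look at sysfs. This is just a sketch; the function names and the optional third argument (an alternative sysfs root, there only so the functions can be tried without real hardware) are made up.

```shell
#!/bin/bash
# Print the MAC address of an interface as reported by sysfs.
# Usage: mac_of IFACE [SYSFS_ROOT]
mac_of() {
    cat "${2:-/sys/class/net}/$1/address"
}

# Succeed if the two interfaces currently report the same MAC address.
# Usage: same_mac IFACE1 IFACE2 [SYSFS_ROOT]
same_mac() {
    [ "$(mac_of "$1" "$3")" = "$(mac_of "$2" "$3")" ]
}

# Only meaningful on a machine that actually has both interfaces.
if [ -e /sys/class/net/eth0/address ] && [ -e /sys/class/net/wlan0/address ]; then
    if same_mac eth0 wlan0; then
        echo "eth0 and wlan0 share a MAC address"
    else
        echo "eth0 and wlan0 have different MAC addresses"
    fi
fi
```

Note that this shows the address currently in use, so with wifi MAC randomisation active the value for wlan0 may differ from the hardware address.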

Some router firmware with specific wireless configuration parameters can make wireless clients implicitly isolated by default, rejecting client-to-client traffic.

A big thank you to everyone that has contributed so far. Your thoughts and ideas have certainly helped me.

As of today I have almost solved the challenges.

I have created an active/backup team0 interface in Network Manager with a static ip of 192.168.1.10 containing eth0 on dhcp and wlan0 on dhcp, both with autoconnect enabled.

This provides me with a static address for the pi, and should correctly fail over between eth0 and wlan0 if I nmcli connection down eth0 for example.

The approach however has given me one little niggle which I’m currently at a loss to explain. If I physically remove the ethernet cable, or systemctl reboot, it (eth0) does not actually come back up.

nmcli connection show suggests it is up:

NAME         UUID                                  TYPE      DEVICE 
team0        dbce82d5-c824-45a1-b28c-de081aeb3fe9  team      team0  
team0_eth0   811d22bf-d6b5-42cf-854f-328301d879bd  ethernet  eth0   
team0_wlan0  d0af6af8-a819-442d-ab76-3c61d862f851  wifi      wlan0  
eth0         e2cfd23f-01b7-32bc-807e-76e9dbbb6d0f  ethernet  --     
wlan0        b7591e3d-f2b0-47df-8d94-732899ad387e  wifi      -- 

However looking on the physical device there are no lights at the ethernet port.

$ nmcli connection down team0_eth0
Connection 'team0_eth0' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/3)

I get lights on the port here but no connectivity/flashing

$ nmcli connection up team0_eth0
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/4)

Lights go back off.

So it seems the autoconnect of eth0 is not working when included in a team?

Has anyone experienced this / got any suggestions?

Thanks!

It would be ideal if you didn’t have to do any of this and it would all just work :slight_smile:

Though I will say, if I was going to go down the route you did I would use a bond instead of a team. Teaming has a userspace component that might need toggling here. Bonding, IIUC, is all done in the kernel.
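For anyone wanting to try the bond route, a rough sketch of what the nmcli commands might look like. Everything here is an assumption to adapt: the profile names, the SSID, passphrase and gateway are placeholders, and the bond options are just a common active-backup starting point, not something tested on a Pi. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# Print each command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Placeholder values (hypothetical): replace with the real network details.
SSID="example-ssid"
PSK="example-passphrase"
GATEWAY="192.168.1.1"

create_bond() {
    # The bond itself carries the static address; the port profiles
    # carry no IP configuration of their own.
    run nmcli connection add type bond con-name bond0 ifname bond0 \
        bond.options "mode=active-backup,miimon=100,primary=eth0" \
        ipv4.method manual ipv4.addresses 192.168.1.10/24 ipv4.gateway "$GATEWAY" \
        connection.autoconnect-slaves 1
    run nmcli connection add type ethernet slave-type bond \
        con-name bond0-eth0 ifname eth0 master bond0
    run nmcli connection add type wifi slave-type bond \
        con-name bond0-wlan0 ifname wlan0 master bond0 \
        ssid "$SSID" wifi-sec.key-mgmt wpa-psk wifi-sec.psk "$PSK"
}

# Stop the original standalone profiles from competing for the devices.
disable_standalone_profiles() {
    run nmcli connection modify eth0 connection.autoconnect no
    run nmcli connection modify wlan0 connection.autoconnect no
}

create_bond
```

The connection.autoconnect-slaves setting asks NetworkManager to bring the ports up together with the bond, which may or may not be related to a port not coming back after a reboot; it is a guess worth testing rather than a known fix.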

Thanks again.

I’ve tried both bonding and teaming; in both cases the wifi element disconnects and reconnects as expected and the ethernet doesn’t.

The only difference physically is that when bonded the ethernet port’s lights come back on even though it doesn’t connect, whereas when teamed they don’t.

I’m wondering if this is more of a driver issue, so my next test is going to be to rebuild it using the u-boot method rather than EDK, which I’m currently using, and see if that makes a difference. I’m not expecting it will, but it’ll be an interesting experiment either way.

Cheers