I have several F36/F37 machines running NAT, and after the upgrade some of them no longer resolve DNS. I can ping from DHCP clients, and I can also run dig/nslookup from clients, but web pages aren’t accessible due to DNS errors.
I had to change named.conf, to exclude
Interestingly, it still works on some machines, and I can’t find the difference in configuration that is causing this.
I’ve also disabled the firewall, with the same result.
Routing seems OK; it hasn’t changed and was working perfectly in F36 on all machines.
I’ve also noticed that a couple of machines don’t let me connect to the localhost DNS: dig fb.com @localhost reports unreachable, so I used the local IP instead and it works, i.e. dig fb.com @192.168.10.1. DHCP clients also get this same IP as router and DNS.
systemd-resolved can optionally do DNSSEC validation, so if this is not properly configured in the nameserver, lookups on the hosts will fail. But systemd-resolved was also present in F36, so it’s unclear why this happens only on some machines.
There is no reason why dig fb.com @localhost would work, unless the client has a dnsmasq cache or systemd-resolved configured with DNSStubListenerExtra=127.0.0.1.
systemd-resolved listens on 127.0.0.53, so dig fb.com @127.0.0.53 works.
The command to check systemd-resolved is “resolvectl”; its config is /etc/systemd/resolved.conf.
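To see which local address actually answers on port 53 without depending on dig, a minimal sketch in Python could send one plain UDP query to each candidate. The function names `build_query` and `probe` are made up for this illustration, and the packet is a bare A query without EDNS or cookies, so it is not identical to what dig 9.18 sends.

```python
import socket
import struct
import time


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet for an A record (no EDNS)."""
    # Header: id, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, name: str = "fb.com", timeout: float = 2.0) -> str:
    """Send one UDP query to server:53 and describe what came back."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return f"{server}: timed out after {timeout:.1f}s"
        except OSError as exc:
            # For example "connection refused" when nothing listens there.
            return f"{server}: {exc}"
    elapsed_ms = (time.monotonic() - start) * 1000
    if len(reply) < 4:
        return f"{server}: reply too short to parse"
    rcode = reply[3] & 0x0F  # low 4 bits of the second flags byte
    return f"{server}: rcode={rcode} in {elapsed_ms:.0f} ms"
```

Calling `print(probe("127.0.0.1"))` and `print(probe("127.0.0.53"))` on the affected machine would show whether each address answers, refuses, or times out; rcode 0 means NOERROR and 2 means SERVFAIL.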
resolv.conf is linked properly so this is not an issue.
But maybe there is an issue with 127.0.0.1, as it looks like bind runs on 127.0.0.53, and I keep changing this in resolv.conf (nameserver 127.0.0.1) and also have an entry in named.conf:
listen-on port 53 { 127.0.0.1; 192.168.10.1; 1.1.1.1; };
So should I make it listen on 127.0.0.53 instead of 127.0.0.1?
Interestingly, if I just run, for example, dig fb.com, it displays an answer from 127.0.0.1, which I have set up as the default DNS.
Is bind used as a caching nameserver? Then it is somewhat redundant, because systemd-resolved already caches lookups. If it serves an internal domain, that is another story.
You should not change this in /etc/resolv.conf, because it is a dynamically generated file. Instead, point systemd-resolved at the server on 127.0.0.1 by entering DNS=127.0.0.1 in /etc/systemd/resolved.conf (DNS= sets the upstream server; it does not change where systemd-resolved itself listens). I tested it with dnsmasq and systemd-resolved coexisting and it works: dig @127.0.0.1 and dig @127.0.0.53 both work.
Take care that the nameserver is not overridden per interface by DHCP; check with resolvectl.
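To compare the resolved settings across several machines, a small sketch like the following could list the active keys of the [Resolve] section. The sample text is an assumed example, not taken from any of the machines in this thread, and `resolve_settings` is a hypothetical helper name.

```python
import configparser

# Assumed sample content of /etc/systemd/resolved.conf, for illustration only.
SAMPLE = """\
[Resolve]
DNS=127.0.0.1
#DNSStubListenerExtra=
"""


def resolve_settings(text: str) -> dict:
    """Return the active (uncommented) keys of the [Resolve] section."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key case as written: DNS, not dns
    parser.read_string(text)
    if not parser.has_section("Resolve"):
        return {}
    return dict(parser["Resolve"])


print(resolve_settings(SAMPLE))
```

On a real machine the text would come from reading /etc/systemd/resolved.conf; drop-in files under /etc/systemd/resolved.conf.d/ are not covered by this sketch.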
$ ip route show
default via 192.168.4.1 dev wlp4s0 proto dhcp src 192.168.4.111 metric 600
192.168.4.0/22 dev wlp4s0 proto kernel scope link src 192.168.4.111 metric 600
192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1 linkdown
Without doing anything manually, access just works.
I also use an /etc/hosts file that is default except for added hosts on the local LAN.
Oops, I dug a bit into the named reference manual. listen-on has nothing to do with upstream nameservers; it defines the addresses on which the program listens. So 127.0.0.1 is fine and dig @127.0.0.1 should work, 192.168.10.1 should be the address of the LAN interface, and 1.1.1.1 looked impossible to me because it is a foreign address (that confused me), unless you want to prevent Android clients from phoning Google by adding 1.1.1.1 to the LAN interface addresses.
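As a rough illustration of that reading of listen-on, the sketch below pulls the addresses out of a listen-on statement and labels each one as loopback, private, or public. The helper names are hypothetical, only literal IP addresses are handled (keywords such as any or acl names are skipped), and the labels are just my interpretation of what each address is typically used for.

```python
import ipaddress
import re


def listen_on_addresses(config_text: str) -> list:
    """Extract literal IP addresses from the first listen-on statement."""
    # Require whitespace after the keyword so listen-on-v6 is not matched.
    match = re.search(r"listen-on\s[^{]*\{([^}]*)\}", config_text)
    if match is None:
        return []
    addresses = []
    for item in match.group(1).split(";"):
        item = item.strip()
        if not item:
            continue
        try:
            addresses.append(ipaddress.ip_address(item))
        except ValueError:
            continue  # not a literal address (for example "any")
    return addresses


def classify(address) -> str:
    """Label an address by the kind of client that can reach it."""
    if address.is_loopback:
        return "loopback (queries from this host only)"
    if address.is_private:
        return "private (LAN clients)"
    return "public (must be configured on an interface of this host)"


line = "listen-on port 53 { 127.0.0.1; 192.168.10.1; 1.1.1.1; };"
for addr in listen_on_addresses(line):
    print(addr, "->", classify(addr))
```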
systemd-resolved operates on 127.0.0.53, so this should not conflict.
Take care that client DNS requests are NOT using /etc/resolv.conf, but are directed by /etc/nsswitch.conf to systemd-resolved, so check “resolvectl query ” instead of dig.
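To check that ordering on a given machine, a minimal sketch could read the hosts: line of /etc/nsswitch.conf and list the lookup sources in order. The sample line is an assumption modelled on a typical Fedora default, and `hosts_sources` is a hypothetical helper.

```python
import re

# Assumed sample, for illustration; a real file may list other sources.
SAMPLE = """\
passwd:     files systemd
hosts:      files myhostname resolve [!UNAVAIL=return] dns
"""


def hosts_sources(nsswitch_text: str) -> list:
    """Return the lookup sources on the hosts: line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove bracketed actions such as [!UNAVAIL=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


sources = hosts_sources(SAMPLE)
print(sources)
print("systemd-resolved is consulted via NSS:", "resolve" in sources)
```

If resolve appears before dns, ordinary applications go through systemd-resolved, while dig talks straight to the server it is given or finds in /etc/resolv.conf, so the two can give different results.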
And if you have a nice local nameserver, set DNS=127.0.0.1 in /etc/systemd/resolved.conf on that machine, and prevent DHCP from adding nameservers by using “DHCP addresses only” in NetworkManager.
Something else: it might be that startup of systemd-resolved has to be delayed “after” named/bind in systemd. But I’ve no idea what consequences this has for other services. For the rest: check the logs to see whether everything comes up in a working state.
Thanks for the update. You are right, dig xxx @127.0.0.1 works. I’ve tried replacing 127.0.0.1 with 127.0.0.53 in listen-on and also as the default nameserver, but that didn’t solve the problem. The IP 1.1.1.1 stands for the public IP of the server (I’ve changed it in the post), as it also runs public DNS for my domains.
There must be some more changes between bind versions 9.16 and 9.18. I’ve read through the release notes, but didn’t find a connection.
I’ve also managed to get some resolving through by modifying named.conf to define an acl rather than listing ranges in allow-query. I still get DNS timeouts for some websites.
Thanks, that explains the confusion. And I made a second mistake: 1.1.1.1 is not Google but Cloudflare DNS. If named does all lookups by itself, via forwarders or the root servers, it is autonomous, and the problem indeed points in the direction of named.
The listen-on should definitely point to 127.0.0.1; 127.0.0.53 is owned by systemd-resolved.
A rough test could be stopping and masking systemd-resolved, causing DNS on the machine itself to fail. If the problems with remote DNS stay the same, systemd-resolved is not involved…
I’m still struggling with DNS issues. I have 6+ F37 servers running in different networks; some work fine and some don’t. They all use named as local DNS and also act as secondary DNS for some of the domains that I host.
I even did two podman installs of F36 with named 9.16, and on one server DHCP clients now work fine, but on the other one they don’t. I keep all named.conf files nearly identical, so I can’t figure out what is causing this timeout. But even though the server is version 9.16, the client (dig) is still version 9.18, so it’s not a full solution anyway.
For example, when a client presses F5 (refresh web page) a few times, it eventually gets the proper DNS name and the page shows up.
I’ve also noticed that when I run dnf -y up on an F37 server, it times out on many mirrors, so obviously something is blocking domain name resolution, and it’s not the firewall. Even when I manually dig somedomain.com I get a timeout, but normally I get the result the second time.
I’ve played with resolvectl status, but I mostly get the same output on all servers. The resolve service is also active, so probably something within the named configuration is causing problems.
Yes, that is a typical /etc/resolv.conf for a DHCP client.
I’ve tried putting in
nameserver 127.0.0.53
or
nameserver 127.0.0.1
or my local ip but not much changed.
I understand that putting nameserver 127.0.0.53 will use the DNS from the DHCP server, which would be my ISP, but then my DNS will not be responding to requests.
Actually I think the local (named) name server should be listening on the local LAN IP of the name server and be functioning as both local name server and caching name server. When it fails to provide the address then it should forward the request to an external name server.
Your playing with resolv.conf and the like is related to the local machine and not to the config for named.
Some things to consider are the scope of services provided by dns (systemd-resolved) on the local machine (local only) and named (dns services to external clients). Each needs to be configured appropriately for the scope of services it covers.
If you have a relatively small LAN where you are using dhcp (even with different subnets) and have control of the routers it would be simpler to have the dhcp server (router) control assigning the name server addresses and possibly even all the routing to those machines. Then named on the name server(s) controls dns services for the external clients and systemd-resolved manages dns on the local machine and passes on to the designated next point (the router) when it does not have the address requested.
Would it not be easier to manage if you used one host as name server for your entire network?
No, as I have multiple small networks in different locations. It doesn’t make sense to have separate servers for individual services, so my F37 box does it all (routing, firewall, DHCP, DNS, SMB, FTP, mail, HTTP).
I always include 127.0.0.1, local-ip, and public-ip in listen-on, and it worked fine for many years, and very quickly too. The DHCP server sends my local-ip and public-ip to its clients for DNS.
I thought that when a DHCP client asks DNS on local-ip, then the server responds first from local-ip and then proceeds with the other DNS servers listed in resolv.conf.
I believe they are configured properly, as it has been working flawlessly more or less since F20, or longer, on most setups. Every once in a while some directive in named.conf has to be updated; with F37 (named 9.18), those:
were commented out to be able to start named.
But I’m not even sure any more whether it is really named that is giving me timeout issues, or resolved, or the firewall, or something else.
Clients do not connect to a name server on another machine via 127.0.0.1 (localhost). Instead they connect to the actual LAN IP address of that name server. If the name server has an IP of 192.168.100.1, then the clients that use it connect to 192.168.100.1:53 for DNS service.
This is what I meant when I said that the client connects to the LAN IP of the name server and that named should listen on the LAN IP of the name server for client connections.
The 127.0.0.1 address is only for local connections within that specific host.
In fact, with dhcp and using systemd the nameserver as provided by the dhcp server is found in /run/systemd/resolve/resolv.conf. This means that each local host will first look to 1) the hosts file, then 2) the name server assigned by the dhcp server.
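A short sketch for comparing what the two files say on a given host: it lists the nameserver entries in /etc/resolv.conf and in /run/systemd/resolve/resolv.conf. The helper names are hypothetical, and a missing file is reported as None.

```python
from pathlib import Path


def nameservers(text: str) -> list:
    """Return the addresses on nameserver lines, in file order."""
    servers = []
    for line in text.splitlines():
        fields = line.split()
        # Commented lines start with '#' or ';' and are skipped by this test.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def report(paths=("/etc/resolv.conf", "/run/systemd/resolve/resolv.conf")) -> dict:
    """Map each path to its nameserver list, or None if the file is absent."""
    result = {}
    for name in paths:
        path = Path(name)
        result[name] = nameservers(path.read_text()) if path.is_file() else None
    return result


for name, servers in report().items():
    print(name, "->", servers)
```

If the first file shows only 127.0.0.53, it is the stub file managed by systemd-resolved, and the second file shows the upstream servers that resolved actually received.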
I am not sure you understand the differences between systemd-resolved and named.
Systemd-resolved resolves DNS requests for the local host only and works with a loopback address (127.0.0.53 by default). It then directs external requests as appropriate to the specified name server. Named provides DNS services for external clients and thus must listen on the LAN IP for incoming requests from the external hosts, and must have a valid name list for resolution of local DNS requests as well as connections to an external name server for worldwide DNS resolution. Named can be a purely caching name server for DNS addresses beyond the local LAN, but needs configuration to provide those services for hosts that do not have an internet address and DNS services. Systemd-resolved cannot provide external services, since by default it is limited to the local host only.
The scope of services is widely different.
In general if there are only a small number of hosts involved it would be simple to configure the /etc/hosts file on each machine for local dns and rely solely on resolved to provide external dns. If that is too complicated to achieve then a local name server as you indicate is the next step. But again, configuring named is totally different in scope of service and address it listens on to provide those services than with resolved.
If you’re running a DNS server, you’ll need to disable systemd-resolved before setting up BIND or Unbound instead.
If this is still a valid statement, you can stop, disable and mask systemd-resolved, create a classical /etc/resolv.conf pointing to upstream nameservers and see whether the system behaves better. For completeness you can remove
“resolve [!UNAVAIL=return]” from /etc/nsswitch.conf.
Thank you for that update.
I was not aware they were incompatible, but did know as stated above that there is a wide difference in purpose and scope.
You can replace the link at /etc/resolv.conf with an actual file and probably overcome the incompatibility of function, since that link leads to the dynamic tools run by systemd. This fix has been discussed many times and seems to work, although I am not aware of any earlier discussion about using named (bind) on a system running systemd-resolved.
I’ve stopped resolved (systemctl stop systemd-resolved.service), and also restarted named. My /etc/resolv.conf stayed with the proper NS, but I still experience timeouts.
See the output (it only responds normally on the 3rd or 4th attempt). So disabling resolved didn’t make any difference.
[root@z4 etc]# dig fb.com
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: timed out
; <<>> DiG 9.18.8 <<>> fb.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 16568
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 7f80f7abcaf31fad0100000063a224adba1f530180f893e8 (good)
;; QUESTION SECTION:
;fb.com. IN A
;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Tue Dec 20 22:10:05 CET 2022
;; MSG SIZE rcvd: 63
[root@z4 etc]# dig fb.com
;; communications error to 127.0.0.1#53: timed out
; <<>> DiG 9.18.8 <<>> fb.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13741
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 3c573c33c07203a90100000063a224b9d4dd5b883e680276 (good)
;; QUESTION SECTION:
;fb.com. IN A
;; Query time: 4994 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Tue Dec 20 22:10:17 CET 2022
;; MSG SIZE rcvd: 63
[root@z4 etc]# dig fb.com
;; communications error to 127.0.0.1#53: timed out
; <<>> DiG 9.18.8 <<>> fb.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25081
;; flags: qr rd ra; QUERY: 1, ANSWER: 17, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 1f0e4c241a2e6d1b0100000063a224cc67b8bdee8541f748 (good)
;; QUESTION SECTION:
;fb.com. IN A
;; ANSWER SECTION:
fb.com. 300 IN A 31.13.65.36
fb.com. 300 IN A 157.240.254.35
fb.com. 300 IN A 31.13.93.35
fb.com. 300 IN A 157.240.11.35
fb.com. 300 IN A 157.240.229.35
fb.com. 300 IN A 31.13.66.35
fb.com. 300 IN A 157.240.14.35
fb.com. 300 IN A 157.240.3.35
fb.com. 300 IN A 157.240.249.35
fb.com. 300 IN A 157.240.24.35
fb.com. 300 IN A 157.240.22.35
fb.com. 300 IN A 157.240.19.35
fb.com. 300 IN A 31.13.67.35
fb.com. 300 IN A 157.240.241.35
fb.com. 300 IN A 31.13.88.35
fb.com. 300 IN A 31.13.70.36
fb.com. 300 IN A 31.13.71.36
;; Query time: 1152 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Tue Dec 20 22:10:36 CET 2022
;; MSG SIZE rcvd: 335
[root@z4 etc]# dig fb.com
; <<>> DiG 9.18.8 <<>> fb.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34045
;; flags: qr rd ra; QUERY: 1, ANSWER: 17, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: ca843369ffbdd79d0100000063a224d1209b36fd0340c5ce (good)
;; QUESTION SECTION:
;fb.com. IN A
;; ANSWER SECTION:
fb.com. 295 IN A 31.13.67.35
fb.com. 295 IN A 157.240.241.35
fb.com. 295 IN A 31.13.66.35
fb.com. 295 IN A 31.13.65.36
fb.com. 295 IN A 31.13.88.35
fb.com. 295 IN A 157.240.14.35
fb.com. 295 IN A 31.13.93.35
fb.com. 295 IN A 157.240.19.35
fb.com. 295 IN A 31.13.71.36
fb.com. 295 IN A 157.240.3.35
fb.com. 295 IN A 157.240.11.35
fb.com. 295 IN A 157.240.24.35
fb.com. 295 IN A 31.13.70.36
fb.com. 295 IN A 157.240.229.35
fb.com. 295 IN A 157.240.254.35
fb.com. 295 IN A 157.240.249.35
fb.com. 295 IN A 157.240.22.35
;; Query time: 8 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Tue Dec 20 22:10:41 CET 2022
;; MSG SIZE rcvd: 335
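Output like this is easier to compare across servers when reduced to one line per run. A minimal sketch, assuming the dig text has been saved to a string: it counts the "timed out" lines before each answer and records the status and query time. `summarise_dig` is a hypothetical helper, and it relies on each run ending with a MSG SIZE line as in the output above.

```python
import re


def summarise_dig(transcript: str) -> list:
    """Summarise consecutive dig runs as dicts of timeouts, status, query_ms."""
    runs = []
    current = {"timeouts": 0, "status": None, "query_ms": None}
    for line in transcript.splitlines():
        if "communications error" in line and "timed out" in line:
            current["timeouts"] += 1
        elif (m := re.search(r"status: (\w+)", line)):
            current["status"] = m.group(1)
        elif (m := re.search(r"Query time: (\d+) msec", line)):
            current["query_ms"] = int(m.group(1))
        elif "MSG SIZE" in line:
            # The MSG SIZE line closes one dig run; start a fresh record.
            runs.append(current)
            current = {"timeouts": 0, "status": None, "query_ms": None}
    return runs


sample = """\
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: timed out
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 16568
;; Query time: 1 msec
;; MSG SIZE rcvd: 63
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34045
;; Query time: 8 msec
;; MSG SIZE rcvd: 335
"""
for run in summarise_dig(sample):
    print(run)
```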
It’s probably just a small configuration change needed at some place… but where?
It is some sort of timeout issue: regardless of whether I dig ab.cd @127.0.0.1, dig ab.cd @192.168.X.1, or my public IP, or even dig it from different servers, it times out a few times and then works normally.
Maybe running named in the foreground with some debug level enabled gives info about what’s going on? It has to do some work to collect information from scratch; maybe this shows the problem. With “-d” you define a debug level, “-f” prevents going to the background, and “-g” activates logging on the terminal. I don’t have significant experience with this program, but I assume this should give some live view of what’s going on…
Thanks. When I look through the existing log there are just so many entries that it’s hard to follow, mostly timeouts:
Dec 21 09:30:42 z4 named[1962438]: timed out resolving 'api.mixpanel.com/AAAA/IN': 2001:4860:4802:36::6a#53
Dec 21 09:30:45 z4 named[1962438]: timed out resolving 'api.mixpanel.com/A/IN': 2001:4860:4802:38::6a#53
Dec 21 09:30:45 z4 named[1962438]: timed out resolving 'nlb-sn-b4adbf8f03275516.elb.us-east-1.amazonaws.com/AAAA/IN': 2600:9000:5301:1900::1#53
Dec 21 09:30:48 z4 named[1962438]: timed out resolving 'nlb-sn-b4adbf8f03275516.elb.us-east-1.amazonaws.com/AAAA/IN': 2600:9000:5305:600::1#53
Dec 21 09:30:59 z4 named[1962438]: timed out resolving 'api.mixpanel.com/A/IN': 2001:4860:4802:32::6a#53
Dec 21 09:31:20 z4 named[1962438]: timed out resolving 'contile.services.mozilla.com/AAAA/IN': 2600:9000:5306:7700::1#53
Dec 21 09:31:20 z4 named[1962438]: timed out resolving 'contile.services.mozilla.com/A/IN': 2600:9000:5306:7700::1#53
Dec 21 09:31:23 z4 named[1962438]: timed out resolving 'contile.services.mozilla.com/AAAA/IN': 2600:9000:5301:200::1#53
Dec 21 09:31:24 z4 named[1962438]: timed out resolving 'contile.services.mozilla.com/A/IN': 2600:9000:5301:200::1#53
Dec 21 09:31:26 z4 named[1962438]: timed out resolving 'contile.services.mozilla.com/AAAA/IN': 2600:9000:5305:bf00::1#53
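With that many entries, it can help to tally the "timed out resolving" lines by the address family of the upstream server instead of reading them one by one. A minimal sketch, assuming the log text is available as a string (for example from journalctl output); `timeouts_by_family` is a hypothetical helper.

```python
import ipaddress
import re
from collections import Counter

# Matches: timed out resolving 'name/TYPE/IN': <address>#<port>
PATTERN = re.compile(r"timed out resolving '([^']+)': ([0-9A-Fa-f:.]+)#\d+")


def timeouts_by_family(log_text: str) -> Counter:
    """Count resolver timeouts per upstream address family (IPv4 or IPv6)."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        upstream = ipaddress.ip_address(match.group(2))
        counts[f"IPv{upstream.version}"] += 1
    return counts


log = """\
Dec 21 09:30:42 z4 named[1962438]: timed out resolving 'api.mixpanel.com/AAAA/IN': 2001:4860:4802:36::6a#53
Dec 21 09:30:45 z4 named[1962438]: timed out resolving 'api.mixpanel.com/A/IN': 2001:4860:4802:38::6a#53
"""
print(timeouts_by_family(log))
```

If nearly all timeouts fall under IPv6, that would suggest checking whether the server has working IPv6 connectivity at all.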
I’ve tried enabling only IPv4, which decreased the number of errors: /usr/sbin/named -4 -u named -d 8 -f -c /etc/named.conf
But some domains show an error, whereas if I dig them from the ISP’s DNS it works, e.g. dig dbs.si:
Dec 21 09:59:46 z4 named[2446379]: managed-keys-zone: DNSKEY set for zone '.' could not be verified with current keys
Dec 21 10:05:26 z4 named[2446379]: DNS format error from 195.47.224.12#53 resolving dbs.si/A for 127.0.0.1#35523: server sent FORMERR
Dec 21 10:05:26 z4 named[2446379]: received FORMERR resolving 'dbs.si/A/IN': 195.47.224.12#53
I’ve seen a post about managed-keys-zone not refreshing, so I moved /var/named/dynamic/managed-keys.bind and /var/named/dynamic/managed-keys.bind.jnl away and restarted named, but there was no change.