I have a slightly strange DNS issue which I am fairly sure is related to DNSSEC. Essentially, a small number of sites time out when resolved via localhost but succeed when resolved via Google or with dnssec-validation disabled. The vast majority of lookups are fine; only a tiny handful "fail".
As an example, the following is OK (just to show that resolving locally does indeed work):
byteplayer:~=> dig @127.0.0.1 google.com
; <<>> DiG 9.18.28 <<>> @127.0.0.1 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21423
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 42d13e382d75031c010000006704e99d828b86e7c527deb2 (good)
;; QUESTION SECTION:
;google.com. IN A
;; ANSWER SECTION:
google.com. 18 IN A 142.250.178.14
;; Query time: 27 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Tue Oct 08 09:13:17 BST 2024
;; MSG SIZE rcvd: 83
If I try to resolve www.t3.com, though, I get:
byteplayer:~=> dig @127.0.0.1 www.t3.com
;; communications error to 127.0.0.1#53: timed out
;; communications error to 127.0.0.1#53: timed out
; <<>> DiG 9.18.28 <<>> @127.0.0.1 www.t3.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 64176
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 8e039ae4ee562111010000006704ea867c61cb0033eba818 (good)
;; QUESTION SECTION:
;www.t3.com. IN A
;; Query time: 2023 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Tue Oct 08 09:17:10 BST 2024
;; MSG SIZE rcvd: 67
But if I resolve that directly via Google's DNS, it works:
byteplayer:~=> dig @8.8.8.8 www.t3.com
; <<>> DiG 9.18.28 <<>> @8.8.8.8 www.t3.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 182
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;www.t3.com. IN A
;; ANSWER SECTION:
www.t3.com. 256 IN CNAME trhmb96.ng.impervadns.net.
trhmb96.ng.impervadns.net. 30 IN A 45.223.102.77
;; Query time: 20 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Tue Oct 08 09:18:03 BST 2024
;; MSG SIZE rcvd: 94
I think it's something to do with DNSSEC, because if I turn off dnssec-validation then all is good:
byteplayer:~=> dig @127.0.0.1 www.t3.com
; <<>> DiG 9.18.28 <<>> @127.0.0.1 www.t3.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58827
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 3e9b58f47f0e41cc010000006704eb355daf3401dd7adfb7 (good)
;; QUESTION SECTION:
;www.t3.com. IN A
;; ANSWER SECTION:
www.t3.com. 177 IN CNAME trhmb96.ng.impervadns.net.
trhmb96.ng.impervadns.net. 23 IN A 45.223.102.77
;; Query time: 25 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Tue Oct 08 09:20:05 BST 2024
;; MSG SIZE rcvd: 122
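One way to test the validation theory without editing named.conf might be to send the same query twice to the local resolver, once normally and once with dig's `+cd` option, which sets the Checking Disabled bit and asks the resolver not to validate that one query. This is only a sketch: the function name `compare_cd` and the `DIG` override variable are made up for illustration, and the timeout values are arbitrary.

```shell
#!/bin/sh
# DIG may be overridden (for example DIG=echo) to preview the commands
# without sending any queries. This variable is an illustrative assumption.
DIG="${DIG:-dig}"

# compare_cd NAME [SERVER]
# Hypothetical helper: queries NAME at SERVER with and without validation.
compare_cd() {
    name="$1"
    server="${2:-127.0.0.1}"
    # Normal query: the resolver validates before answering.
    "$DIG" "@$server" "$name" A +time=5 +tries=1
    # +cd sets the Checking Disabled bit, asking the resolver to skip
    # DNSSEC validation for this query only.
    "$DIG" "@$server" "$name" A +cd +time=5 +tries=1
}
```

Calling `compare_cd www.t3.com` and finding that only the `+cd` query returns NOERROR would point at validation rather than at forwarding in general.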
I have tried increasing logging levels, which doesn't really help:
08-Oct-2024 09:25:26.472 queries: info: client @0x7f537aa2e168 127.0.0.1#50144 (www.t3.com): query: www.t3.com IN A +E(0)K (127.0.0.1)
08-Oct-2024 09:25:31.474 queries: info: client @0x7f537aa30168 127.0.0.1#53253 (www.t3.com): query: www.t3.com IN A +E(0)K (127.0.0.1)
08-Oct-2024 09:25:36.477 queries: info: client @0x7f537aa32168 127.0.0.1#38076 (www.t3.com): query: www.t3.com IN A +E(0)K (127.0.0.1)
08-Oct-2024 09:25:38.536 resolver: info: shut down hung fetch while resolving 'trhmb96.ng.impervadns.net/A'
08-Oct-2024 09:25:38.536 query-errors: info: client @0x7f537aa2e168 127.0.0.1#50144 (www.t3.com): query failed (operation canceled) for www.t3.com/IN/A at …/…/…/lib/ns/query.c:7842
08-Oct-2024 09:25:38.536 query-errors: info: client @0x7f537aa30168 127.0.0.1#53253 (www.t3.com): query failed (operation canceled) for www.t3.com/IN/A at …/…/…/lib/ns/query.c:7842
08-Oct-2024 09:25:38.536 query-errors: info: client @0x7f537aa32168 127.0.0.1#38076 (www.t3.com): query failed (operation canceled) for www.t3.com/IN/A at …/…/…/lib/ns/query.c:7842
08-Oct-2024 09:25:38.549 resolver: info: shut down hung fetch while resolving 'impervadns.net/DNSKEY'
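The "shut down hung fetch" lines seem to be the most informative part of this log, since they name the exact record the resolver gave up on (here a DNSKEY lookup as well as the A record). A rough sketch for pulling those names out of the log follows. The function name `hung_fetches` is made up, and the default path is the log file configured in my named.conf.

```shell
#!/bin/sh
# hung_fetches [LOGFILE]
# Hypothetical helper: lists the unique name/type pairs that named reported
# as hung fetches. The default path is the log file set in named.conf.
hung_fetches() {
    logfile="${1:-/var/named/data/named.log}"
    # Keep only the "name/TYPE" part after the message text, dropping any
    # surrounding quote characters, then de-duplicate.
    sed -n 's|.*shut down hung fetch while resolving [^A-Za-z0-9_]*\([A-Za-z0-9._-]*/[A-Z0-9]*\).*|\1|p' "$logfile" | sort -u
}
```

Running `hung_fetches` after reproducing a few failures should show whether the stuck lookups are always DNSSEC record types such as DNSKEY or DS.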
I realise there are various CNAMEs and so on in this example, but the essence is: WHY does resolving directly via Google work, while resolving locally (which also goes to Google!) fails?
My named.conf file looks as follows:
listen-on port 53 { any; };
version none;
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named_stats.txt";
memstatistics-file "/var/named/data/named_mem_stats.txt";
secroots-file "/var/named/data/named.secroots";
recursing-file "/var/named/data/named.recursing";
recursion yes;
allow-recursion { goodclients; };
allow-query { goodclients; };
allow-query-cache { goodclients; };
dnssec-validation auto;
managed-keys-directory "/var/named/dynamic";
pid-file "/run/named/named.pid";
session-keyfile "/run/named/session.key";
/* Changes/CryptoPolicy - Fedora Project Wiki */
include "/etc/crypto-policies/back-ends/bind.config";
forwarders {
8.8.8.8; # Google DNS
8.8.4.4; # Google secondary DNS
};
forward only; # Ensure BIND only forwards queries
};
/* logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};
};
*/
logging {
channel default_debug {
file "/var/named/data/named.log" versions 3 size 20m;
severity dynamic;
print-time yes;
print-severity yes;
print-category yes;
};
category default { default_debug; };
category queries { default_debug; };
category security { default_debug; };
};
zone "." IN {
type hint;
file "/var/named/named.ca";
};
include "/etc/named.rfc1912.zones";
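Because this configuration uses `forward only`, any DNSKEY lookups needed for validation also have to go through the two forwarders, so it may be worth asking them directly for the record that hung in the log. The sketch below does that over UDP and then over TCP. It rests on an untested guess that large DNSSEC responses (truncation, or TCP port 53 being blocked somewhere) are involved; nothing in the output above confirms that. The names `probe_dnskey`, `FORWARDERS` and `DIG` are made up for illustration.

```shell
#!/bin/sh
# DIG may be overridden (for example DIG=echo) to preview the commands
# without sending any queries. This variable is an illustrative assumption.
DIG="${DIG:-dig}"
# Forwarders copied from the named.conf above.
FORWARDERS="8.8.8.8 8.8.4.4"

# probe_dnskey ZONE
# Hypothetical helper: asks each forwarder for ZONE's DNSKEY set.
probe_dnskey() {
    zone="$1"
    for fwd in $FORWARDERS; do
        # UDP only: +ignore stops dig from retrying over TCP, so a
        # truncated reply shows up with the tc flag set.
        "$DIG" "@$fwd" "$zone" DNSKEY +dnssec +ignore +time=3 +tries=1
        # Same query forced over TCP.
        "$DIG" "@$fwd" "$zone" DNSKEY +dnssec +tcp +time=3 +tries=1
    done
}
```

For the log shown earlier this would be `probe_dnskey impervadns.net`. A UDP reply flagged `tc` together with a TCP query that times out would suggest the validator cannot retrieve the keys, while clean answers on both would rule that guess out.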
I have spent hours on this without making any progress. The strange thing is that the number of lookups that fail is really small. As this is just a caching name server with forwarding, I am at a loss as to why these few time out when DNSSEC validation is enabled. Another example is grd.bk.
Any guidance or advice would be appreciated, as would someone simply checking whether this problem exists on their own set-up.