Please consider reinstating dnsmasq in CoreOS

I run Docker Swarm clusters with a front-end entry-point/dispatch going through HAProxy on all Leader Nodes. I am using Swarm’s internal DNS that dynamically names services for service discovery by HAProxy.

When a service does not yet exist in Swarm, Swarm’s internal DNS server does not know the name, so it passes HAProxy’s service name queries to the upstream nameservers, which of course have no clue and can only return NXDOMAIN. Swarm also seems to query all of the nameservers available to it at the same time. For non-production Swarms that do not run all of the services, my network’s nameservers are hammered with queries that they can never resolve. Those NXDOMAIN responses quickly add up to thousands per second. As a systems admin, this level of inefficiency rubs me the wrong way, and it drowns out the legitimate queries in the query log on my nameservers. For developers who run their own development swarms, I fear that they might be hammering their ISP’s nameservers with queries for Swarm service names.

Unfortunately, as far as I can find, Docker has not allowed us to have any control knobs on its internal DNS server. I wish I could simply tell Swarm to never ask upstream nameservers for unqualified names, or names that look like Swarm service names, but I can’t. It is discouraging as I look through Docker’s boards for fixing issues with the internal DNS. It could take years for Docker to even consider allowing more control over its internal Swarm DNS server, so for now, I must hack around it.

I can’t put a caching nameserver inside my HAProxy image, because that is too high in the stack. It needs to get a real-time response for the current availability of services from Swarm, and it also can’t be barred from querying unqualified names, because that’s how most of them look.

I can’t put a caching nameserver as a service in the Swarm network, because Swarm does not let you use static IP addresses for anything on its overlay networks. Doing that creates a chicken-and-egg causality dependency, where Swarm needs to query a service name, but the service it is querying is the upstream naming service itself. I also don’t want to create yet another independent service that lives outside of Swarm and proliferate yet another configuration dependency and another point of failure.

The best solution that I can find is to run dnsmasq natively on my CoreOS nodes, telling it to always return NXDOMAIN for unqualified names without even asking upstream nameservers, and of course, a little bit of caching of external names is a nice bonus. Since this is such a basic network service, to do this cleanly means it needs to live outside of Docker because Docker depends upon it.

Fedora CoreOS uses NetworkManager, which has some pretty slick ways to integrate with dnsmasq. Simply setting dns=dnsmasq in NetworkManager’s configuration automatically shims in a caching DNS layer, gracefully supplanting the existing nameservers in /etc/resolv.conf and using the original nameservers for its upstream query forwarding. Adding domain-needed to the dnsmasq configuration then filters out unqualified names.
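
The filtering decision described above can be modelled in a few lines of Python. This is only a sketch of the behavior as described in this post, not dnsmasq's actual code: the names `is_unqualified`, `resolve`, `local_names`, and `forward` are hypothetical, and treating a trailing root dot as "still unqualified" is an assumption of the sketch.

```python
from typing import Callable, Dict

NXDOMAIN = "NXDOMAIN"


def is_unqualified(name: str) -> bool:
    """Return True for plain names such as 'web' that contain no dot.

    Assumption of this sketch: a single trailing root dot ('web.') is ignored.
    """
    return "." not in name.rstrip(".")


def resolve(name: str,
            local_names: Dict[str, str],
            forward: Callable[[str], str]) -> str:
    """Model of a local resolver that never forwards unqualified names.

    local_names: names the resolver already knows (hypothetical table).
    forward:     callback standing in for a query to upstream nameservers.
    """
    if name in local_names:
        # Known locally: answer without touching upstream.
        return local_names[name]
    if is_unqualified(name):
        # Unknown plain name: answer NXDOMAIN locally, upstream never sees it.
        return NXDOMAIN
    # Qualified name: hand it to the upstream nameservers.
    return forward(name)


if __name__ == "__main__":
    forwarded = []

    def fake_upstream(name: str) -> str:
        # Stand-in for a real upstream lookup; records what was forwarded.
        forwarded.append(name)
        return "192.0.2.10"

    print(resolve("my-swarm-service", {}, fake_upstream))
    print(resolve("example.com", {}, fake_upstream))
    print("forwarded upstream:", forwarded)
```

In this model, a query for a Swarm service name that does not exist yet is answered locally, so only qualified names ever reach the upstream nameservers.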

If I remember correctly, CoreOS originally had dnsmasq as part of its distribution because of issues just like this. But in the interest of reducing image size, dnsmasq was removed because it “seems like it could go.” But that breaks the slick integration that NetworkManager has with it. To me, it “seems like it should stay.” It should be considered a low-level network service that is part of NetworkManager.

I statically compiled dnsmasq, and the 64-bit binary it produced is a whopping 200KB in size (I only turned on DBUS for NetworkManager, every other bell and whistle is turned off). I believe that size is probably not an issue with dnsmasq. I added a configuration to ignition to download my static binary to /usr/local/sbin/dnsmasq, and NetworkManager magically found it and runs it as a plugin. This solved the problem, but I’m not happy with how hacky it is.

I had to do the same thing with snmpd, because running snmpd inside a container to get host metrics is an abstraction-layer nightmare. There are lots of hacks out there that try to run snmpd in a container with high privileges, but none of them work well enough for me. The snmpd daemon itself is light. My statically-compiled snmpd is only 2.7MB in size. We do not have to distribute MIBs and other SNMP bloat. It is also ubiquitous. Practically any embedded device out there provides an SNMP service. It is a mature, core, cross-platform network monitoring platform. Making it available in the raw OS on CoreOS would be advantageous for those of us who produce metrics from a vast number of heterogeneous devices.

Statically compiling stuff to get basic service binaries into a distribution that does not have a package manager has a nasty aftertaste for me. A peeve of mine is when package management has such a steep learning curve that most users resort to circumventing the package manager to install the things they need. I believe that 3MB more of rootfs size is well worth having dnsmasq and snmpd available, especially now that CoreOS’s rootfs does not have to be downloaded from PXE.

I would rather not have to statically compile dnsmasq and distribute it to my nodes by ignition to get it into CoreOS. I respectfully request that we please put dnsmasq back, to be available for environments that need it. I also request that we please consider including a light snmpd binary. Both of these things are such basic services that they work best when they are not in containers, much like any other basic daemons that CoreOS runs.

Re-inclusion of dnsmasq is currently under discussion in “Need dnsmasq for podman to create CNI networks” (Issue #519, coreos/fedora-coreos-tracker on GitHub).
I would recommend that you open another issue on the tracker for the inclusion of snmpd to make sure it is properly tracked and discussed.

Thanks for your kind direction!

Note this came up recently in the IRC channel.

illius  Does Fedora CoreOS allow enabling systemd-resolved or some other kind of local name caching?  I can't seem to find this on the website.
illius  I am facing an issue with Docker Swarm's internal resolver and Swarm service names.
illius  I am using HAProxy and its feature that queries Swarm service names inside the swarm network to discover service availability, but when the services aren't there yet (like in the case of QA and dev environments that might not run all of the services), Swarm passes those unqualified name queries to external nameservers and they are being hammered with unqualified queries that they can never resolve.
illius  Is there a setting I have missed?  It would help big time if I could just tell Swarm's resolver to not forward unqualified dns queries.
@dustymabe      illius: we're moving to systemd-resolved by default in Fedora 33
@dustymabe      you currently can't use it.
illius  Thanks dustymabe.  Do you or does anyone else here have any suggestions on how they addressed this issue?
@dustymabe      illius: how bad is the problem? can you deal with it for another few weeks (when our first f33 build in the `next` stream should be out)?
@dustymabe      have you been running FCOS and just now bumping up into this, or are you just starting to use FCOS?
illius  Oh wow, is f33 coming that soon?  That could work.  I'm not even sure I can get systemd-resolved to filter unqualified queries.  I imagine I'd have to rely on it caching NXDOMAIN answers.
@dustymabe      I can share a build with you now (if you let me know what platform you're running), but it's totally raw :)
@dustymabe      you would want to make sure to just use it for testing purposes
illius  Sorry for the delay -- I was interrupted by a phone call.
illius  I migrated to Fedora CoreOS from the old CoreOS earlier this year when it was announced.  I am running probably 11 swarm clusters based on Fedora CoreOS, including a production Swarm.  I recently updated to the new rootfs thing, though I had an issue with docker logging in the next stream that actually put stuff in the rootfs.
illius  I can probably just wait for F33 to come down the pipe.

@froy, would systemd-resolved help solve the problem, like illius thinks it will? It will be there and enabled in Fedora 33.

Thank you @dustymabe. Ha ha. Actually, illius is me. This could be considered a continuation of that same IRC conversation. :slight_smile:

Since your kind conversation with me on IRC, more Swarms have been created on my systems, so my efforts to find a solution were expedited.

As far as I know, systemd-resolved does not allow as much control as dnsmasq does. In particular, I am not aware of any way to configure systemd-resolved to reply with NXDOMAIN for all unqualified queries like I can with dnsmasq. While systemd-resolved might help, and while it does grant some knobs to turn (Swarm’s internal DNS -the culprit- gives no control whatsoever), I have personally had many issues with systemd-resolved over the years, while I have never had trouble with dnsmasq. I believe that dnsmasq is a more mature, stable, independent, clean, efficient, elegant, and configurable product.

Do you mean to imply that CoreOS will also start using systemd-networkd instead of NetworkManager? NetworkManager treats dnsmasq as a plugin that extends its capabilities. If FCOS keeps NetworkManager, I believe dnsmasq should be included as part of it. If systemd-networkd is slated to take over the Fedora world, then I guess the battle is lost. :frowning:

This I know for certain: NetworkManager calling dnsmasq as a plugin is a really slick way to fix my issue with Swarm’s unconfigurable anarchic internal DNS server.

Ha! It is you!

We’ll bring “Need dnsmasq for podman to create CNI networks” (Issue #519, coreos/fedora-coreos-tracker on GitHub) up again today at the community meeting.

Outcome of the meeting today: see “Need dnsmasq for podman to create CNI networks” (Issue #519, coreos/fedora-coreos-tracker on GitHub).
