Due to caching implementations which change between OSes, programming languages, and routers, DNS is actually a really terrible discovery mechanism at scale.
For example, I'd never want to implement service discovery via DNS in Java, which ignores expiration times and caches queries according to its own settings. I've also seen some server configurations which result in every gethostbyname call triggering a fully recursive query against DNS.
> Due to caching implementations which change between OSes, programming languages, and routers, DNS is actually a really terrible discovery mechanism at scale.
Only if you rely on the client's resolver caching implementation, which we don't. Docker networking guarantees that a given hostname lookup by a given container in a given service discovery scope will always return the same IP. In that configuration, using DNS is a robust fit for production.
Believe it or not, Docker is built by people who have actual operational experience :)
> Docker networking guarantees that a given hostname lookup by a given container in a given service discovery scope will always return the same IP. In that configuration, using DNS is a robust fit for production.
I really wish there was more technical documentation on this "feature", which would make actual discussions about this much more productive, as opposed to what amounts to "trust me" with an appeal to authority.
As it stands now, there's not enough technical documentation about this feature available to make intelligent decisions about whether this really is a "robust fit for production".
> Believe it or not, Docker is built by people who have actual operational experience :)
I believe in track records, and Docker's hasn't been too great so far. From "not our problem" responses to issues like devicemapper version differences (on major OSes, no less), to the fact that I had to automate the cleanup of orphaned mounts after stopping containers, to non-backwards compatible releases with major core re-writes every other month, to containers which are quite literally unkillable without a host restart; Docker's track record is pretty poor.
I've used it in production, and the benefits barely outweighed the pain of attempting to manage it.
> a given hostname lookup by a given container in a given service discovery scope will always return the same IP
Did you mean this to cover the situation where N containers have registered the same name? I thought Docker DNS returned all such results in random order.
Given that, if one of those containers goes offline and Java has cached its address, you have the problem the parent was alluding to.
That's incorrect. You are describing DNS-based load-balancing, which would indeed rely on the container's resolver implementation. But Docker doesn't do that. Instead it always resolves the service name to the same IP, which is load-balanced by IPVS. That way even the world's crappiest DNS caching implementation will still be handled properly.
So when I said that Docker doesn't rely on the container's DNS resolver, I really meant it. We have seen in past lives the consequences of "DNS abuse" and have been careful to avoid it.
Docker 1.12's built-in load balancing supports both VIP-based LB using IPVS and DNS-RR, and the mode is configurable per service. VIP-based LB is the default, though. All of this will be fully documented shortly.
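For anyone who wants to check which mode a service is actually using from inside a container, here is a rough Java sketch. It only counts the distinct addresses a name resolves to: one address is consistent with a VIP, several suggest DNS-RR. The class name, the `resolveAll` helper, and the default service name `web` are made up for illustration, and a dual-stack setup could return more than one address even behind a VIP, so treat the result as a hint rather than proof.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Docker or any library.
class EndpointModeProbe {

    // Returns the distinct IP addresses the local resolver reports for a name.
    static Set<String> resolveAll(String name) throws UnknownHostException {
        Set<String> ips = new TreeSet<>();
        for (InetAddress address : InetAddress.getAllByName(name)) {
            ips.add(address.getHostAddress());
        }
        return ips;
    }

    public static void main(String[] args) throws UnknownHostException {
        // "web" is an assumed service name; pass the real one as the first argument.
        String service = args.length > 0 ? args[0] : "web";
        Set<String> ips = resolveAll(service);
        System.out.println(service + " resolves to " + ips);
        System.out.println(ips.size() == 1
                ? "single address: consistent with a virtual IP"
                : "multiple addresses: consistent with DNS round-robin");
    }
}
```

Note that the JVM's own lookup cache sits in front of this call, so repeated runs inside one long-lived process may keep showing an older answer.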
Not all use of DNS is the same. If you're attempting service discovery over DNS and need a timely response to changes, any form of caching behavior that does not conform to the specs is going to cause you headaches in production, right when you most need it to work.
For example, I'd never want to implement service discovery via DNS in Java, which ignores expiration times and caches queries according to its own settings. I've also seen some server configurations which result in every gethostbyname call triggering a fully recursive query against DNS.
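To make the Java point concrete: the JVM's lookup cache is governed by the `networkaddress.cache.ttl` and `networkaddress.cache.negative.ttl` security properties, which are plain second counts and are not derived from the record's TTL. A minimal sketch of capping them follows; the class name, the `capDnsCache` helper, and the chosen values are only illustrative, and as far as I know the properties need to be set before the process does its first lookup to take effect.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

// Hypothetical helper class for illustration only.
class DnsCacheTtl {

    // Caps how long the JVM caches successful and failed lookups, in seconds.
    // Assumption: this runs before any name has been resolved in this JVM.
    static void capDnsCache(int positiveTtlSeconds, int negativeTtlSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) throws UnknownHostException {
        // 5 seconds for hits and no caching of failures: arbitrary example values.
        capDnsCache(5, 0);

        String host = args.length > 0 ? args[0] : "localhost";
        for (InetAddress address : InetAddress.getAllByName(host)) {
            System.out.println(host + " -> " + address.getHostAddress());
        }
    }
}
```

This only bounds the JVM's own cache. Whatever the OS resolver or an upstream router caches is still outside the application's control, which is the broader complaint here.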