Posted by hyperknot 10/26/2024

Understanding Round Robin DNS (blog.hyperknot.com)
394 points | 123 comments
jkrauska 10/26/2024|
Check out what happens when you use IPv6 addresses. RFC 6724 is awkward about ordering with IPv6.

How your OS sorts DNS responses also comes into play. It depends on how your browser makes DNS requests.

cybice 10/26/2024||
Results with a Cloudflare Worker as a reverse proxy can be much better.
easylion 10/26/2024|
But won't it add an additional hop, and hence additional latency, to every single request?
rodcodes 10/26/2024||
Nah, because Cloudflare Workers run at the closest edge location and are real fast.

The real solution with Cloudflare is to use their Load Balancing (https://developers.cloudflare.com/load-balancing) which is a paid feature.

bar000n 10/26/2024||
Hey! So I have a CDN for video made of 4 bare-metal servers. 2 are newer and more powerful, so I give them 2 IP addresses each out of the 6 addresses returned by DNS for the A record. But from a very diverse pool of devices (proprietary set-top boxes, smart TVs, iOS and Android mobile clients, web browsers, etc.) I still get ~40% of traffic on the older servers instead of the expected 33%, given that they hold 2 of the 6 IP addresses in the A records. Why?
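
The 33% figure holds only if every client picks one of the six addresses uniformly at random. A minimal sketch of that arithmetic, with hypothetical server names (`old-1`, `new-1`, and so on) standing in for the four machines:

```python
from fractions import Fraction

# Hypothetical layout matching the comment: two older servers with one
# address each, two newer servers with two addresses each (six A records).
ADDRESSES_PER_SERVER = {"old-1": 1, "old-2": 1, "new-1": 2, "new-2": 2}


def expected_share(addresses_per_server):
    """Share of traffic per server if every client picks one address uniformly."""
    total = sum(addresses_per_server.values())
    return {name: Fraction(count, total) for name, count in addresses_per_server.items()}


shares = expected_share(ADDRESSES_PER_SERVER)
old_share = shares["old-1"] + shares["old-2"]
print(f"expected share on older servers: {float(old_share):.1%}")
```

A measured ~40% would suggest that some clients are not choosing uniformly, for example because they sort the returned addresses or stick to the first one. The comment does not include enough data to confirm which.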
urbandw311er 10/26/2024||
What a great article! It’s often easy to forget just how flexible and self-correcting the “official” network protocols are. Thanks to the author for putting in the legwork.
backtoyoujim 10/27/2024||
"I wrote a decoder in Perl. Everything must be in Perl."

preach on.

rebelde 10/26/2024||
I have used round robin for years.

Wish I could add instructions like:

- random choice # round robin, like now

- first response # usually connects to closest server

- weights (1.0.0.1:40%; 2.0.0.2:60%)

- failover: (quick | never)

- etc: naming countries, continents
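
Plain A records carry no weight field, so the "weights" wish would have to be implemented by whatever picks the answer (a custom DNS backend, or the client itself). A rough sketch of that selection step, using the 40/60 split from the list above; `WEIGHTS` and `pick_address` are hypothetical names:

```python
import random
from collections import Counter

# Hypothetical weights taken from the wish list above (address: weight).
WEIGHTS = {"1.0.0.1": 40, "2.0.0.2": 60}


def pick_address(weights, rng=random):
    """Pick one address with probability proportional to its weight."""
    addresses = list(weights)
    return rng.choices(addresses, weights=[weights[a] for a in addresses], k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(pick_address(WEIGHTS, rng) for _ in range(10_000))
for address, count in sorted(counts.items()):
    print(address, f"{count / 10_000:.1%}")
```

Over many picks the observed split should land close to 40/60, but any single client still only sees one address per lookup.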

tiahura 10/26/2024||
Back in the day DNS consumed a lot more oxygen - BIND, double-reverse MX records, Windows DNS, etc. What happened? Did cloud make all of that go away?
stackskipton 10/26/2024||
As an SRE, I get a chuckle out of this article and some of the responses. Devs mess this up constantly.

DNS has one job: hostname -> IP. Nothing further. You can mess with it on the server side, like checking whether the HTTP server is up before handing out its IP, but once the IP is given, the client takes over and DNS can do nothing further, so behavior will be wildly inconsistent IME.

Assuming DNS RR is the standard setup where a hostname returns multiple IPs, it's only useful for load balancing across datacenters with similar latency. If you want fancy stuff like geographic load balancing or health checks, you need a fancy DNS server, but at the end of the day you should only return a single IP so the client targets the endpoint you want it to connect to.
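
To illustrate "the client takes over": once a hostname has resolved to several IPs, what happens next is entirely client code. The sketch below shows one possible client behavior (try each address in order, move on when a connect fails). It is not how any particular browser or OS does it, and `connect_first_available` and `fake_connect` are hypothetical names. The 192.0.2.x addresses are from the range reserved for documentation.

```python
import socket


def connect_first_available(addresses, port, timeout=2.0, connect=socket.create_connection):
    """Try each resolved address in order; return (address, connection) for the first success."""
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout)
        except OSError as error:
            last_error = error  # remember the failure and move to the next address
    raise ConnectionError(f"all {len(addresses)} addresses failed") from last_error


def fake_connect(target, timeout):
    # Stand-in for a real TCP connect: pretend the first server is down.
    host, _port = target
    if host == "192.0.2.1":
        raise ConnectionRefusedError(host)
    return f"connection to {host}"


address, conn = connect_first_available(["192.0.2.1", "192.0.2.2"], 443, connect=fake_connect)
print(address, conn)
```

A client that instead gives up after the first failed address would see an outage here, which is the kind of inconsistency the comment describes.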

plagiat0r 10/27/2024||
I've implemented a custom PowerDNS backend that combines health checks, weighted probabilistic round robin, and geo DNS, and it works excellently for building an auto-healing CDN.

It was specifically built for multi-DC, multi-cloud, or hybrid operations that are on separate continents, with geo DNS, health checks, and failover at the DNS level at the same time. When all USA servers in the WRR pool are down, or the DC is down, it automatically starts answering with the next closest WRR set (Canada). WRR pools are dynamic and auto-healing, constantly doing HTTP health checks.

It is also dirt cheap, like 100x cheaper than acquiring provider-independent IP address space, running and operating anycast, and having a 24/7 NOC team on that anycast constantly adjusting BGP communities etc. And it is not like anycast and BGP solve anything when one server is down but the others work. You can't stop announcing a whole prefix if you run 200 machines but only one or two are down.

TTL I'm using is 30 seconds.

I never shared this backend with the world, you can't test it or purchase it. But maybe some day I'll launch a Route 53 competitor ;)
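
Since that backend is not public, the following is only a guess at the general shape of the idea: walk the pools from closest to farthest, skip unhealthy servers, and do a weighted pick inside the first pool that still has a healthy one. It does not use the PowerDNS backend API, and the pool data, addresses, weights, and the names `POOLS_FOR_US` and `answer` are all made up for illustration.

```python
import random

# Hypothetical pools, ordered from closest to farthest for a US client.
# Each entry is (address, weight, healthy); in a real backend the health flag
# would come from periodic HTTP checks.
POOLS_FOR_US = [
    ("usa", [("192.0.2.10", 3, False), ("192.0.2.11", 1, False)]),
    ("canada", [("198.51.100.10", 2, True), ("198.51.100.11", 1, True)]),
]
TTL_SECONDS = 30  # TTL mentioned in the comment


def answer(pools, rng=random):
    """Return (region, address, ttl) from the closest pool that has a healthy server."""
    for region, servers in pools:
        healthy = [(addr, weight) for addr, weight, ok in servers if ok]
        if healthy:
            addresses, weights = zip(*healthy)
            return region, rng.choices(addresses, weights=weights, k=1)[0], TTL_SECONDS
    raise LookupError("no healthy server in any pool")


print(answer(POOLS_FOR_US, random.Random(0)))
```

With both USA servers marked down, every answer comes from the Canada pool. The short TTL limits how long clients keep using a stale answer after the health state changes.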

lysace 10/26/2024||
I've never ever come up with a scenario where RR DNS is useful in the goal of achieving high availability. I'm similarly mystified.

What can be useful: dynamically adjusting DNS responses depending on what DC is up. But at this point shouldn't you be doing something via BGP instead? (This is where my knowledge breaks down.)

stackskipton 10/26/2024||
Yea, Anycast IP like what Cloudflare does is the best.

If you want cheaper load balancing and are OK with some downtime while DNS reconfigures, a DNS system that returns an IP based on which datacenter is up works. Examples of this are Route 53 and Azure Traffic Manager, and I assume Google has a solution, I just don't know what it is.

lysace 10/26/2024||
Worked on implementing a distributed-consensus driven DNS thing like 15 years ago. We had 3 DCs around the world for a very compute-intense but not very stateful service. It actually just worked without any meaningful testing on the first single DC outage. In retrospect I'm amazed.
specto 10/26/2024|
Chrome and Firefox use the OS DNS resolver by default, which in most OSes has caching as well.