Top
Best
New

Posted by linolevan 3 hours ago

What came first: the CNAME or the A record?(blog.cloudflare.com)
130 points | 48 comments
steve1977 1 hour ago|
I don't find the wording in the RFC to be that ambiguous actually.

> The answer to the query, possibly preface by one or more CNAME RRs that specify aliases encountered on the way to an answer.

The "possibly preface" (sic!) to me is obviously to be understood as "if there are any CNAME RRs, the answer to the query is to be prefaced by those CNAME RRs" and not "you can preface the query with the CNAME RRs or you can place them wherever you want".

mrmattyboy 9 minutes ago||
I agree this doens't seem too ambiguous - it's "you may do this.." and they said "or we may do the reverse". If I say you're could prefix something.. the alternative isn't that you can suffix it.

But also.. the programmers working on the software running one of the most important (end-user) DNS servers in the world:

1. Changes logic in how CNAME responses are formed

2. I assume some tests at least broke that meant they needed to be "fixed up" (y'know - "when a CNAME is queried, I expect this response")

3. No one saw these changes in test behavoir and thought "I wonder if this order is important". Or "We should research more into this", Or "Are other DNS servers changing order", Or "This should be flagged for a very gradual release".

4. Ends up in test environment for, what, a month.. nothing using getaddrinfo from glibc is being used to test this environment or anyone noticed that it was broken

Cloudflare seem to be getting into thr swing of breaking things and then being transparent. But this really reads as a fun "did you know", not a "we broke things again - please still use us".

There's no real RCA except to blame an RFC - but honestly, for a large-scale operation like there's this seems very big to slip through the cracks.

I would make a joke about South Park's oil "I'm sorry".. but they don't even seem to be

devman0 4 minutes ago|||
Even if 'possibly preface' is interpreted to mean CNAME RRSets should appear first there is still a broken reliance by some resolvers on the order of CNAME RRsets if there is more than one CNAME in the chain. This expectation of ordering is not promised by the relevant RFCs.
inopinatus 47 minutes ago|||
The article makes it very clear that the ambiguity arises in another phrase: “difference in ordering of the RRs in the answer section is not significant”, which is applied to an example; the problem with examples being that they are illustrative, viz. generalisable, and thus may permit order ambiguity everywhere, and in any case, whether they should or shouldn’t becomes a matter of pragmatic context.

Which goes to show, one person’s “obvious understanding” is another’s “did they even read the entire document”.

All of which also serves to highlight the value of normative language, but that came later.

a7b3fa 52 minutes ago|||
I agree with you, and I also think that their interpretation of example 6.2.1 in the RFC is somewhat nonsensical. It states that “The difference in ordering of the RRs in the answer section is not significant.” But from the RFC, very clearly this comment is relevant only to that particular example; it is comparing two responses and saying that in this case, the different ordering has no semantic effect.

And perhaps this is somewhat pedantic, but they also write that “RFC 1034 section 3.6 defines Resource Record Sets (RRsets) as collections of records with the same name, type, and class.” But looking at the RFC, it never defines such a term; it does say that within a “set” of RRs “associated with a particular name” the order doesn’t matter. But even if the RFC had said “associated with a particular combination of name, type, and class”, I don’t see how that could have introduced ambiguity. It specifies an exception to a general rule, so obviously if the exception doesn’t apply, then the general rule must be followed.

Anyway, Cloudflare probably know their DNS better than I do, but I did not find the article especially persuasive; I think the ambiguity is actually just a misreading, and that the RFC does require a particular ordering of CNAME records.

(ETA:) Although admittedly, while the RFC does say that CNAMEs must come before As in the answer, I don’t necessarily see any clear rule about how CNAME chains must be ordered; the RFC just says “Domain names in RRs which point at another name should always point at the primary name and not the alias ... Of course, by the robustness principle, domain software should not fail when presented with CNAME chains or loops; CNAME chains should be followed”. So actually I guess I do agree that there is some ambiguity about the responses containing CNAME chains.

taeric 41 minutes ago|||
Isn't this literally noted in the article? The article even points out that the RFC is from before normative words were standardized for hard requirements.
paulddraper 1 hour ago||
100%

I just commented the same.

It's pretty clear that the "possibly" refers to the presence of the CNAME RRs, not the ordering.

patrickmay 2 hours ago||
A great example of Hyrum's Law:

"With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody."

combined with failure to follow Postel's Law:

"Be conservative in what you send, be liberal in what you accept."

mmastrac 1 hour ago||
Postel's law is considered more and more harmful as the industry evolved.
CodesInChaos 1 hour ago|||
That depends on how Postel's law is interpreted.

What's reasonable is: "Set reserved fields to 0 when writing and ignore them when reading." (I heard that was the original example). Or "Ignore unknown JSON keys" as a modern equivalent.

What's harmful is: Accept an ill defined superset of the valid syntax and interpret it in undocumented ways.

n2d4 1 hour ago||||
Very much so. A better law would be conservative in both sending and accepting, as it turns out that if you are liberal in what you accept, senders will choose to disobey Postel's law and be liberal in what they send, too.
esafak 1 hour ago|||
I think it is okay to accept liberally as long as you combine it with warnings for a while to give offenders a chance to fix it.
hdjrudni 1 hour ago|||
"Warnings" are like the most difficult thing to 'send' though. If an app or service doesn't outright fail, warnings can be ignored. Even if not ignored... how do you properly inform? A compiler can spit out warnings to your terminal, sure. Test-runners can log warnings. An RPC service? There's no standard I'm aware of. And DNS! Probably even worse. "Yeah, your RRs are out of order but I sorted them for you." where would you put that?
diarrhea 10 minutes ago|||
Randomly fail or (increasingly) delay a random subset of all requests.
esafak 1 hour ago|||
> how do you properly inform?

Through the appropriate channels; in-band and out-of-band.

psnehanshu 1 hour ago||||
Warnings are ignored. It's much better to fail fast.
dotancohen 1 hour ago|||
The Python 3 community was famously divided on that matter, wrt Python 3. Now that it is over, most people on the "accept liberally" side of the fence have jumped sides.
ajross 33 minutes ago||
That's true, but sort of misses the spirit of Hyrum's law (which is that the world is filled with obscure edge cases).

In this case the broken resolver was the one in the GNU C Library, hardly an obscure situation!

The news here is sort of buried in the story. Basically Cloudflare just didn't test this. Literally every datacenter in the world was going to fail on this change, probably including their own.

NelsonMinar 1 hour ago||
It's remarkable that the ordinary DNS lookup function in glibc doesn't work if the records aren't in the right order. It's amazing to me we went 20+ years without that causing more problems. My guess is most people publishing DNS records just sort of knew that the order mattered in practice, maybe figuring it out in early testing.
pixl97 1 hour ago||
I think it's more of a server side ordering, in which there were not that many DNS servers out there, and the ones that didn't keep it in order quickly changed the behavior because of interop.

CNAMES are a huge pain in the ass (as noted by DJB https://cr.yp.to/djbdns/notes.html)

silverwind 1 hour ago||
It's more likely because the internet runs on a very small number of authorative server implementations which all implement this ordering quirk.
runningmike 14 minutes ago||
The end of this blog is …. “ To learn more about our mission to help build a better Internet,”

Reminds me of https://news.ycombinator.com/item?id=37962674 or see https://tech.tiq.cc/2016/01/why-you-shouldnt-use-cloudflare/

sebastianmestre 1 hour ago||
I kind of wish they start sending records in randomized order to take out all the broken implementations that depend on such a fragile property
seiferteric 55 minutes ago||
Now that I have seemingly taken on managing DNS at my current company I have seen several inadequacies of DNS that I was not aware of before. Main one being that if an upstream DNS server returns SERVFAIL, there is no distinction really between if the server you are querying is failed, or the actual authoritative server upstream is broken (I am aware of EDEs but doesn't really solve this). So clients querying a broken domain will retry each of their configured DNS servers, and our caching layer (Unbound) will also retry each of their upstreams etc... Results in a bunch of pointless upstream queries like an amplification attack. Also have issue with the search path doing stupid queries with NXDOMAIN like badname.company.com, badname.company.othername.com... etc..
mdavid626 28 minutes ago||
I would expect, that dns servers like 1.1.1.1 at this scale have integration tests running real resolvers, like the one in glibc. How come this issue was discovered only in production?
forinti 1 hour ago||
> While in our interpretation the RFCs do not require CNAMEs to appear in any particular order, it’s clear that at least some widely-deployed DNS clients rely on it. As some systems using these clients might be updated infrequently, or never updated at all, we believe it’s best to require CNAME records to appear in-order before any other records.

That's the only reasonable conclusion, really.

hdjrudni 1 hour ago|
And I'm glad they came to it. Even if everyone else is wrong (I'm not saying they are) sometimes you just have to play along.
danepowell 19 minutes ago||
Doesn't this optimize memory on the DNS server at the expense of additional memory usage across millions of clients that now need to parse an unordered response?
Dylan16807 6 minutes ago|
The memory involved is a kilobyte. The optimization isn't important anywhere. The fragility is what's important.

Also no, the client doesn't need more memory to parse the out-of-order response, it can take multiple passes through the kilobyte.

tuetuopay 1 hour ago|
Many rightfully interpret the RFC as "CNAME have to be before A", but the issue persists inbetween CNAMEs in the chain as noted in the article. If a record in the middle of the chain expires, glibc would still fail if the "middle" record was to be inserted between CNAMEs and A records.

It’s always DNS.

More comments...