Comment by kayson

21 days ago

> However, we did not have any tests asserting the behavior remains consistent due to the ambiguous language in the RFC.

Maybe I'm being overly cynical, but I have a hard time believing that they deliberately omitted a test specifically because they reviewed the RFC and found the ambiguous language. I would've expected to see some dialog with the IETF beforehand if that were the case. Or some review of the behavior of common DNS clients.

It seems like an oversight, and that's totally fine.

I took it as being "we wrote the tests to the standard" and then built the code, and whoever was writing the tests didn't read that line as a testable aspect.

My reading of that statement is their test, assuming they had one, looked something like this:

    rrs = resolver.resolve('www.example.test')
    assert Record("cname1.example.test", type="CNAME") in rrs
    assert Record("192.168.0.1", type="A") in rrs

Which wouldn't have caught the ordering problem.
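
To make that concrete, here is a minimal, self-contained sketch. `Record` and `membership_checks_pass` are stand-ins I made up for illustration; I have no idea what their real test code looks like. It just shows that `in` checks pass for either ordering of the same two records:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical stand-in for whatever record type the real tests use.
    value: str
    type: str


def membership_checks_pass(rrs):
    # Same shape as the assertions above: presence only, no ordering.
    return (
        Record("cname1.example.test", type="CNAME") in rrs
        and Record("192.168.0.1", type="A") in rrs
    )


cname_first = [
    Record("cname1.example.test", type="CNAME"),
    Record("192.168.0.1", type="A"),
]
a_first = list(reversed(cname_first))

# Both orderings satisfy the membership checks...
assert membership_checks_pass(cname_first)
assert membership_checks_pass(a_first)

# ...even though the two answers are not the same sequence.
assert cname_first != a_first
```

Only a comparison against the whole list (or an explicit position check) would tell the two answers apart.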

  • It's implied that they intentionally tested it that way, without any assertions on the order. Not by oversight or incompetence, but because they didn't want to bake the requirement in due to uncertainty.

    • It would be silly to stick that tightly to a 40-year-old standard. They can easily observe the behavior of every other public DNS resolver (they are Cloudflare, so gathering data on such a scale should be easy) and see how they return results.

      Honestly, though, I’d be surprised if they actually even considered it. Everything about the article says to me that the engineer(s) who caused this problem are desperately trying to deflect blame for not having a comprehensive test suite. Sorry, but you don’t go tweaking the order of results for such a long-standing, high-volume, and crucial protocol just because the 40-year-old spec isn’t clear about it.

    • That approach only makes sense if tests are immutable though. If you are unsure if the order matters you should still test for it so you get a reminder to re-check your assumptions when the order changes.
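
A rough sketch of what "test for it anyway" could look like, assuming the ordering in question is "CNAME records come before the A records they lead to". That is my assumption, not something the article spells out. `cname_precedes_a` is a made-up helper and records are plain `(value, type)` tuples to keep it self-contained:

```python
def cname_precedes_a(rrs):
    """Return True if every CNAME appears before the first A record.

    `rrs` is a list of (value, type) tuples, a hypothetical stand-in for a
    resolver's answer section.
    """
    types = [rtype for _, rtype in rrs]
    last_cname = max((i for i, t in enumerate(types) if t == "CNAME"), default=-1)
    first_a = min((i for i, t in enumerate(types) if t == "A"), default=len(types))
    return last_cname < first_a


# Pins today's observed behavior. The spec may not strictly require it, so a
# failure here means "re-check the assumption", not necessarily "this is a bug".
current_answer = [("cname1.example.test", "CNAME"), ("192.168.0.1", "A")]
assert cname_precedes_a(current_answer)

# A reordered answer trips the check and forces someone to look at it.
reordered_answer = list(reversed(current_answer))
assert not cname_precedes_a(reordered_answer)
```

If the order later turns out not to matter, you delete or relax the test on purpose, which is a very different thing from never noticing the change.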

It's pretty concerning that such a large organisation doesn't do any integration tests with their DNS infrastructure.