Comment by jeffbee
1 day ago
I don't see how it would be possible to produce this table under Quad9's privacy policy. Nothing in their privacy policy says that they maintain logs that would enable them to count queries by label. Can anyone explain?
Their “Data and Privacy Policy” does say that they collect this information, specifically in section 2.2 (Data Collected): https://quad9.net/privacy/policy/
Which policy are you referring to that implies they don’t?
Also, I think you are assuming they store query logs and then aggregate this data later. It is much simpler to maintain an integer counter for monitoring as the queries come in and ingest that into a time series database (not sure if that’s what they actually do). Maybe it needs to be a bit fancier to handle the cardinality of the DNS-name dimension, but reconstructing this from logs would be much more expensive.
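To make the "counter as queries come in" idea concrete, here is a minimal sketch. It is not Quad9's implementation (nobody in this thread knows what they run); `LabelCounter`, `observe`, and `flush` are hypothetical names, and the time series database is stood in for by whatever consumes the returned snapshot.

```python
from collections import Counter


class LabelCounter:
    """Hypothetical in-memory counter keyed by query name (no client data)."""

    def __init__(self):
        self._counts = Counter()

    def observe(self, qname: str) -> None:
        # DNS names are case-insensitive; also drop the trailing root dot.
        self._counts[qname.rstrip(".").lower()] += 1

    def top(self, n: int):
        # The n most-queried names in the current window.
        return self._counts.most_common(n)

    def flush(self) -> dict:
        # Hand the current window to a metrics store (assumed) and reset.
        snapshot = dict(self._counts)
        self._counts.clear()
        return snapshot
```

Nothing here keeps a per-query record, so there is no log to aggregate later. The cardinality concern shows up as the size of `_counts`, which a real system would presumably have to bound somehow.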
The section you mentioned does not say anything about having counters for labels. It only mentions that they record "[t]he times of the first and most recent instances of queries for each query label".
Well, the counters aren't data collected, they are data derived from the data they do collect. The privacy policy covers collection.
EDIT: I see they went out of their way to say "this is the complete list of everything we count" and they did not include counters by label, so I see your point!
I don't see how that is compatible with 2.2. They don't say anything about counters per label. It describes counters per RR type and watermarks of the earliest and most recent timestamps per label, not a count per label.
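For illustration, this is roughly the state that reading of 2.2 would imply: a counter per RR type plus first-seen and last-seen timestamps per label. It is a sketch of the comment's interpretation only; `PolicyShapedStats` and its fields are hypothetical names, not anything taken from Quad9.

```python
from collections import Counter


class PolicyShapedStats:
    """Hypothetical state: per-RR-type counts and per-label time watermarks."""

    def __init__(self):
        self.by_rrtype = Counter()  # e.g. "A" -> number of queries
        self.seen = {}              # label -> (first_ts, last_ts)

    def observe(self, label: str, rrtype: str, ts: float) -> None:
        self.by_rrtype[rrtype] += 1
        first, last = self.seen.get(label, (ts, ts))
        # Only the earliest and latest timestamps are kept for each label.
        self.seen[label] = (min(first, ts), max(last, ts))
```

With only this state there is no way to say how many times a given label was queried, which is why a ranked table of domains seems to need something beyond it.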
If an organization is going to be this specific about what they count, it implies that this is everything they count, not that there may also be other junk unmentioned.
I took a look at their privacy policy and agree that it doesn't specifically list that it logs which domains are being queried. It does list a bunch of things it does log as counters, all of which seem reasonable, but they don't explicitly say "we count which domains are being queried".
That said, I think it's entirely reasonable for them to log domains alone if they're completely disconnected from any user activity, i.e. a simple "increment the counter for foo.com" is reasonable since that's unrelated to user privacy.
Unless, say, an adversary can link an obscure domain to a specific user or use case. Get that counter log and you can track a certain behavior (the device only pings this domain when the user is about to do something, or when they are on vacation and their house is empty, etc.).
One way around that is to set up an hourly cron job that queries the domains one visits most often. When workstations and cell phones then request them, they are served from cache. At least that is what I have been doing for a few decades, and it works fine. I also block all the DoH/DoT resolvers, which is easier to do than some might think. One can query the individual A records, or just the apex A/NS records to get the infrastructure cached, and then configure Unbound to prefetch records that are about to expire.
Just for fun, I have added some of these to my cron job.
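A rough sketch of what such a cache-warming job could look like, assuming the machine's system resolver points at the local caching resolver. The domain list is a placeholder and `warm_cache` is a hypothetical helper; the commenter's actual script is not shown in the thread.

```python
import socket

# Placeholder list; in practice this would be the domains one visits most.
DOMAINS = ["example.com", "example.org"]


def warm_cache(domains, resolve=socket.getaddrinfo):
    """Resolve each name once so the local resolver caches the answer."""
    results = {}
    for name in domains:
        try:
            resolve(name, None)  # goes through the system resolver
            results[name] = True
        except OSError:          # socket.gaierror is a subclass of OSError
            results[name] = False
    return results

# Intended to be called from an hourly cron entry: warm_cache(DOMAINS)
```

The `resolve` parameter exists only so the function can be exercised without network access. Prefetching of soon-to-expire records would be handled separately in the resolver itself, for example with Unbound's `prefetch: yes` option.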
The average burglar probably isn’t cross-referencing DNS statistics.