Comment by KronisLV

4 months ago

> Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain

A centralized list like this, covering not just registry suffixes as a whole (e.g. co.uk) but also specific services (e.g. s3-object-lambda.eu-west-1.amazonaws.com), is kind of crazy in two ways: the list will bloat a lot over the years, and it's a security risk for any platform that needs this functionality but would prefer not to leak any details publicly.

We already have the concept of a .well-known directory that you can use when talking to a specific site. Similarly, we know how subdomains nest, like c.b.a.x, and it's more or less certain that you can't create a subdomain b without the involvement of a, so it should be possible to walk the chain.

Example:

  c --> https://b.a.x/.well-known/public-suffix
  b --> https://a.x/.well-known/public-suffix
  a --> https://x/.well-known/public-suffix

Maybe ship the registry-level domains with the browsers and such, and leave generic sites like AWS or whatever to describe things themselves. Hell, maybe it could have been a TXT record in DNS as well.
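The chain walk sketched above could look something like this. To be clear, nothing here is a real protocol: the `/.well-known/public-suffix` endpoint, the walk order, and the stop condition are all assumptions for illustration.

```python
# Hypothetical sketch: for a hostname like c.b.a.x, produce the chain of
# parent URLs whose (invented) /.well-known/public-suffix endpoint a client
# would consult, from the nearest parent up to the root label.

def candidate_endpoints(hostname: str) -> list[str]:
    """For c.b.a.x, return the parents b.a.x, a.x, x as endpoint URLs."""
    labels = hostname.split(".")
    return [
        "https://" + ".".join(labels[i:]) + "/.well-known/public-suffix"
        for i in range(1, len(labels))
    ]
```

A client would then fetch these in order and stop at the first parent that declares itself a public suffix (or that fails to answer).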

> any platform that needs this functionality but would prefer not to leak any details publicly.

I’m not sure how you’d have this: it’s for the public-facing side of user-hosted content, so surely that must be public?

> We already have the concept of a .well-known directory that you can use, when talking to a specific site.

But the point is to help identify dangerous sites; by definition you can’t just let sites mark themselves as trustworthy and rotate around subdomains. If you have an approach that doesn’t have to trust the site, you also don’t need any definition at the top level, because you could just infer it.

  • It's actually exactly the same concept that came to mind for me. `SomeUser.geocities.com` is "tainted", along with `*.geocities.com`, so `geocities.com/.well-known/i-am-tainted` is actually reasonable.

    Although technically it might be better as `.well-known/taint-regex` (now we have three problems), like `TAINT "*.sites.myhost.com" ; "myhost.com/uploads/*" ; ...`

    • I think we disagree on the problem.

      The thing you want to avoid is this:

      a.scamsite.com gets blocked so they just put their phishing pages on b.scamsite.com

      The PSL, or your solution, isn’t a “don’t trust subdomains” notification; it’s “if one subdomain is bad, you should still trust the others”, and the problem there is you can’t trust them.

      You could combine the two, but you still need the suffix list or similar curation.

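The glob-style taint list floated above could be matched like this. This is purely hypothetical: the `TAINT` directive, the file name, and the patterns are all invented in the thread, and glob matching (via `fnmatch`) stands in for the "regex" half-jokingly proposed.

```python
# Hypothetical sketch of matching origins against a published taint list.
from fnmatch import fnmatchcase

# Patterns as they might appear after a TAINT directive (invented syntax).
TAINT_PATTERNS = ["*.sites.myhost.com", "myhost.com/uploads/*"]

def is_tainted(origin: str) -> bool:
    """True if the origin (a host, or host/path) matches any taint pattern."""
    return any(fnmatchcase(origin.lower(), p) for p in TAINT_PATTERNS)
```

Note this illustrates the mechanism only; as the reply points out, a self-published list like this still can't be trusted for safety decisions without outside curation.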

It does smell very much like a feature that is currently implemented as a text file but will eventually need to grow into its own protocol, like, indeed, the hosts file becoming DNS.

One key difference between this list and standard DNS (at least as I understand it; maybe they added an extension to DNS I haven't seen) is the list requires independent attestation. You can't trust `foo.com` to just list its subdomains; that would be a trivial attack vector for a malware distributor to say "Oh hey, yeah, trustme.com is a public suffix; you shouldn't treat its subdomains as the same thing" and then spin up malware1.trustme.com, malware2.trustme.com, etc. Domain owners can't be the sole arbiter of whether their domain counts as a "public suffix" from the point of view of user safety.

It looks like Mozilla does use DNS to verify requests to join the list, at least.

  $ dig +short txt _psl.website.one @1.1.1.1
  "https://github.com/publicsuffix/list/pull/2625"

Doing this DNS lookup in the browser in real time would be a performance challenge, though. The PSL affects the scope of cookies (github.io is on the PSL, so a.github.io can't set a cookie that b.github.io can read), so the relevant PSL data needs to be known before the first HTTP response comes back.