Comment by KronisLV

4 months ago

> Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain

A centralized list like this, covering not just registry suffixes as a whole (e.g. co.uk) but also specific services (e.g. s3-object-lambda.eu-west-1.amazonaws.com), is kind of crazy in two ways: the list will bloat a lot over the years, and it's a security risk for any platform that needs this functionality but would prefer not to leak any details publicly.

We already have the concept of a .well-known directory that you can use when talking to a specific site. Similarly, we know how subdomains nest, like c.b.a.x, and it's more or less certain that you can't create a subdomain b without the involvement of a, so it should be possible to walk the chain.

Example:

  c --> https://b.a.x/.well-known/public-suffix
  b --> https://a.x/.well-known/public-suffix
  a --> https://x/.well-known/public-suffix

Maybe ship the registry-level domains with the browsers and such, and leave generic sites like AWS or whatever to describe things themselves. Hell, maybe it could have been a TXT record in DNS as well.
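The chain walk sketched above could look something like this. To be clear, nothing here is a real protocol: the `/.well-known/public-suffix` endpoint, the walk order, and the stop condition are all assumptions for illustration.

```python
# Hypothetical sketch: for a hostname like c.b.a.x, produce the chain of
# parent URLs whose (invented) /.well-known/public-suffix endpoint a client
# would consult, from the nearest parent up to the root label.

def candidate_endpoints(hostname: str) -> list[str]:
    """For c.b.a.x, return the parents b.a.x, a.x, x as endpoint URLs."""
    labels = hostname.split(".")
    return [
        "https://" + ".".join(labels[i:]) + "/.well-known/public-suffix"
        for i in range(1, len(labels))
    ]
```

A client would then fetch these in order and stop at the first parent that declares itself a public suffix (or that fails to answer).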

> any platform that needs this functionality but would prefer not to leak any details publicly.

I’m not sure how you’d have this: it’s for the public-facing side of user-hosted content, so surely that must be public?

> We already have the concept of a .well-known directory that you can use, when talking to a specific site.

But the point is to help identify dangerous sites; by definition you can’t just let sites mark themselves as trustworthy and rotate around subdomains. If you have an approach that doesn’t have to trust the site, you also don’t need any definition at the top level, because you could just infer it.

  • It's actually exactly the same concept that came to mind for me. `SomeUser.geocities.com` is "tainted", along with `*.geocities.com`, so `geocities.com/.well-known/i-am-tainted` is actually reasonable.

    Although technically it might be better as `.well-known/taint-regex` (now we have three problems), like `TAINT "*.sites.myhost.com" ; "myhost.com/uploads/*" ; ...`

    • I think we disagree on the problem.

      The thing you want to avoid is this:

      a.scamsite.com gets blocked so they just put their phishing pages on b.scamsite.com

      The PSL, or your solution, isn’t a “don’t trust subdomains” notification; it’s “if one subdomain is bad, you should still trust the others”, and the problem there is you can’t trust them.

      You could combine the two, but you still need the suffix list or similar curation.

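The glob-style taint list floated above could be matched like this. This is purely hypothetical: the `TAINT` directive, the file name, and the patterns are all invented in the thread, and glob matching (via `fnmatch`) stands in for the "regex" half-jokingly proposed.

```python
# Hypothetical sketch of matching origins against a published taint list.
from fnmatch import fnmatchcase

# Patterns as they might appear after a TAINT directive (invented syntax).
TAINT_PATTERNS = ["*.sites.myhost.com", "myhost.com/uploads/*"]

def is_tainted(origin: str) -> bool:
    """True if the origin (a host, or host/path) matches any taint pattern."""
    return any(fnmatchcase(origin.lower(), p) for p in TAINT_PATTERNS)
```

Note this illustrates the mechanism only; as the reply points out, a self-published list like this still can't be trusted for safety decisions without outside curation.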

It does smell very much like a feature that is currently implemented as a text file but will eventually need to grow into its own protocol, like, indeed, the hosts file becoming DNS.

One key difference between this list and standard DNS (at least as I understand it; maybe they added an extension to DNS I haven't seen) is the list requires independent attestation. You can't trust `foo.com` to just list its subdomains; that would be a trivial attack vector for a malware distributor to say "Oh hey, yeah, trustme.com is a public suffix; you shouldn't treat its subdomains as the same thing" and then spin up malware1.trustme.com, malware2.trustme.com, etc. Domain owners can't be the sole arbiter of whether their domain counts as a "public suffix" from the point of view of user safety.

It looks like Mozilla does use DNS to verify requests to join the list, at least.

  $ dig +short txt _psl.website.one @1.1.1.1
  "https://github.com/publicsuffix/list/pull/2625"

Doing this DNS lookup in the browser in real time would be a performance challenge, though. The PSL affects the scope of cookies (github.io is on the PSL, so a.github.io can't set a cookie that b.github.io can read), so the relevant PSL data needs to be known before the first HTTP response comes back.