I love (hate) the idea of a site where you need to send a personal email to the webmaster to be whitelisted.
We just need a browser plugin to auto-email webmasters to request access, and wait for the follow-up "access granted" email. It could be powered by AI.
Then someone will require a notarized statement of intent before you can read the recipe blog.
I have not heard the word "webmaster" in such a long time
Deliberately chosen for the nostalgia value :)
I have thought about writing such a thing...
1. A proxy that looks at HTTP headers and TLS cipher choices
2. An allowlist recording which headers each browser sends and which ciphers it selects
3. Dynamic reloading of the allowlist into the proxy at some given interval
New browser versions or OS updates would require updating the allowlist, but I'm not sure that's too inconvenient, and it could be done via GitHub so people could submit new combinations.
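Roughly what I'm picturing, as a Go sketch. The allowlist file format, field names, paths, and reload interval are all invented here, and the TLS side is only hinted at in comments, since comparing the client's actual ClientHello takes more plumbing than a plain HTTP middleware can see:

```go
// Sketch of steps 1-3: a reverse proxy that checks requests against an
// allowlist of known-browser header patterns and reloads that list on an
// interval. A fuller version would also compare the TLS ClientHello
// (cipher suites, extension order) captured at termination.
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"os"
	"strings"
	"sync"
	"time"
)

// Entry describes one allowed browser: a User-Agent substring plus the
// headers that browser is known to send. The format is made up for this sketch.
type Entry struct {
	UASubstring     string   `json:"ua_substring"`
	RequiredHeaders []string `json:"required_headers"`
}

var (
	mu        sync.RWMutex
	allowlist []Entry
)

// loadAllowlist replaces the in-memory allowlist from disk (step 3).
func loadAllowlist(path string) {
	data, err := os.ReadFile(path)
	if err != nil {
		log.Printf("allowlist load failed: %v", err)
		return
	}
	var entries []Entry
	if err := json.Unmarshal(data, &entries); err != nil {
		log.Printf("allowlist parse failed: %v", err)
		return
	}
	mu.Lock()
	allowlist = entries
	mu.Unlock()
}

// allowed reports whether the request matches a known browser profile (step 2).
func allowed(r *http.Request) bool {
	mu.RLock()
	defer mu.RUnlock()
	for _, e := range allowlist {
		if !strings.Contains(r.UserAgent(), e.UASubstring) {
			continue
		}
		ok := true
		for _, h := range e.RequiredHeaders {
			if r.Header.Get(h) == "" {
				ok = false
				break
			}
		}
		if ok {
			return true
		}
	}
	return false
}

func main() {
	loadAllowlist("allowlist.json")
	go func() {
		for range time.Tick(10 * time.Minute) { // reload interval is arbitrary
			loadAllowlist("allowlist.json")
		}
	}()

	backend, _ := url.Parse("http://127.0.0.1:8080") // the site being fronted
	proxy := httputil.NewSingleHostReverseProxy(backend)

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !allowed(r) {
			http.Error(w, "Forbidden", http.StatusForbidden)
			return
		}
		proxy.ServeHTTP(w, r)
	})
	// Terminate TLS here so a fuller version could also see the client's
	// cipher choices. Cert paths are placeholders.
	log.Fatal(http.ListenAndServeTLS(":443", "cert.pem", "key.pem", handler))
}
```

The nice part is that the allowlist stays a dumb data file anyone can submit a PR against; the proxy just keeps reloading it.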
I'd rather just say "I trust real browsers" and dump the rest.
Also, I noticed a far simpler block: just reject almost every request whose UA claims to be "compatible".
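That one fits in a few lines of middleware. To be clear, this is a blunt instrument, not a drop-in rule: "compatible" shows up in most crawler UAs, but also in some older or embedded browsers.

```go
package main

import (
	"log"
	"net/http"
	"strings"
)

// blockCompatible rejects requests whose User-Agent contains "compatible",
// the token most crawler UAs carry (e.g. "Mozilla/5.0 (compatible;
// Googlebot/2.1; ...)"). Modern Chrome/Firefox/Safari UAs don't include it,
// but some older or embedded browsers do, so this is deliberately blunt.
func blockCompatible(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if strings.Contains(strings.ToLower(r.UserAgent()), "compatible") {
			http.Error(w, "Forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	site := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello\n"))
	})
	log.Fatal(http.ListenAndServe(":8080", blockCompatible(site)))
}
```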
Everything in this can be programmatically simulated by a bot with bad intentions. It will be a cat-and-mouse game of finding behaviors that differentiate between bot and not and patching them.
To truly say “I trust real browsers” requires a signal of user and browser integrity, such as cryptographic device attestation of the browser... which has to be centrally verified. Which is also not great.
> Everything in this can be programmatically simulated by a bot with bad intentions. It will be a cat-and-mouse game of finding behaviors that differentiate between bot and not and patching them.
Forcing Facebook & Co to play the adversary role still seems like an improvement over the current situation. They're clearly operating illegitimately if they start spoofing real user agents to get around bot blocking capabilities.
This is Cloudflare with extra steps
If you mean user-agent-wise, I think real users vary too much to do that.
That could also be a user login, maybe, with per-user rate limits. I expect that bot runners could find a way to break that, but at least it's extra engineering effort on their part, and they may not bother until enough sites force the issue.
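A sketch of that second half (per-user rate limits keyed on a session), using golang.org/x/time/rate. The cookie name, the limits, and the in-memory map are all assumptions; a real setup would key on the authenticated user and evict idle limiters.

```go
// Each logged-in session gets its own token bucket; unauthenticated
// requests are bounced to login, over-quota ones get a 429.
package main

import (
	"log"
	"net/http"
	"sync"

	"golang.org/x/time/rate"
)

var (
	mu       sync.Mutex
	limiters = map[string]*rate.Limiter{}
)

// limiterFor returns (creating if needed) the token bucket for one user key.
func limiterFor(key string) *rate.Limiter {
	mu.Lock()
	defer mu.Unlock()
	l, ok := limiters[key]
	if !ok {
		l = rate.NewLimiter(1, 20) // ~1 req/s with a burst of 20 (made-up numbers)
		limiters[key] = l
	}
	return l
}

func perUserLimit(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		c, err := r.Cookie("session") // assumes a login flow set this cookie
		if err != nil {
			http.Error(w, "login required", http.StatusUnauthorized)
			return
		}
		if !limiterFor(c.Value).Allow() {
			http.Error(w, "slow down", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	content := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("the actual site\n"))
	})
	log.Fatal(http.ListenAndServe(":8080", perUserLimit(content)))
}
```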