Comment by crgwbr
11 days ago
All this is going to do is drive AI companies to mask their user agent to appear as a standard browser, resulting in a worse end state than we’re in now. It’s an exercise in futility.
The blog post covers this. The announcement also drops relying on spoofable user agents for crawler identification and requires crawlers to voluntarily identify themselves via RFC 9421 cryptographic message signatures to get access: https://blog.cloudflare.com/introducing-pay-per-crawl/#payme...
There are likely incentives for AI companies to try to simulate human users as much as possible, but the value proposition here is that CF is so good at identifying and stopping those that signing a request becomes the path of least resistance.
Disclosure: I am on the team that wrote the RFC 9421 message signature implementation at Cloudflare and its use in the pay per crawl project. A separate blog post went out here: https://blog.cloudflare.com/verified-bots-with-cryptography/
Potentially for smaller players, but I'd guess that the larger players (OpenAI, Anthropic, etc.) won't go down that route, as it'd be pretty easy to spot at the volume they're crawling and a bad look for them when they inevitably get discovered.
Also, Cloudflare is in a position to see a lot of traffic, which makes it easier for them to spot that kind of masking activity.
Weren’t they already doing that for years (plus using residential proxies)?
If it's cheaper than the proxies, they might switch!
This could get AI scrapers hit with a DMCA circumvention lawsuit, which is $2,500 per scrape plus both sides' attorney fees if they lose.
In theory, this presents a form of competition that should drive the tolls down to an equilibrium level. Though, theory doesn’t always play out perfectly in practice.
I agree, it's only the big tech companies who do this AI crawling, and they will always have money for it. This paywall won't stop them.
Yes, but I do feel this makes "theft" arguments stronger if they're deliberately evading the paywall, if you decided to be litigious about it.
This isn't the kind of problem that really ought to be solved through courts. It's obvious to anyone that this is a new kind of problem that no author of the current jurisprudence envisioned. We need new legislation to stop this kind of abuse of the commons.
I strongly agree with you, but I have no confidence in my country's current elected representatives to ever do anything good, so our hands are tied until we vote them out.
Yes. It's always weird to me that people expect laws written over centuries, using precedents from even more centuries, to be able to cover scenarios their authors couldn't have possibly imagined.
Civil law countries seem better at keeping their laws up to date with new threats whereas a few common law ones (most notably the US) really insist on digging through what an 18th century slave owner would have thought about e.g. AI.