Comment by jf93ap29sh
17 hours ago
If not built-in, you can probably put it together through Cloudflare itself.
When a request hits the protected path: if it's detected as a bot, do a hard HTTP redirect to the path set in the monetization gateway; if it's a human, allow it through without redirecting.
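A rough sketch of that rule, written as a plain Python function rather than actual Cloudflare configuration. The path names, the 302 status, and the `is_bot` flag are all assumptions for illustration; `is_bot` stands in for whatever bot verdict the platform gives you.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values for illustration; neither path comes from a real product.
PROTECTED_PREFIX = "/articles/"
GATEWAY_PATH = "/pay"


@dataclass(frozen=True)
class Decision:
    status: int               # HTTP status to return
    location: Optional[str]   # redirect target, or None to pass through


def route(path: str, is_bot: bool) -> Decision:
    """Decide what to do with a request.

    `is_bot` is an assumed input: how the verdict is produced
    is outside the scope of this sketch.
    """
    if path.startswith(PROTECTED_PREFIX) and is_bot:
        # Hard redirect: bots on the protected path go to the gateway.
        return Decision(status=302, location=GATEWAY_PATH)
    # Humans, and any request outside the protected path, pass through.
    return Decision(status=200, location=None)
```

The whole thing hinges on how trustworthy `is_bot` is; the routing itself is the easy part.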
Is there actually a reliable way to differentiate human from bot?
As I understand it, as the models driving agent behavior in headless browsers get more and more sophisticated, it's getting harder to detect them reliably.
Just as LLM output without watermarking cannot be reliably classified as "not human", neural-network-driven scraping tools are getting harder to detect.
Cloudflare and DataDome position themselves as companies that can detect automated traffic using things like IP reputation, behavioral signals, and timing. But these things can be faked: IP reputation through proxy networks, and human behavior signals with generative AI, the same way text can be. Web bots can use neural networks to generate trajectories and timings similar to those of humans.
If you can have an AI use a browser the same way a human can, how can you distinguish the two?
There are reliable ways of differentiating human from cheap, bulk scraping bots.
But if the bot is advanced or expensive enough, it gets a lot harder. This product's market is in offering a paid way to access content, as opposed to having to spin up bots that run JS, come from real IP addresses, etc., all of which are more expensive.
Agreed. To me this feels like the perfect solution for websites and AI crawlers. Instead of crawlers paying proxy services and CAPTCHA solvers, they can pay the website itself. As a web scraper, I'd happily pay the website provider if it meant easy access to the content. Heck, as a human, I'd pay to avoid the dumb CAPTCHAs.