Comment by bilekas

16 hours ago

Am I understanding this correctly, in that you can basically automate monetizing your web/API content for everyone or just for agents? Because I would be very much in support of charging agents per request, but I would still want to offer humans a free experience.

Depends on the website though. I want LLMs to scrape my B2B website, because then it's shown to the user and they will likely use my product afterwards.

I’m a PM on the team that is building this. We want to offer a range of options, from charging everyone to charging unverified bots to simply charging users who exceed rate limits. We don’t want to add a dependency on a particular detection mechanism, but we do want to offer a variety of choices depending on how people want to filter.
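To make the three options above concrete, here is a minimal sketch of how such a charging policy could be modeled. This is not the product's actual configuration; `ChargePolicy`, `Visitor`, `should_charge`, and the default rate limit are all hypothetical names and values, and the bot/verification flags are assumed to come from whatever detection mechanism the site owner chooses.

```python
from dataclasses import dataclass
from enum import Enum


class ChargePolicy(Enum):
    """Hypothetical policy modes mirroring the options described above."""
    EVERYONE = "everyone"
    UNVERIFIED_BOTS = "unverified_bots"
    OVER_RATE_LIMIT = "over_rate_limit"


@dataclass
class Visitor:
    is_bot: bool               # verdict from a detector of the owner's choice (assumed)
    is_verified: bool          # whether the bot has proven its identity (assumed)
    requests_this_window: int  # count kept by some rate limiter (assumed)


def should_charge(policy: ChargePolicy, visitor: Visitor, rate_limit: int = 100) -> bool:
    """Return True if this visitor should be asked to pay under the given policy."""
    if policy is ChargePolicy.EVERYONE:
        return True
    if policy is ChargePolicy.UNVERIFIED_BOTS:
        return visitor.is_bot and not visitor.is_verified
    if policy is ChargePolicy.OVER_RATE_LIMIT:
        return visitor.requests_this_window > rate_limit
    raise ValueError(f"unknown policy: {policy}")
```

Keeping the detector's verdict as a plain input field is one way to avoid tying the policy to any particular detection mechanism.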

Feel free to email me at (my username)@(my company) with feature requests or feedback!

Their example of an /api/premium is quite nice! You could, for example, keep existing pages free but provide specific output content for LLMs!

So if the cost of the monetized API < the cost of configuring a scraper for your website, OR the features provided by the premium API > the data you get by scraping, then some people/businesses will likely pay.
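The rule of thumb above can be written down as a tiny predicate. This is only a sketch: the function and parameter names are made up for illustration, and "value" is assumed to be something the buyer can estimate in the same units as cost.

```python
def likely_to_pay(
    api_cost: float,
    scraper_cost: float,
    api_value: float,
    scraped_value: float,
) -> bool:
    """Return True if paying for the API beats scraping on cost OR on value.

    All four inputs are the buyer's own estimates (hypothetical units,
    e.g. dollars per month).
    """
    cheaper = api_cost < scraper_cost      # paying is cheaper than maintaining a scraper
    better = api_value > scraped_value     # the API offers more than scraping yields
    return cheaper or better
```

Note that because the two conditions are joined with OR, a buyer might pay even when the API costs more than scraping, as long as it delivers more.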

If not built-in, you can probably put it together through Cloudflare itself.

If a request goes to the protected path and is detected as a bot: hard HTTP redirect to the path set in the monetization gateway. If it is detected as human: allow it and don't redirect.
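As a rough sketch of that routing rule, independent of any real Cloudflare API: the `PROTECTED_PREFIX` and `GATEWAY_PATH` values and the `route` function are hypothetical, the 302 status is just one choice of "hard" redirect, and `is_bot` is assumed to be supplied by whatever bot detection is in place.

```python
from typing import Optional, Tuple

PROTECTED_PREFIX = "/articles/"  # hypothetical protected path
GATEWAY_PATH = "/api/premium"    # hypothetical path configured in the monetization gateway


def route(path: str, is_bot: bool) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_location) for an incoming request.

    Bots hitting the protected path are redirected to the paid gateway;
    humans, and any request outside the protected path, pass through.
    """
    if path.startswith(PROTECTED_PREFIX) and is_bot:
        return 302, GATEWAY_PATH
    return 200, None
```

For example, `route("/articles/pricing", is_bot=True)` returns `(302, "/api/premium")`, while the same path with `is_bot=False` returns `(200, None)`. The whole scheme is only as good as the `is_bot` signal.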

  • Is there actually a reliable way to differentiate human from bot?

    • As I understand it, as the models driving agent behavior in headless browsers get more and more sophisticated, it's getting harder to reliably tell the two apart.

      In the same way that LLM output without watermarking cannot be reliably classified as "not human", neural-network-driven scraping tools are getting harder to detect.

      Cloudflare and DataDome position themselves as companies that can detect automated traffic using things like IP reputation, behavioral signals, timing... But these things can be faked: IP reputation through proxy networks, and human behavior signals can be imitated with generative AI the same way text can be. Web bots can use neural networks to generate trajectories and timings similar to those of humans.

      If you can have an AI use a browser the same way a human can, how can you distinguish the two?

    • There are reliable ways of differentiating human from cheap, bulk scraping bots.

      But if the bot is advanced / expensive enough, it gets a lot harder. Where this product's market sits is in offering a paid way to access content, as opposed to having to spin up bots that run JS from real IP addresses, etc., all of which is more expensive.

Unless you have people's biometric data, you won't be able to separate agents from people. Except by payment.

  • Agents will be able to pay orders of magnitude more than humans, since they can just cache the documents at OpenAI or Anthropic, then use them over and over.

    • But then the cost to access an HTML page will also have to be thousands of dollars, since it can only be sold five times (once to Anthropic, once to OpenAI, once to Google, once to Meta and once to Apple).