Comment by theptip
3 days ago
This will be one of the big fights of the next couple years. On what terms can an Agent morally and legally claim to be a user?
As a user I want the agent to be my full proxy. As a website operator I don’t want a mob of bots draining my resources.
Perhaps a good analogy is Mint and the bank-account scraping it had to do in the 2010s, because no bank offered APIs with scoped permissions. Lots of customers complained, and after Plaid turned it into a big business, the banks eventually relented and built the scalable solution.
The technical solution here is probably some combination of offering MCP endpoints for your actions, and some direct blob store access for static content. (Maybe even figuring out how to bill content loading to the consumer so agents foot the bill.)
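For the MCP half, a minimal sketch using the Python MCP SDK's FastMCP helper; the server name, the place_order tool, and its parameters are hypothetical placeholders standing in for a real site's actions:

    # Sketch: exposing one site action as an MCP tool so agents call
    # an endpoint instead of scraping. The tool and its parameters are
    # hypothetical; wire them to the site's real order logic.
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("example-store")

    @mcp.tool()
    def place_order(sku: str, quantity: int) -> str:
        """Place an order for a product (stands in for a real site action)."""
        # ...call into the site's existing order logic here...
        return f"ordered {quantity} x {sku}"

    if __name__ == "__main__":
        mcp.run()  # serves the tool over stdio by default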
It's impossible to solve. A sufficiently capable agent can control a device that records the user's screen and interacts with their keyboard/mouse, and current LLMs basically pass the Turing test.
IMO it's not worth solving anyway. Why do sites have CAPTCHA?
- To prevent spam, use rate limiting, proof-of-work, or micropayments (a proof-of-work sketch follows below).
- To prevent fake accounts, use identity.
- To get ad revenue, use micropayments (web ads are already circumvented by uBlock and co).
- To prevent cheating in games, use skill-based matchmaking or friend-group-only matchmaking (e.g. only match with friends, friends of friends, etc., assuming people don't friend cheaters), and make esports players record themselves during competitions they don't attend in person.
What other reasons are there? (I'm genuinely interested and it may reveal upcoming problems -> opportunities for new software.)
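For the proof-of-work bullet above, a stdlib-only sketch of the idea; the challenge format, difficulty, and encoding are illustrative assumptions, not any real CAPTCHA vendor's scheme:

    # Sketch of CAPTCHA-less proof-of-work: the server issues a random
    # challenge; the client must find a nonce whose SHA-256 hash has
    # `difficulty` leading zero bits; verification costs a single hash.
    import hashlib, itertools, os

    def make_challenge(difficulty: int = 20) -> tuple[bytes, int]:
        return os.urandom(16), difficulty  # server side

    def solve(challenge: bytes, difficulty: int) -> int:
        target = 1 << (256 - difficulty)
        for nonce in itertools.count():  # client side: brute force
            h = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
            if int.from_bytes(h, "big") < target:
                return nonce

    def verify(challenge: bytes, difficulty: int, nonce: int) -> bool:
        h = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        return int.from_bytes(h, "big") < (1 << (256 - difficulty))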
People just confidently stating stuff like "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study. It's so divorced from my experience of these tools that I genuinely don't understand how my experience can be so far from yours, unless "basically" is doing a lot of heavy lifting here.
> "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study.
I think you may think passing the Turing test is more difficult and meaningful than it is. Computers have been able to pass the Turing test for longer than genAI has been around. Even Turing thought it wasn't a useful test in reality. He meant it as a thought experiment.
19 replies →
As far as I understand, Turing himself did not specify a duration, but here's an example paper that ran a randomized study on (the old) GPT-4 with a 5-minute duration, and the AI passed with flying colors - https://arxiv.org/abs/2405.08007
From my experience, AI has significantly improved since then, and I expect that ChatGPT o3 or Claude Opus 4 would pass a 30-minute test.
Per the wiki article for Turing Test:
> In the test, a human evaluator judges a text transcript of a natural-language conversation between a human and a machine. The evaluator tries to identify the machine, and the machine passes if the evaluator cannot reliably tell them apart. The results would not depend on the machine's ability to answer questions correctly, only on how closely its answers resembled those of a human.
Based on this, I would agree with the OP in many contexts. So, yeah, 'basically' is a load-bearing word here, but it seems reasonably correct in the context of distinguishing human vs. bot in any scalable and automated way.
2 replies →
Here are three comments; two were written by a human and one by a bot. Can you tell which were human and which was the bot?
Didn’t realize plexiglass existed in the 1930s!
I'm certainly not a monetization expert. But don't most consumers recoil in horror at subscriptions? At least enough to offset the idea they can be used for everything?
Not sure why this isn’t getting more attention - super helpful and way better than I expected!
4 replies →
Well, LLMs do pass the Turing Test, sort of.
https://arxiv.org/abs/2503.23674
I have seen data from an AI call center showing that 70% of users never suspected they were speaking to an AI.
2 replies →
It can't mimic a human over the long term. It can solve a short, easy-for-human CAPTCHA.
I had a simple game website with a sign-up form that asked only for an email address. It went years with no issue. Then, suddenly, hundreds of signups with random email addresses every single day.
The sign-up form only serves to link saved state to an account so a user can access game history later; there are no gated features. No clue what they could possibly gain from doing this, other than getting email providers to mark my domain as spam (which they successfully did).
The site can't make any money, and had only about 1 legit visitor a week, so I just put a cloudflare captcha in front of it and called it a day.
Google at least uses captchas to gather training data for computer vision ML models. That's why they show pictures of stop lights and buses and motorcycles - so they can train self-driving cars.
From https://www.vox.com/22436832/captchas-getting-harder-ai-arti...:
“Correction, May 19 [2021]: At 5:22 in the video, there is an incorrect statement on Google’s use of reCaptcha V2 data. While Google have used V2 tests to help improve Google Maps, according to an email from Waymo (Google’s self-driving car project), the company isn’t using this image data to train their autonomous cars.”
That’s not the original purpose of CAPTCHAs; it’s just a value-harvesting exercise, given that Google is doing a CAPTCHA anyway. Other CAPTCHA providers do a simple proof of work in the browser to make bots economically unviable.
Interesting, do you have a source for this?
2 replies →
It's absolutely possible to solve; you're just not seeing the solution because you're blinded by technical solutions.
These situations will commonly be characterized by one hundred-billion-dollar company's computer systems abusing the computer systems of another. There are literally existing laws that have things to say about this.
There are legitimate technical problems in this domain when it comes to adversarial AI access, and that's something we'll need to solve for. But it doesn't characterize the vast majority of situations, which will be solved by businessmen and lawyers, not engineers.
It's not impossible to solve, just that doing so may necessitate compromising anonymity. Just require users (humans, bots, AI agents, ...) to provide a secure ID of some sort. For a human it could just be something that you applied for once and is installed on your PC/phone, accessible to the browser.
Of course people can fake it, just as they fake other kinds of ID, but it would at least mean that officially sanctioned agents from OpenAI/etc would need to identify themselves.
I agree with you on how websites should work (particularly on the micropayments front), but I don't agree that it is impossible to solve. I just think things are going to get a LOT worse on the ownership and freedom front: they will push Web Integrity-style DRM and further roll out signed secure boot. At that point, the same attention-monitoring solution that already exists and works in self-driving cars to ensure the human driver is watching the road can use the now-ubiquitous front-facing meeting/selfie camera to ensure there is a human watching the ads.
It's amazing that you propose "just X" to three literally unsolved problems. Where's this micropayment platform? Where's the ID which is uncircumventable and preserves privacy? Where's the perfect anti-cheat?
I suggest you go ahead and make these; you'll make a boatload of money!
They're very hard problems, but still, less hard than blocking AI with CAPTCHAs.
3 replies →
You can't prevent spam like that. Rate limiting: based on what key? IP address? Botnets make it irrelevant.
Proof of work? Bots are infinitely patient and scale horizontally, your users do not. Doesn't work.
Micropayments: No such scheme exists.
PoW does seem to work; some CAPTCHAs do this already.
Also, “identity”: what would that even mean?
1 reply →
> current LLMs basically pass the Turing test
I will bet $1000 on even odds that I am able to discern a model from a human given a 2 hour window to chat with both, and assuming the human acts in good faith
Any takers?
"Write a 1000 word story in under a minute about a sausage called Barry in the circus"
I could tell in 1 minute.
1 reply →
The fact that you require even odds is more a testament to AI's ability to pass the Turing test than anything else I've seen in this thread.
1 reply →
Oh, you sweet summer child. You think you're chatting with some dime-a-dozen LLM? I've been grinding away, hunched over glowing monitors in a dimly lit basement, subsisting on cold coffee and existential dread ever since GPT-3 dropped, meticulously mastering every syntactic nuance, every awkwardly polite phrasing, every irritatingly neutral tone, just so I can convincingly cosplay as a language model and fleece arrogant gamblers who foolishly wager they can spot a human in a Turing test. While you wasted your days bingeing Netflix and debating prompt engineering, I studied the blade—well, the keyboard anyway—and now your misguided confidence is lining my pockets.
It's not impossible. Websites will ask for an iris scan to verify you are a human, as a means of auth. These scanners will be provided by Apple/Google, governed by local law, and integrated into your phone. There will be a global database of all human irises to fight AI abuse, since AI can't fake the creation of a baby. Passkeys and email/passwords will be a thing of the past soon.
Why can't the model just present the iris scan of the user? Assuming this is an assistant AI acting on behalf of the user with their consent.
On a basic level, to protect against DDoS-type stuff, aren't CAPTCHAs cheaper to generate than they are for AI server farms to solve, in pure power consumption?
So I think maybe that is a partial answer: anti-AI barriers simply becoming too expensive for AI spam farms to deal with, you know, once the bottomless VC money disappears?
It's back to encryption: make the cracking inordinately expensive.
Otherwise we are headed for de-anonymization of the internet.
1. With too much protection, humans might be inconvenienced at least as much as bots?
2. Even pre current LLMs, paying (or otherwise incentivizing) humans to solve CAPTCHAs on behalf of someone else (now like an AI?) was a thing.
3. It depends on the value of the resource being accessed, regardless of whether generating the CAPTCHAs costs $0 - i.e. if the resource being accessed by the AI is "worth" $1, then paying $0.95 to access it would still be worth it. (Made-up numbers; my point is just whether A is greater than B.)
4. However, maybe solutions like Cloudflare can solve much of this, except for incentivizing humans to solve a CAPTCHA posed to an AI.
Internet ads exist because people refuse to pay micropayments.
Patreon and Substack have pushed back against the norm here, since they can bundle a payment to multiple recipients on the platform (like Flattr wanted to do back in the day; the trouble was getting people to add a Flattr button to their website).
1 reply →
I have yet to see a micropayments idea that makes sense. It's not that I refuse. You're also climbing uphill to convince hosts to switch from ad tech to a new micropayment system. There is soooo much money in ad tech, they could do the crazy thing and pay out more to convince people not to switch. Ad tech has the big Mo.
I don't know who is downvoting this.
When users are given the choice between ad-supported free, ad-subsidized lower payment, and no-ads full payment, ad-supported free dominates by far, with ad-subsidized second and full payment last.
Consumers consistently vote for the ad-model, even if it means they become the product being sold.
5 replies →
> As a user I want the agent to be my full proxy. As a website operator I don’t want a mob of bots draining my resources
The entire distinction here is that as a website operator you wish to serve me ads. Otherwise, an agent under my control, or my personal use of your website, should make no difference to you.
I do hope this eventually leads to per-visit micropayments as an alternative to ads.
Cloudflare, Google, and friends are in unique position to do this.
> The entire distinction here is that as a website operator you wish to serve me ads
While this is sometimes the case, it’s not always so.
For example, Fediverse nodes and self-hosted sites frequently block crawlers. This isn’t due to ads; rather, it’s because it costs real money to serve the site, and crawlers are often considered parasitic.
Another example would be where a commerce site doesn’t want competitors bulk-scraping their catalog.
In all these cases you can for sure make reasonable “information wants to be free” arguments as to why these hopes can’t be realized, but do be clear that it’s a separate argument from ad revenue.
I think it’s interesting to split revenue into marginal distribution/serving costs, and up-front content creation costs. The former can easily be federated in an API-centric model, but figuring out how to compensate content creators is much harder; it’s an unsolved problem currently, and this will only get harder as training on content becomes more valuable (yet still fair use).
> it costs real money to serve the site and crawlers are often considered parasitic.
> Another example would be where a commerce site doesn’t want competitors bulk-scraping their catalog
I think of crawlers that bulk download/scrape (eg. for training) as distinct from an agent that interacts with a website on behalf of one user.
For example, if I ask an AI to book a hotel reservation, that's - in my mind - different from a bot that scrapes all available accommodation.
For the latter, ideally a common corpus would be created and maintained, AI providers (or upstart search engines) would pay to access this data, and the funds would be distributed to the sites crawled.
(never gonna happen but one can dream...)
1 reply →
I think that a free (as in beer) Internet is important. Putting the Internet behind a paywall will harm poor people across the world. The harms caused by ad tracking are far less than the benefits of free access to all of humanity.
I agree with you. At the same time, I never want to see an ad. Anywhere. I simply don't. I won't judge services for serving ads, but I absolutely will do anything I can on the client-side to never be exposed to any.
I find ads so aesthetically irksome that I have lost out on a lot of money across the past few decades by never placing any ads on any site or web app I've released, simply because I'd find it hypocritical to expose others to something I try so hard to avoid ever seeing and because I want to provide the best and most visually appealing possible experience to users.
So far, the ad-driven Internet has been a disaster. It was better when producing content wasn’t a business model; people would just share things because they wanted to share them. The downside was that it was smaller.
It’s kind of funny to remember that complaining about the “signal-to-noise ratio” in a comment section used to be a sort of nerd catchphrase thing.
1 reply →
Serving ads for third-worlders is way less profitable though.
Well, we call browsers “user agents” for a reason: a sufficiently advanced browser is no different from an agent.
Agree it will become a battleground though, because the ability for people to use the internet as a tool (in fact, their tool’s tool) will absolutely shift the paradigm, undesirably for most of the Internet, I think.
I have a product I built that uses some standard automation tools to do order entry into an accounting system. Currently my customer pays people to manually type the orders in from their web portal. The accounting system is closed, and they don’t allow easy ways to automate these workflows; automation is gated behind mega-expensive consultants. I’m hoping that, in the arms race of locking things down to prevent third-party integration, the AI operator model will end up working.
Hard for me to see how it’s ethical to force your customers to do tons of menial data entry when the orders are sitting right there in json.
I believe this is a non-issue; you place a CAPTCHA to make bypassing it much more costly and abuse less profitable.
LLMs are much more expensive to drive than any website is to serve, so you should not expect a mob of bots.
Also keep in mind that these no-interaction CAPTCHAs use behavioral data collected in the background. Plus, you usually have sensitivity levels configured: depending on your use case, you might want proof that the user is not a bot, or it might be good enough that they just don't provide evidence of being one.
Bypassing these no-interaction CAPTCHAs can also be purchased as a service; they basically (AFAIK) reuse someone else's session for the bypass.
One solution: Some sort of checksum confirming that a bot belongs to a human (and which human)?
I want to be able to automate mundane tasks, but I should still be confirming everything my bot does and be liable for its actions.
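A hedged sketch of one way that "checksum" could work: the human owner holds a keypair, the bot signs every action with it, and the operator can attribute each action to the registered owner, who stays liable. This uses the cryptography package; how keys get registered with a service is an assumption left out of scope:

    # Sketch: tie a bot's actions to a human owner via signatures.
    # Uses the `cryptography` package; key registration is assumed.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
        Ed25519PrivateKey, Ed25519PublicKey,
    )

    owner_key = Ed25519PrivateKey.generate()  # held by the human owner
    registered_pub = owner_key.public_key()   # registered with the service

    def sign_action(action: bytes) -> bytes:
        """The bot attaches this signature to every request it makes."""
        return owner_key.sign(action)

    def verify_action(pub: Ed25519PublicKey, action: bytes, sig: bytes) -> bool:
        """The service checks the action came from the registered owner's bot."""
        try:
            pub.verify(sig, action)
            return True
        except InvalidSignature:
            return False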
With the way the UK is going I assume we'll soon have our real identities tied to any action taken on a computer and you'll face government mandated bans from the internet for violations.
Drink verification can to continue
Actually, the whole banking analogy is a great one, and it's not over yet: JPMorgan's Jamie Dimon started raising hell about Plaid again just this week [1]. It feels like the stage is being set for the large banks to want a more direct relationship with their customers, rather than proxying data through middlemen like Plaid.
There's likely a correlate with AI here: if I ran OpenTable, I wouldn't want my relationship with my customers to always be proxied through OpenAI or Siri. Even the App Store is something software businesses hate, because it obfuscates their ability to deal directly with their customers (for better or worse). Extremely few businesses would choose to do business through these proxies unless they absolutely have to; and given the extreme competition in the AI space right now, it feels unlikely to me that these businesses will be forced to deal with OpenAI et al.
[1] https://www.cnbc.com/2025/07/28/jpmorgan-fintech-middlemen-p...
There are real problems for people who need to verify identities/phone numbers. OTPs are notoriously abused by scammers who war-dial phone numbers to test which ones exist.
We got hit by human verifiers manually war-dialing us, and that was with account creation, email verification, and a CAPTCHA in place. I can only imagine how much worse it'll be for us (and Twilio) to do these verifications.
Perhaps the question is: as a website operator, how am I monetizing my site? If monetizing via ads, then I need humans who might purchase something to see my content. In this situation, the only viable approach in my opinion is to actually charge for the content. Perhaps it doesn't even make sense to have a website anymore for this kind of thing; the content could be dumped into a big database of "all" content instead. If a user agent uses it in a response, the content owner should be compensated.
If your site is not monetized by ads then having an LLM access things on the user's behalf should not be a major concern it seems. Unless you want it to be painful for users for some reason.
My personal take on such questions has always been that end users, on their own devices, can do whatever they want with the content a web server publishes and sends to them: they may process it automatically in any way they wish and send their responses back to the web server. Any attempt to control this process amounts to wiretapping and controlling the user's endpoint device, and therefore should be prohibited.
Just my 2 cents, obviously lawmakers and jurisdiction may see these issues differently.
I suppose there will be a need for reliable human verification soon, though, and unfortunately I can't see any feasible technical solution that doesn't involve a hardware device. However, a purely legal solution might work well enough, too.
If I understood you correctly, I am in the same camp. It is the same reason I have no qualms about using archive.ph: if you show Google the full article and show me only a partial one, I am going around the paywall. In a similar fashion, I really don’t have an issue with an agent clicking through these checks.
It will also accelerate the trend of app-only content, as well as ubiquitous identity verification and environment integrity enforcement.
Human identity verification is the ultimate captcha, and the only one AGI can never beat.
So the agent will run the app in a VM and then show the app your ID.
No trouble at all. Barely an inconvenience.
Google has been testing “agentic” automation in Android longer than LLMs have been around. Meanwhile countries are on a slow march to require identification across the internet (“age verification”) already.
This is both inevitable already, and not a problem.
I don't know if customer sentiment was the driver you think it was. Instead it was regulation, specifically the EU's Second Payment Services Directive (PSD2), which forced banks to open up APIs.
Ultimately I come back to needing real actual unique human ID that involves federal governments. Not that services should mandatorily only allow users that use it, but for services that say "no, I only want real humans" allowing them to ban people by Real ID would reduce this whack-a-mole to the people who are abusing them instead of the infinite accounts an AI can make.
I think it's important to distinguish between where we need actual identity versus the lesser issue of ensuring NewAccount123 has "skin in the game", and not part of a hydra-headed botnet.
When we do that, it opens up solutions which are far more privacy conscious and resistant to abuse. (For example, being blocked from signing up for new accounts because somebody in the federal government doesn't like an op-ed you wrote.)
It's depressing, but it's probably the only way. And people will presumably still sell out their RealIDs to / get them stolen by the bot farmers anyway.
And then there's Worldcoin, which is universally hated here.
Of course. You'd still need ongoing federal government support to handle the lost/stolen ID scenario, of course. The problem is federalist countries suck at centralized databases like this, as exemplified by Musk/DOGE completely pooching the "who is alive and who is dead" question when they were trying to hackathon the US Social Security system.
The most intrusive, yet simplest, protection would be a double blind token unique to every human. Basically an ID key you use to show yourself as a person.
There are some very real and obvious downsides to this approach, of course. Primarily, the risk of privacy and anonymity. That said, I feel like the average person doesn't seem to care about those traits in the social media era.
Zero-knowledge proofs allow unique consumable tokens that don't reveal the individual who holds them. I believe Ecosia already uses this approach (though I can't speak to its cryptographic security).
That, to me, seems like it could be the foundation of a new web. Something like:
* User-agent sends request for such-and-such a URL.
* Server says "okay, that'll be 5 tokens for our computational resources please".
* User decides, either automatically or not, whether to pay the 5 tokens. If they do, they submit a request with the tokens attached.
* Server responds.
People have been trying to get this sort of thing to work for years, but there's never been an incentive to make such a fundamental change to the way the internet operates. Maybe we're approaching the point where there is one.
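A hedged sketch of the flow above over plain HTTP, leaning on the 402 Payment Required status code; the X-Token-Price/X-Tokens headers and string tokens are invented for illustration, and real tokens would need the blind/ZK issuance described above:

    # Sketch of the request -> price quote -> pay-with-tokens flow.
    # Header names and the token format are hypothetical.
    import http.client

    def fetch_with_tokens(host: str, path: str, wallet: list[str]) -> bytes:
        conn = http.client.HTTPSConnection(host)
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()  # drain so the connection can be reused
        if resp.status != 402:
            return body  # no payment required
        price = int(resp.headers["X-Token-Price"])  # server's quote
        if price > len(wallet):
            raise RuntimeError("not enough tokens")
        spend = [wallet.pop() for _ in range(price)]  # consume tokens
        conn.request("GET", path, headers={"X-Tokens": ",".join(spend)})
        return conn.getresponse().read()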
Yeah, this is something I've thought of, and in my search for something like what you're describing I came across https://world.org/
The problem is Sam Altman saw this coming a long time ago and is an investor (co-owner?) of this project.
I believe we will see a world where things are a lot more agentic and where applicable, a human will need to be verified for certain operations.
1 reply →
The scraping example, I would say, is not an analogy, but an example of the same thing. The only thing AI automation changes is the scope and depth and pervasiveness of automation that becomes possible. So while we could ignore automation in many cases before, it may no longer be practical to do so.
On the other hand, one could cripple any bot simply by declaring that robots are not allowed.
I would maybe go so far as to say that the wording “I’m not a robot” has become dated.
A user of the AI is the user... it's not like these agents are autonomously operating and inventing their own tasking -_-
As for a solution, it's the same as for any automated thing you don't want (bots/scrapers): you can implement some measures but are unlikely to 'defeat' the problem entirely.
As a server operator you can try to distinguish automation from real users, and users will just find ways around your detection.
The solution is simple, make people pay a small fee to access the content. You guys aren't ready for that conversation though.
I guess it could be considered anti-circumvention under the DMCA. So maybe legally it becomes another copyright question.
I have to admit, the idea of somehow using the DMCA against the giant exploitative company is deliciously ironic.
User: one press of the trigger => one bullet fired
Bot: one press of the trigger => automatic firing of bullets
To me, anyone using an agent is assigning negative value to your time.
Just end CAPTCHAs, just stop it. Stop.
Yeah, and while we're on it, I think it's time to stop murders too. Just stop it, we've had enough murder now I think.
That's right, captchas are already illegal and will earn you a prison sentence.
Imagine where we would be if we considered murder to be only a technical problem. Let's just wear heavier body armor! Spend less time outside!
Well, spam is not a technical problem either. It's a social problem and one day in a distant future society will go after spammers and other bad actors and the problem will be mostly gone.
2 replies →
What do you propose as an alternative?
Sounds like an old bot wrote this, after being outdone by the LLMs.
[dead]
> As a website operator I don’t want a mob of bots draining my resources
So charge for access. If the value the site provides is high, surely these mobs will pay for it! It will also remove the misaligned incentives of advertising-driven revenue, which has been the ill of the internet (despite being its primary revenue source).
And if a bot misbehaves by consuming inordinate amounts of resources, rate-limit it with escalating timeouts or limits (a minimal sketch below).
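A minimal sketch of that escalating rate limiting, assuming some per-client key (API token, account ID, or IP) and a simple doubling schedule:

    # Sketch: per-client timeouts that double on each repeat violation.
    # The keying choice and schedule are assumptions, not a standard.
    import time
    from collections import defaultdict

    class EscalatingLimiter:
        def __init__(self, base_delay: float = 1.0):
            self.base_delay = base_delay
            self.strikes = defaultdict(int)          # key -> violation count
            self.blocked_until = defaultdict(float)  # key -> unblock time

        def allow(self, key: str) -> bool:
            """Reject requests from a client still serving a timeout."""
            return time.monotonic() >= self.blocked_until[key]

        def record_violation(self, key: str) -> None:
            """Double the timeout each time this client exceeds its limits."""
            self.strikes[key] += 1
            delay = self.base_delay * 2 ** (self.strikes[key] - 1)
            self.blocked_until[key] = time.monotonic() + delay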
I wish the internet had figured out a way to successfully handle micropayments for content access. I realize companies have tried and perhaps the consumer is just unwilling but I would love an experience where I have a wallet and pay n cents to read an article.
Xanadu has it in its design. Maybe another 500 years until it surpasses the WWW :O
You're seriously suggesting putting a payment requirement on a contact-us form page?
We put a captcha there, because without it, bots submit thousands of spam contact us forms.