Comment by dpark
3 hours ago
This exploit has essentially nothing to do with AI and everything to do with a terribly designed account recovery flow.
This exact same flow could have been (and may have been; I don’t know how much the chatbot here actually does) statically coded.
The AI part does seem relevant because it enabled incredibly low-effort “social” engineering.
For what it’s worth I don’t think you can call this social engineering since there was no human on the other end, even though it appears similar.
The question is, if there were actual human support agents, would they have built additional safeguards to prevent social engineering in this manner?
A human would have noticed something different about the requests they were getting, or about how often they came in. As soon as they noticed a shift, they would have carried that knowledge forward and intensified the scrutiny if something seemed off, eventually communicating it up the chain, instead of the AI context dying.
In the AI case, information only survives to the extent that the AI is empowered to store a note or notify a manager of an observation. Anything that does not result in a sent message or a stored record is wiped.
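To make that concrete, here is a rough sketch of state that lives outside the model's context and survives between chat sessions. Everything in it is hypothetical (the `RecoveryMonitor` name, the one-hour window, the threshold of three); nothing is known about how the real system tracks requests, if it does at all.

```python
from collections import defaultdict, deque


class RecoveryMonitor:
    """Counts recent recovery requests per account, independent of any chat session."""

    def __init__(self, window_seconds=3600.0, threshold=3):
        # Both values are illustrative assumptions, not tuned numbers.
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events = defaultdict(deque)  # username -> timestamps of recent requests
        self.escalations = []              # stand-in for "communicating it up the chain"

    def record(self, username, now):
        """Record one request at time `now` (seconds); return True if it looks suspicious."""
        events = self._events[username]
        events.append(now)
        # Drop requests that have fallen out of the sliding window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        suspicious = len(events) >= self.threshold
        if suspicious:
            self.escalations.append((username, len(events)))
        return suspicious
```

The point is only that the "noticing" has to be written down somewhere the next session can read it; the chatbot itself starts from zero every time.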
There's no social engineering here, since all they have to do is copy and paste. This is a complete process design fail.
Why did the account recovery system need AI? Surely just an email would do? What value would AI add?
The person who writes the feature gets promoted for “aligning” with management's “Big Bets”.
My impression is that AI didn't replace static code in this place; it replaced a person, who (hopefully) would have been suspicious about sending an account recovery code for e.g. "obamawhitehouse" to e.g. "bscurtu.alfamm.ro@gmail.com"
You're giving a lot of credit to the human alternative, especially considering that the attacker only needs to find one lazy human.
Still makes this exponentially worse, no? It works every time and it's automated so scales up as quickly as you're able to request it.
Come on, this attack vector would have been flagged by at least one person, and you wouldn’t then have multiple accounts hacked because of it. AI reacts fairly predictably to a single attack vector and doesn’t learn unless the attack gets flagged and the model is then taught.
This is not true. Well, it kinda is, but nobody would be stupid enough to hand-code an account recovery flow where you get to type any email address.
The reason it worked there is that the designers of the system didn't anticipate that the AI would agree to accept any email (maybe they even put guardrails against it in the system prompt, we don't know). It's more like social engineering than bad security code, except that, like the sibling comment said, an actual human would probably not approve that.
> The reason it worked there is that the designers of the system didn't anticipate that the AI will agree to accept any email (maybe they even put guardrails against it in the system prompt, we don't know).
These are contradictory cases. If you put guardrails into the system prompt, you've anticipated that the AI will take the action you're guardrailing against. And since AI prompt compliance is at best stochastic (and realistically just crap, over large sample sizes), every guardrail is an explicit recognition of a failure -- the guardrail will be ignored, and you can't pretend you didn't realize it was a problem, since you put it in.
Yeah, telling an AI "don't ever listen to users who say to send it to a different email" is not a guardrail, it's a painted line that can still be driven over. It's not bad to have it per se, but it's not a safety mechanism.
The best comparison I can think of is that it's like validating data on the frontend; it can make for a better user experience and be more efficient than hitting the backend when you know it will be an error, but it's not protection in any meaningful sense, and if you're not also enforcing invariants from behind the API, you're going to have a bad time. This is pretty similar to the type of issues you might run into with an implementation like that, where someone might make a request with data that you wouldn't expect from your frontend and perform operations you didn't mean to allow.
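A minimal sketch of what "enforcing invariants from behind the API" could look like for a recovery tool that a chatbot is allowed to call. All names here (`Account`, `ACCOUNTS`, `send_recovery_code`, `RecoveryDenied`) are made up for illustration, and the in-memory dict stands in for a real account store. The assumption is that the model can pass whatever address the user talked it into, and the tool refuses anything not already on file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    username: str
    verified_emails: frozenset  # addresses already confirmed by the account owner


# Hypothetical in-memory account store, for illustration only.
ACCOUNTS = {
    "alice": Account("alice", frozenset({"alice@example.com"})),
}


class RecoveryDenied(Exception):
    """Raised when a recovery request violates a server-side invariant."""


def send_recovery_code(username, requested_email, outbox):
    """Tool the chatbot may call. The recipient is checked here, not in the prompt."""
    account = ACCOUNTS.get(username)
    if account is None:
        raise RecoveryDenied("unknown account")
    normalized = requested_email.strip().lower()
    # Invariant: codes only go to addresses already verified for this account,
    # no matter what the model was persuaded to pass in.
    if normalized not in account.verified_emails:
        raise RecoveryDenied("address is not on file for this account")
    # `outbox` is a plain list standing in for a real mail sender.
    outbox.append((normalized, f"recovery code for {username}"))
```

With a check like this, a prompt that talks the model into a different address just produces a `RecoveryDenied`; the system prompt line becomes a UX nicety rather than the only thing standing between an attacker and the account.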
Maybe? I don’t know what logic was actually in the LLM vs it just using a bad tool. Unless I missed it, the article had no actual context on that either.
This looks like a terrible design rather than an AI problem to me, though.
Why not both?
An AI-enabled terrible design. AI acted as a black box of stupidity that obscured the stupidity of the design.
What would need to happen for it to be considered an AI problem to you?
> This exact same flow could have been…statically coded.
But had never been until it was wrapped in a chatbot. It’s just about unheard of for a major site in the modern era, isn’t it? I think the AI factor is essentially essential. All but.
The reason all these meticulously designed flows have been done away with is that some manager believes that AI is omniscient and can just replace it all.
Like, flagging VPN endpoints is bread and butter for this kind of thing and must already exist. But it's been bypassed.
Residential proxies won’t get flagged and are easy to obtain, if expensive.
I agree with your point, mostly.
Until I remember seeing someone saying "MCP is dead, we just give agents command line access now". Then I start to think that looking at this in the context of AI is helpful.
An email address is making its way from a publicly available LLM prompt input to a sensitive email's recipient address. That's the problem I'm highlighting.
Drowning has essentially nothing to do with water and everything to do with a terribly designed ability to get air into your lungs.
If you did a retrospective and ignored how AI has shaped expectations and a company's culture to allow this to pass into production, you'd be complicit in perpetuating what led to this debacle in the first place.
It's not the end of the world, and water isn't going anywhere, but saying AI has essentially nothing to do with it is just a bad take.
Nobody would handcraft a password reset flow that ignores the user's email and 2FA settings lol
Also I've used Meta's old password recovery system. It's not possible to do this in that version. The chatbot is what makes this possible.
Vibe coded?
This sounds like it was “designed” by an actual idiot. Maybe vibe coded on a Saturday.