Comment by codingdave
2 days ago
Yes, and it has been said since day one of LLMs that all we need to do is keep things that way: no action without human intervention. Just like it was said that you should never grant AI direct access to change your production systems. But the stories of people who have done exactly that and had their systems damaged and deleted show that people aren't even trying to keep such basic safety nets in place.
AI is getting strong enough that if people give it some general direction as well as access to production systems of any kind, things can go badly. It is not true that all implementations of agentic AI require human intervention for every action.
My cynical rule of thumb: By default we should imagine LLMs like JavaScript logic offloaded into a stranger's web browser.
The risks are similar: no prompts or data that go in can reliably be kept secret; a sufficiently motivated stranger can have it send back completely arbitrary results; and some of those results may trigger very bad things depending on how you use, or even just display, them on your own end.
P.S. This conceptual shortcut doesn't quite capture the dangers of poison data, which could sabotage all instances even when they happen to be hosted by honorable strangers.
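To make the browser analogy concrete, here is a minimal sketch of treating a model reply the way you would treat a request from a stranger's browser: parse it strictly, allow only known actions, and escape anything you display. The JSON reply shape, the `ALLOWED_ACTIONS` set, and the `parse_untrusted_reply` name are all assumptions made up for illustration, not any particular vendor's API.

```python
import html
import json

# Hypothetical allowlist: the only actions this application will ever act on.
ALLOWED_ACTIONS = {"summarize", "search"}


def parse_untrusted_reply(raw: str) -> dict:
    """Validate a model reply as if it came from a stranger's browser."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("reply is not valid JSON") from exc

    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    action = data.get("action")
    # Reject anything that is not a known action name, whatever the model claims.
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"action {action!r} is not allowed")

    text = data.get("text", "")
    if not isinstance(text, str):
        raise ValueError("text must be a string")

    # Escape before display so the reply cannot inject markup on our end.
    return {"action": action, "text": html.escape(text)}
```

As the P.S. says, this only covers the "arbitrary results" risk; it does nothing about secrecy of what goes in, or about poisoned training data.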
If you had made a tool that gave GPT-3 the ability to run arbitrary commands on your production systems, you could have seen things go badly.
Good news! Today's SOTA models can also make things go badly.
Yep. I don’t see how that metric indicates how… strong(?) a language model is.
Eh, these same people will attach OpenClaw to production systems soon and destroy their own companies.
The problem is, out of ten companies who take this approach, nine will indeed destroy themselves and one will end up with a trillion-dollar market cap. It will outcompete hundreds of companies who stuck with more conservative approaches. Everybody will want to emulate company #10, because "it obviously works."
I don't see any stabilizing influences on the horizon, given how much cash is sloshing around in the economy looking for a place to land. Things are going to get weird, stupid, and chaotic, not necessarily in that order.
One does not even need OpenClaw to achieve this outcome: https://x.com/lifeof_jer/status/2048103471019434248
Yeeeehaaaaa, the vibes shall never end!
On a more serious note, they were mostly f*cked by their PaaS provider, IMO. Claude will always do dumb shit, especially if you tell it not to do something... By doing so you generally increase the likelihood of it doing it.
It's even obvious why if you think about it: the pattern of "you had one job, but you failed" or "only this can't happen, and it happened!", in all its other forms, is all over literature, online content, etc.
But their PaaS provider not scoping permissions properly is the root cause, all things considered. While Claude did cause this issue there, something else would've happened eventually otherwise.
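A rough sketch of what "scoping permissions properly" plus a human gate could look like, assuming a hypothetical token model: the agent's token lists the actions it may perform at all, and anything destructive additionally needs a human to say yes. `AgentToken`, `DESTRUCTIVE`, `authorize`, and the action names are invented for illustration and do not describe any real PaaS API.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

# Hypothetical set of actions that must never run without a human sign-off.
DESTRUCTIVE = frozenset({"delete_volume", "drop_database"})


@dataclass(frozen=True)
class AgentToken:
    """Credentials handed to the agent, limited to an explicit set of actions."""
    scopes: FrozenSet[str]


def authorize(token: AgentToken, action: str, approve: Callable[[str], bool]) -> bool:
    """Return True only if the action is in scope and, when destructive, approved."""
    if action not in token.scopes:
        # Out of scope: denied no matter what the model was or was not told.
        return False
    if action in DESTRUCTIVE:
        # In scope but destructive: defer to a human decision.
        return approve(action)
    return True
```

The point of the design is that the denial lives in the permission check, not in the prompt, so it holds even when the model ignores an instruction.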
Sounds like a pretty efficient self correcting mechanism
I’m not sure what the problem is there
The problem is that destruction isn't contained to the company. If an AI agent exposes all company data and that includes PII or health information, that could have an impact on a large number of people.
Normalisation of deviance is the problem: https://en.wikipedia.org/wiki/Normalization_of_deviance
Remember that these models are getting better; this means they get trusted with increasingly important things by the time an error explodes in someone's face.
It would be very bad if the thing which explodes is something you value which was handed off to an AI by someone who incorrectly thought it safe.
AI companies which don't openly report that their AI can make mistakes are being dishonest, and that dishonesty would make this normalization of deviance even more prevalent than it already is.