Comment by kccqzy
3 days ago
As much as I hate to say it, the fact that these attacks are “known issues” seems well understood in the industry among people who care about security and LLMs. Even as an occasional reader of your blog (thank you for maintaining such an informative blog!), I have known about the lethal trifecta and the exfiltration risks since the early days of ChatGPT and Bard.
I have previously expressed my views on HN about removing one of the three legs of the lethal trifecta; it didn’t go anywhere. It just seems that at this stage, people are so excited about the new capabilities LLMs can unlock that they don’t care about security.
I have a different perspective. The Trifecta is a bad model because it makes people think this is just another cybersecurity challenge, solvable with careful engineering. But it's not.
It cannot be solved this way because it's a people problem: LLMs are like people, not like classical programs, and that's fundamental. That's what they're made to be, and that's why they're useful. The problems we're discussing are variations of the principal-agent problem, with the LLM being a savant but extremely naive agent. There is no provable, verifiable solution here, any more than there is for human employees, contractors, or friends.
You're not explaining why removing a leg of the trifecta doesn't solve the problem. What attack vector remains?
None, but your product becomes about as useful and functional as a rock.
> There is no provable, verifiable solution here, any more than there is for human employees, contractors, or friends.
Well, when talking about employees etc., one model for protecting against malicious employees is to require approval from a second person for every sensitive action (code check-in, log access, prod modification). The same model can be used for agents. However, an agent, being naive by design, might not be a good approver. So having a human approve everything the agent does could be a good solution.
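To make that concrete, here is a minimal sketch of such an approval gate in plain Python, tied to no particular agent framework; the action kinds, the SENSITIVE set, and the execute_with_approval helper are just illustrative names, not anyone's real API:

    # Sketch: the agent proposes actions; anything tagged as sensitive is
    # executed only after an explicit human yes/no at the console.
    from dataclasses import dataclass
    from typing import Callable

    SENSITIVE = {"code_checkin", "log_access", "prod_modification"}

    @dataclass
    class Action:
        kind: str                # e.g. "code_checkin"
        description: str         # human-readable summary shown to the approver
        run: Callable[[], None]  # the effect, invoked only once approved

    def execute_with_approval(action: Action) -> bool:
        """Run non-sensitive actions directly; gate sensitive ones on a human."""
        if action.kind not in SENSITIVE:
            action.run()
            return True
        answer = input(f"Agent wants to: {action.description}. Approve? [y/N] ")
        if answer.strip().lower() == "y":
            action.run()
            return True
        print("Denied; action not executed.")
        return False

The point is simply that the approval step sits outside the agent: the model can propose whatever it likes, but the sensitive effects only happen after a human signs off.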
Then the goal must be to guide users to run Antigravity in a sandbox, with access only to the data it actually needs.