Comment by TeMPOraL
3 days ago
I have a different perspective. The Trifecta is a bad model because it makes people think this is just another cybersecurity challenge, solvable with careful engineering. But it's not.
It cannot be solved this way because it's a people problem - LLMs are like people, not like classical programs, and that's fundamental. That's what they're made to be, and that's why they're useful. The problems we're discussing are variations of the principal/agent problem, with the LLM being a savant but extremely naive agent. There is no provable, verifiable solution here, any more than there is when talking about human employees, contractors, or friends.
You're not explaining why the trifecta doesn't solve the problem. What attack vector remains?
None, but your product becomes about as useful and functional as a rock.
This is what reasonable people disagree on. My employer provides several AI coding tools, none of which can communicate with the external internet. It completely removes the exfiltration risk. And people find these tools very useful.
>There is no provable, verifiable solution here, not any more than when talking about human employees, contractors, friends.
Well, when talking about employees, one model for protecting against malicious insiders is to require approval from a second person for every sensitive action (code check-in, log access, prod modification). The same model can be used for agents. However, agents, known to be naive, might not make good approvers, so having a human approve everything the agent does could be a good solution.
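The two-person-rule idea above can be sketched as an approval gate: sensitive actions only execute after a sign-off callback returns true. A minimal sketch, where the `approve` callback stands in for a human reviewer and the action names are illustrative, not from any real system:

```python
# Hypothetical sketch of a second-party approval gate for agent actions,
# mirroring two-person review for human employees.
from typing import Callable

# Assumed set of actions that require human sign-off (illustrative).
SENSITIVE = {"code_checkin", "log_access", "prod_modification"}

def execute(action: str, approve: Callable[[str], bool]) -> str:
    """Run an action, demanding approval for anything sensitive.

    `approve` stands in for a human reviewer; in a real deployment it
    would block on a UI prompt or review queue rather than a callback.
    """
    if action in SENSITIVE and not approve(action):
        return f"DENIED: {action}"
    return f"DONE: {action}"
```

Note the asymmetry the comment points out: the approver must be the human, because a naive agent on either side of the gate defeats the purpose.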