Comment by ArcHound
5 days ago
I'm sorry, but the rule of two is just not enough, not even as a rule of thumb.
We know how to work with security risks; the issue is that they depend on both the business and the technicalities.
This can actually do a lot of harm as security now needs to dispel this "great approach" to ignoring security that is supported by a "research paper they read".
Please don't try to reinvent the wheel and if you do, please learn about the current state (Chesterton's fence and all that).
Can you explain what you mean? How is Chesterton's fence applied to AI security helpful here? Are you just talking about not removing the "Non-AI" security architecture of the software itself? I think no one ever proposed that?
Right, what got me going is the reduction of plenty of cyber security concepts into a simple "safe" label in the diagram.
So what I meant is that before you discard all of the current security practices, it's better to learn about the current approach.
From another angle, maybe the diagram could be fixed by changing "safe" to "danger" and "danger" to "OMG stop". But that also discards the business perspective and the nature of the protected asset.
I am also happy to see the edit in the article, props to the author for that!
And to address the last question: yes, no one is proposing that right now. But I have been in plenty of discussions about security approaches. And let me tell you, sometimes it only takes one sentence that the leadership likes to hear to derail the whole approach (especially if it results in cost savings). So I might be extra sensitive to such ideas and I try to uproot them before they bloom fully.
Hmm, what do you mean by current approach? This is new territory and agent safety is an unsolved problem; there is no current approach, unless you mean not building agent systems and using humans instead. The trifecta is just a tool, on the level of physics saying "ignore friction". Most of the time we also assume the model itself is trustworthy and not poisoned, but of course when designing a real-world system you need to factor that in too.