Comment by krisman
2 months ago
I'm curious why most comments ignored the fact that Claude autonomously ignored its guardrails and issued a DELETE. This WILL happen across all transformer-based LLMs. We aren't waiting for sh*t to happen: we have human-in-the-loop (HITL) confirmation for such actions, backed by client-side, hardware-attested auth. No static policy would have caught this, so we built dynamic decision-making to trigger the gating. Read Google Research's paper "AI Agent Traps" to get an idea of the scope of the problems.
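To make the gating idea concrete, here is a rough Python sketch of what a dynamic gate in front of an agent's actions could look like. It is not the system described above. The scoring rules, the `threshold` value, and the names `risk_score`, `gate`, and `confirm` are all hypothetical, and the `confirm` callback only stands in for a real hardware-attested approval step.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical pattern for destructive statements (illustration only).
DESTRUCTIVE = re.compile(r"^\s*(DELETE|DROP|TRUNCATE)\b", re.IGNORECASE)


@dataclass
class Decision:
    allowed: bool
    reason: str


def risk_score(action: str, context: dict) -> int:
    """Toy dynamic score: combines the action with its runtime context."""
    score = 0
    if DESTRUCTIVE.match(action):
        score += 6  # destructive verb
    if context.get("environment") == "production":
        score += 3  # riskier target environment
    if "where" not in action.lower():
        score += 1  # unscoped statement
    return score


def gate(
    action: str,
    context: dict,
    confirm: Callable[[str], bool],
    threshold: int = 5,  # assumed cutoff, not a recommended value
) -> Decision:
    """Auto-approve low-risk actions; ask a human for everything else."""
    score = risk_score(action, context)
    if score < threshold:
        return Decision(True, f"auto-approved (score={score})")
    # `confirm` is a placeholder for an out-of-band, attested human approval.
    if confirm(action):
        return Decision(True, f"approved by human (score={score})")
    return Decision(False, f"denied by human (score={score})")


if __name__ == "__main__":
    deny_all = lambda action: False  # simulate a human rejecting the request
    print(gate("DELETE FROM users", {"environment": "production"}, deny_all))
```

The point of the sketch is that the decision depends on runtime context (environment, statement scope) rather than on a fixed allow/deny list, and that the agent never gets to approve its own high-risk action.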