Comment by hamburglar

20 hours ago

If the “clawness” means you only use the llm to control itself, then yes, that’s impossible. But you can easily shim such a process so that the interfaces it uses to “claw out” to the real world are shims that have safeties such as human control. Openclaw does not do this, and is thus a scary shit show, but you can play with it in isolation safely, and I think a standard pattern for good control will emerge.

4 comments

hamburglar

yencabulator 19 hours ago

> easily

Yeah that's an active research topic for teams of PhDs, including some of Google's brightest. And the current approach even with added barriers may just be fundamentally untrustable. Read the links from my earlier comment for background.

hamburglar 11 hours ago
If the shim doesn’t use an LLM to make its decisions this is not a problem.
If the shim does use an LLM but no uncontrolled data is allowed in, this is not a problem.
- yencabulator 11 hours ago
  
  I think you're misunderstanding the severity of the lethal trifecta. Just because you put access controls around the LLM doesn't mean all that much if the access controls allow anything in & out. There is no way to write a shim that blocks "everything naughty", while remaining useful.
  You literally have to fully prevent all outside input, or you have to prevent all exfiltration routes including web page reading (even the choice of links to follow is an exfiltration mechanism). At that point, what's left? What do you think will be on your allowlist?
  I seriously doubt the early adopters of these software bundles use their assistants like with such restraint (https://xcancel.com/summeryue0/status/2025774069124399363), and that idealized image of these access control shims is not realistic.
  
  1 reply →