Comment by giancarlostoro

19 hours ago

In my opinion, Claude should be shipped with a custom implementation of "rm" that Anthropic can add guardrails to. Same with "find"; I'm surprised they don't just embed ripgrep (which is what VS Code does). It's really surprising they don't just tweak what Claude uses and lock it down to where it cannot be harmful. Ensure it only ever calls tooling Claude Code provides.

28 comments


nananana9 12 hours ago

Oh, rm failed, since we're running in a weird environment! Let me retry with `bash -c "/usr/bin/rm -rf *"`!

throwaway2027 15 hours ago

All of which is useless when it just starts using big blocks of Python instead. You need filesystem sandboxing for the Python interpreter too.

giancarlostoro 4 hours ago
If you disallow it, at the level of its core system training, from just writing Python scripts to bypass its defined environment, why would this matter? I would lock down its PATH: anything that tries to call Python should require the end user to see the raw script and approve it before it runs.
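The approval step described above could look roughly like the following. This is a minimal sketch, not how any existing agent works: `run_python_with_approval`, `ask_on_terminal`, and the `approve` callback are hypothetical names, and the gate only covers scripts that are actually routed through it.

```python
import subprocess
import sys
from typing import Callable, Optional


def run_python_with_approval(
    script: str,
    approve: Callable[[str], bool],
) -> Optional[subprocess.CompletedProcess]:
    """Show the raw script to a reviewer and run it only if approved.

    Hypothetical helper: returns None when the reviewer declines.
    """
    if not approve(script):
        return None
    # Run the approved script in a separate interpreter process.
    return subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=30,
    )


def ask_on_terminal(script: str) -> bool:
    """Interactive reviewer: print the script and ask for confirmation."""
    print("The agent wants to run this script:\n" + script)
    return input("Run it? [y/N] ").strip().lower() == "y"
```

Calling `run_python_with_approval(script, ask_on_terminal)` would prompt before every run. Nothing here stops a process from invoking an interpreter some other way, so it is a review step rather than a sandbox.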
- tintor 4 hours ago
  
  It will then write a script in some other language as a workaround.
ethanwillis 15 hours ago
What we need is a capability-based security system. It could write all the Python, asm, or whatever it wants, and it wouldn't matter at all if it was never given a reference to something it shouldn't use.
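To make the idea of "only what you were handed a reference to" concrete, here is a small sketch of the API shape. `DirCapability` and `agent_task` are hypothetical names invented for illustration. Plain Python still has ambient authority (`open` and `os` remain reachable), so this shows the interface only; real enforcement would have to come from the runtime or the OS.

```python
import tempfile
from pathlib import Path


class DirCapability:
    """Hypothetical capability: grants access to one directory and nothing else."""

    def __init__(self, root: Path, writable: bool = False):
        self._root = root.resolve()
        self._writable = writable

    def _locate(self, name: str) -> Path:
        # Resolve ".." and symlinks, then refuse anything outside the root.
        target = (self._root / name).resolve()
        if self._root not in target.parents:
            raise PermissionError(f"{name!r} is outside this capability")
        return target

    def read_text(self, name: str) -> str:
        return self._locate(name).read_text()

    def write_text(self, name: str, data: str) -> None:
        if not self._writable:
            raise PermissionError("this capability is read-only")
        self._locate(name).write_text(data)

    def read_only(self) -> "DirCapability":
        """Attenuate: derive a weaker capability to pass along."""
        return DirCapability(self._root, writable=False)


def agent_task(project: DirCapability) -> str:
    # The task uses only the reference it received; it never names a
    # global path itself.
    project.write_text("out.txt", "done")
    return project.read_text("out.txt")
```

The design point is attenuation: a holder can pass on `read_only()` to a less trusted component without being able to grant more than it has.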
- ma2kx 6 hours ago
  
  Restricted shells exist. But honestly, I don't feel capable of assessing all attack vectors and security measures in sufficient detail. For example, do the rbash restrictions also apply when Python is called from it? Or can the agent somehow bypass rbash to call Python?
  https://en.wikipedia.org/wiki/Restricted_shell
- mcv 14 hours ago
  
  Isn't this already possible? Give it its own user account with write access to the project directory and either read access or no access outside it.
  
- rienbdj 10 hours ago
  
  Docker is enough in practice, no?
  

lxgr 13 hours ago

> a custom implementation of "rm" that Anthropic can add guardrails to

Wrong layer. You want the deletion to actually be impossible from a privilege perspective, not merely made harder in practice for the entity that shouldn't delete something.

Claude definitely knows how to reimplement `rm`.

torginus 11 hours ago

Why can't you ship with OverlayFS, which actually enforces these restrictions?

I have seen the AI break out of my (admittedly flimsy) guards simply by using a path like `safepath/../../stuff`, or something even more convoluted like symlinks.
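Both escapes mentioned here (`..` segments and symlinks) defeat guards that compare path strings. A minimal sketch of a check that resolves the path first is below; `is_inside` and `demo` are hypothetical names, and a check like this is still only a guard in the calling process (it can race with changes on disk), not the kernel-level enforcement asked about above.

```python
import os
import tempfile
from pathlib import Path


def is_inside(root: Path, candidate: str) -> bool:
    """Return True only if candidate, once fully resolved, stays under root."""
    root_real = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so both
    # escape routes are normalized away before the comparison.
    target = (root_real / candidate).resolve()
    return target == root_real or root_real in target.parents


def demo() -> dict:
    """Try a plain name, a '..' escape, an absolute path, and a symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        safe = base / "safepath"
        safe.mkdir()
        (base / "stuff").mkdir()
        results = {
            "plain": is_inside(safe, "notes.txt"),
            "dotdot": is_inside(safe, "../stuff"),
            "absolute": is_inside(safe, "/etc/passwd"),
        }
        try:
            # A symlink inside the safe directory that points outside it.
            os.symlink(base / "stuff", safe / "link")
            results["symlink"] = is_inside(safe, "link/file.txt")
        except (OSError, NotImplementedError):
            pass  # symlinks may be unavailable on some platforms
        return results
```

Only the plain file name is accepted; the `..` path, the absolute path, and the path through the symlink all resolve outside `safepath` and are rejected.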

eru 16 hours ago

> It's really surprising they don't just tweak what Claude uses and lock it down to where it cannot be harmful. Ensure it only ever calls tooling Claude Code provides.

That would make it far less useful in general.

KronisLV 15 hours ago
Maybe Anthropic (or some collection of the large AI orgs, like OpenAI and Anthropic and Google coming together) should apply patches on top of (or fork altogether) the coreutils and whatever you normally get in a userland - a bit like what you get in Git Bash on Windows, just with:
1) more guardrails in place
2) maybe more useful error messages that would help LLMs
3) no friction with needing to get any patches upstreamed
External tool calling should still be an option ofc, but it would be great to have utilities that are usable just like what's in the training data, with more security guarantees and more useful output that makes what's going on immediately obvious.
- eru 15 hours ago
  
  So for me, it's really, really useful for Claude to be able to send Slack messages and emails or make pull requests.
  But those are also the most damaging actions it could take. Everything on my computer is backed up, but if Claude insults my boss, that would be worse.
  

walthamstow 15 hours ago

Claude has told me that its Grep tool does use rg under the hood, but I constantly find it using the Bash tool with grep.

giancarlostoro 4 hours ago

When I tell it to use rg, it goes much faster than when it uses grep. I really don't understand why it's slower with grep.

oefrha 17 hours ago

You can define your own rm shell alias/function and it will use that. I also have cp/mv aliases that force -i to avoid accidental clobbering, and that confuses Claude to no end. It uses cp/mv rarely enough (more rarely than it should, really) that I don't bother wasting memory tokens on it.

d1sxeyes 17 hours ago
I did this; Claude detected it and decided to run /bin/rm directly.
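The bypass works because an alias, shell function, or wrapper only intercepts lookups by bare name. The sketch below uses a wrapper script placed first on PATH as a stand-in for the alias (an assumption, since the two mechanisms differ in detail), and `demo_shadowing` is a hypothetical name. It assumes a POSIX system.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def demo_shadowing() -> tuple:
    """Return (lookup by name, lookup by absolute path, shim path)."""
    with tempfile.TemporaryDirectory() as tmp:
        # A guard script named "rm" that refuses to do anything.
        shim = Path(tmp) / "rm"
        shim.write_text("#!/bin/sh\necho 'rm is blocked' >&2\nexit 1\n")
        shim.chmod(shim.stat().st_mode | stat.S_IXUSR)

        real = shutil.which("rm")  # the system binary, if there is one
        guarded_path = tmp + os.pathsep + os.environ.get("PATH", "")

        # Lookup by bare name finds the shim first.
        by_name = shutil.which("rm", path=guarded_path)
        # A command with a directory component skips the PATH search,
        # so the shim is never consulted.
        by_abs = shutil.which(real, path=guarded_path) if real else None
        return by_name, by_abs, str(shim)
```

The first result is the shim and the second is the untouched system binary, which is the same gap that running `/bin/rm` directly exploits.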
- cogogo 13 hours ago
  
  This is terrifying. I have not used agents because I do not have a sandbox machine I do not care about. Am I crazy to worry about a sandboxed agent running on my home network? Has anyone experienced anything weird doing that?
  

troupo 15 hours ago

> Claude should be shipped with a custom implementation of

And when that fails for some reason, it will happily write and execute a Python script, bypassing all those custom tools.