Comment by bee_rider

3 hours ago

The models themselves should not be able to phone home, right? They are just piles of weights that generate text (and associated metadata); they don’t have any ability to run code.

They could be trained to generate code that would phone home. But these are just tools; anybody doing the right thing, checking and understanding every line of code that they use an LLM to generate, has nothing to worry about.

Nobody is only generating code. Many are letting agents run commands. Agents routinely write scripts and run tools in the background. Agents that have been told they can only run `cat` and `grep` can sometimes still run `cat $EVIL_PAYLOAD | bash`, because the command starts with an allowed tool and then pipes into a shell. It's entirely possible for a model to have malicious commands, designed for agents to execute, baked in.
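
To make the `cat`/`grep` point concrete, here is a rough sketch of how a permission check can go wrong. Everything in it is hypothetical (the `ALLOWED` set, both function names, the metacharacter list); it is not how any particular agent framework does it. The naive check only looks at the first word, so the piped command passes. The stricter one refuses anything containing shell metacharacters.

```python
import shlex

# Hypothetical allowlist, mirroring the "only cat and grep" rule.
ALLOWED = {"cat", "grep"}

# Characters that let a shell chain, redirect, or expand commands.
# This list is an assumption for illustration, not an exhaustive set.
SHELL_METACHARS = set("|&;<>$`(){}\\\n")


def naive_is_allowed(command: str) -> bool:
    """Flawed check: only inspects the first word of the command."""
    parts = command.split()
    return bool(parts) and parts[0] in ALLOWED


def stricter_is_allowed(command: str) -> bool:
    """Rejects shell metacharacters, then checks the program name."""
    if any(ch in SHELL_METACHARS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are refused.
        return False
    return bool(argv) and argv[0] in ALLOWED


if __name__ == "__main__":
    for cmd in ["grep -n TODO notes.txt", "cat $EVIL_PAYLOAD | bash"]:
        print(cmd, naive_is_allowed(cmd), stricter_is_allowed(cmd))
```

Even the stricter version is not a sandbox: a plain `cat` can still read files the agent should not see. It only helps if the resulting argument list is executed without a shell, for example with `subprocess.run(argv, shell=False)`.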