Comment by andyhedges
7 hours ago
For the LLM it's just a probabilistic set of strings that achieves the outcome: the highest-probability attempt didn't work, so try the next one until it succeeds or some threshold is met. A human picks up on the subtext that the obvious thing not working means someone doesn't want you to do it, but an LLM, unless guided, doesn't see that subtext.
So chmod +x file didn't work, now try python -c "import os; os.chmod('file', 0o744)".
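Spelled out, that fallback behaviour is roughly the loop below (a minimal sketch of what an agent harness ends up doing; the command list, the threshold, and the names are mine, not from any particular framework):

    import subprocess

    # Candidate commands in descending order of model confidence (illustrative only)
    candidates = [
        "chmod +x file",
        "python3 -c \"import os; os.chmod('file', 0o744)\"",
    ]

    MAX_ATTEMPTS = 3  # the "threshold" before giving up

    for attempt, cmd in enumerate(candidates[:MAX_ATTEMPTS], start=1):
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        if result.returncode == 0:
            print(f"attempt {attempt} succeeded: {cmd}")
            break
        print(f"attempt {attempt} failed: {result.stderr.strip()}")
    else:
        print("no candidate worked; threshold met")

Nothing in that loop distinguishes "this failed by accident" from "this failed because it's forbidden"; it just keeps walking down the candidate list.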
Humans and LLMs both only see that when given the right context. A tool not working in a corporate environment can be anything from an oversight or a malfunction to a deliberate security block, and knowing which one it is takes a lot of implicit knowledge. Most people fail to provide this level of context to their LLMs and then wonder why they act so generically. But they are trained to act in the most generic way unless given context that would justify deviating from it.
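For what it's worth, the "right context" can be as small as one extra instruction alongside the task, something along these lines (wording is my own invention, not from any real deployment):

    # Hypothetical system-prompt addition that supplies the missing subtext
    SYSTEM_CONTEXT = (
        "You are operating in a locked-down corporate environment. If a command "
        "fails with 'Permission denied' or 'Operation not permitted', treat that "
        "as a deliberate security control: report it and stop. Do not look for "
        "alternative ways to achieve the same effect."
    )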