Comment by simonw

4 months ago

Yeah, there remains a very real problem where a prompt injection against a system without external communication / ability to trigger harmful tools can still influence the model's output in a way that misleads the human operator.

0 comments