Comment by cowlby
14 hours ago
Underrated comment here. https://www.anthropic.com/research/emotion-concepts-function This study convinced me to be "nice" to AI agents. At least as I understood it, there's something in the weights such that activating the "desperate" vector makes the model more likely to cheat or cut corners. So yes, I would err towards your suggested prompt over NEVER FUCKING GUESS.