Comment by bennyfreshness
13 hours ago
Wow. I'm generally in the AI maximalist camp. But adding Werewolf feels dangerous to me. Anyone who's played knows lying, deceit, and manipulation are often key to winning. Do we really want models climbing this benchmark?
Oddly, in the highlighted game I watched, the werewolf simply gives up in the last round and says, "I'm the werewolf, well done... Vote me."
Bizarre.
This is a legitimate strategy for the werewolf, no?
Good question, but who's going to stop them?
AI already has a very creative imagination for role play, so this just adds to their arsenal.
Negative benchmark, isn't it? No sane lab is going to release PR stating "our newest model is best at lying." If anything, the reverse may occur: if this catches on, they will make their model play Werewolf badly and claim "alignment improvements, our model no longer lies as much in Werewolf," while it lies more often in other domains.
Confidently and charismatically lying to clueless users has been one of the foundations of AI adoption.