Comment by zone411
7 hours ago
I built two related benchmarks this month: https://github.com/lechmazur/sycophancy and https://github.com/lechmazur/persuasion. There are large differences between LLMs. For example, good luck getting Grok to change its view, while Gemini 3.1 Pro will usually disagree with the narrator at first but then change its position very easily when pushed.
No comments yet
Contribute on Hacker News ↗