Slacker News Slacker News logo featuring a lazy sloth with a folded newspaper hat
  • top
  • new
  • show
  • ask
  • jobs
Library

Comment by zone411

7 hours ago

I built two related benchmarks this month: https://github.com/lechmazur/sycophancy and https://github.com/lechmazur/persuasion. There are large differences between LLMs. For example, good luck getting Grok to change its view, while Gemini 3.1 Pro will usually disagree with the narrator at first but then change its position very easily when pushed.

0 comments

zone411

Reply

No comments yet

Contribute on Hacker News ↗

Slacker News

Product

  • API Reference
  • Hacker News RSS
  • Source on GitHub

Community

  • Support Ukraine
  • Equal Justice Initiative
  • GiveWell Charities