Slacker News

Comment by selcuka

1 year ago

> chess is an excellent metric for testing logical thought and internal modeling

Is it, though? Apparently nobody else cared to use it to benchmark LLMs until this article.

1 comment

gs17  1 year ago

People had noticed this exact same discrepancy between 3.5-turbo-instruct and 4 a year ago: https://x.com/GrantSlatton/status/1703913578036904431
