Comment by culi
1 month ago
I posted this elsewhere but thought I'd repost here:
* https://lmarena.ai/leaderboard — crowdsourced head-to-head battles between models, ranked by Elo rating
* https://dashboard.safe.ai/ — the Center for AI Safety's (CAIS) incredible dashboard
* https://clocks.brianmoore.com/ — a visual comparison of how well models can draw a clock. A new clock is drawn every minute
* https://eqbench.com/ — emotional intelligence benchmarks for LLMs
* https://www.ocrarena.ai/battle — head-to-head OCR battles, ranked by Elo rating
* https://mafia-arena.com/ — LLMs playing the social deduction game Mafia
* https://openrouter.ai/rankings — market share based on OpenRouter usage