Comment by martinald
5 months ago
I find the webdev arena tends to match my experience with models much more closely than other benchmarks: https://web.lmarena.ai/leaderboard. Excited to see how 3.7 performs!
5 months ago
I find the webdev arena tends to match my experience with models much more closely than other benchmarks: https://web.lmarena.ai/leaderboard. Excited to see how 3.7 performs!
No comments yet
Contribute on Hacker News ↗