Comment by subroutine
12 hours ago
At 20 min per task you might as well code it yourself. Bill James needs to write a book on sabermetrics for LLM benchmarks.