Comment by freediver
5 months ago
The right eval tool depends on your evaluation task. The Kagi LLM benchmark focuses on using LLMs for information retrieval (which is what Kagi does), and that includes measuring reasoning and instruction-following capabilities.