Comment by YetAnotherNick
10 months ago
They compared with Llama 3.1 and found that to be better on average for their tasks, like European MMLU. And Llama 3.1 is the worst in the batch, with Qwen 2.5 and Gemma 3 being significantly better.