Comment by YetAnotherNick
5 days ago
They compared with Llama 3.1 and found that to be better on average for their tasks like European MMLU. And Llama 3.1 is the worst in the batch with Qwen 2.5 and Gemma 3 being significantly better.
5 days ago
They compared with Llama 3.1 and found that to be better on average for their tasks like European MMLU. And Llama 3.1 is the worst in the batch with Qwen 2.5 and Gemma 3 being significantly better.
No comments yet
Contribute on Hacker News ↗