Comment by mkl

1 year ago

That Gemini 2.5 one is impressive. I found it interesting that the blog post didn't mention Gemini 2.5 at all. Okay, it was released pretty recently, but 10 days seems like enough time to run the benchmarks, so maybe the results make Llama 4 look worse?

3 comments

jjani 1 year ago

I'm sure it does, as Gemini 2.5 Pro has been making every other model look pretty bad.

az226 1 year ago

Meta will most likely compare against it when they release the upcoming Llama 4 reasoning model.

utopcell 1 year ago

LM Arena ranks Llama 4 second, just below Gemini 2.5 Pro.