senko 3 months ago With 2T params (!!), it better outperform everything else.
amarcheschi 3 months ago Given that the comparison doesn't include o3 or Gemini Pro 2.5, I'd say it doesn't. Looking at the comparison tables available for both Llama 4 Behemoth and Gemini Pro 2.5, it seems like at least a few of the comparable items might be won by Gemini: https://blog.google/technology/google-deepmind/gemini-model-...
wmf 3 months ago We don't know how many params GPT-4, Claude, and Gemini are using so it could be in the ballpark.