Comment by bryanh

2 years ago

Page 7 of their technical report [0] has a better apples-to-apples comparison. It's odd to me that they chose to show an apples-to-oranges comparison on their landing page.

[0] https://storage.googleapis.com/deepmind-media/gemini/gemini_...

4 comments

polygamous_bat 2 years ago

I assume these landing pages are made for wall st analysts rather than people who understand LLM eval methods.

bryanh 2 years ago
True, but even some of the apples-to-apples comparison is favorable to Gemini Ultra: 90.04% CoT@32 vs. GPT-4's 87.29% CoT@32 (via API).
- dongobread 2 years ago
  
  This isn't apples to apples - they're taking the optimal prompting technique for their own model, then using that technique for both models. They should be comparing it against the optimal prompting technique for GPT-4.
rockinghigh 2 years ago

Showing dominance in AI is also targeted at their enterprise customers who spend millions on Google Cloud services.