Comment by xianshou

2 years ago

Marketing: Gemini 90.0% || GPT-4 86.4%, new SotA exceeding human performance on MMLU!

Fine print: Gemini 90.0% chain of thought @ 32-shot || GPT-4 86.4% @ 5-shot

Technical report: Gemini 83.7% @ 5-shot || GPT-4 86.4% @ 5-shot

Granted, this is now the second-best frontier model in the world - but after a company-wide reorg and six months of constant training, this is not what success for Google looks like.
