Comment by XCSme

2 months ago

Funny how they didn't include Gemini 3.0 Pro in the bar chart comparison, considering that it seems to do the best in the table view.

6 comments

jychang 2 months ago

Also, funny how they included GPT-5.0 and 5.1 but not 5.2... I'm pretty sure they ran the benchmarks for 5.0, then 5.1 came out, so they ran the benchmarks for 5.1... and then 5.2 came out and they threw their hands up in the air and said "fuck it".

rynn 2 months ago

gpt-5.2 codex isn't available in the API yet.
If you want to be picky, they could've compared it against gpt-5 pro, gpt-5.2, gpt-5.1, gpt-5.1-codex-max, or gpt-5.2 pro,
all depending on when they ran benchmarks (unless, of course, they are simply copying OAI's marketing).
At some point it's enough to give OAI a fair shot and let OAI come out with their own PR, which they doubtlessly will.

XCSme 2 months ago

I didn't even notice that; I assumed it was the latest GPT version.

amelius 2 months ago

After or before running the benchmarks?

guluarte 2 months ago

Gemini is garbage and does its own thing most of the time, ignoring the instructions.