Comment by XCSme
2 days ago
Funny how they didn't include Gemini 3.0 Pro in the bar chart comparison, considering that it seems to do the best in the table view.
Also, funny how they included GPT-5.0 and 5.1 but not 5.2... I'm pretty sure they ran the benchmarks for 5.0, then 5.1 came out, so they ran the benchmarks for 5.1... and then 5.2 came out and they threw their hands up in the air and said "fuck it".
gpt-5.2 codex isn't available in the API yet.
If you want to be picky, they could've compared it against gpt-5 pro, gpt-5.2, gpt-5.1, gpt-5.1-codex-max, or gpt-5.2 pro,
all depending on when they ran benchmarks (unless, of course, they are simply copying OAI's marketing).
At some point it's enough to give OAI a fair shot and let OAI come out with their own PR, which they doubtlessly will.
I didn't even notice that, I assumed it was the latest GPT version.
After or before running the benchmarks?
Gemini is garbage and does its own thing most of the time, ignoring the instructions.