Comment by theturtletalks

1 month ago

All of this seems like manufactured hype for Gemini. I use GPT-5.2, Opus 4.5, and Gemini 3 Flash and Pro with Droid CLI, and Gemini is consistently the worst. It gets stuck in loops, wants to wipe projects when it can’t figure out the problem, and still fails to call tools consistently (sometimes the whole thread is corrupted and you can’t rewind and use another model).

Terminal Bench supports my findings: GPT-5.2 and Opus 4.5 are consistently ahead. Only Junie CLI (JetBrains exclusive) with Gemini 3 Flash scores somewhat close to the others.

It’s also why Ampcode made Gemini the default model and then quickly backtracked when all of these issues came to light.

4 comments

AlexCoventry 1 month ago

https://paulgraham.com/submarine.html

petesergeant 1 month ago

Claude for writing the code, Codex for checking the code, Gemini for when you want to look at a pretty terminal UI.

j45 1 month ago

Gemini is pretty decent at ingesting and understanding large codebases before providing it to Claude.

alex1138 1 month ago

I'm pretty high on Claude, though I'm not an expert on coding or LLMs at all.

I'm naturally inclined to dislike Google because of what they censor, what they consider misinformation, and, I don't know, some of the projects they run (many good things, but also many dead projects and lying to people).