Comment by jtrn
9 hours ago
People's mileage may vary, but in my case, this was so bad that I actually got angry while trying to use it.
It's slow and stupid. It does not do proper research. It does not follow instructions. It randomly decides to stop being agentic and instead just dumps the code for me to paste. It has the extremely annoying habit of doing stuff without understanding what I meant, making a mess, then claiming everything is fine. The outdated training data is extremely annoying when working with Nuxt 4+. It is not creative at solving problems. It doesn't show its thinking. The undo feature doesn't give proper feedback on the diff or on whether it actually did undo anything. And I hate the personality. It HAS to be better than it comes off, because I am actually in a bad mood after having worked with it. I would rather YOLO code with Gemini 3 Flash: it's actually smarter in my assessment, I can iterate faster, and it feels like it has better common sense.
Just as an example, I found an old, terrible app I made years ago for our firm that handles room reservations. I told Codex to migrate it from Bootstrap to Flowbite UI. It took forever and made a mess: it picked version 2.7 when 4.0.1 is the latest, even though I explicitly stated it should use the absolute latest version. Then the install failed, so it fell back to an outdated CDN link.
I gave the same task to Claude Code, same prompt. It one-shotted it quickly. Then I asked it to swap out ALL the fetch logic for SPA-like functionality using the new beta 4 version of HTMX, and it one-shotted that too, in the time Codex spent just trying to read a few files in the project.
This reminds me of the feeling I had when I got the Nokia N800. It was so promising on paper, but the product was so bad and so terrible to use that I knew Nokia was done for. If that was their take on what an acceptable smartphone could be, the whole foundation was doomed. If this is OpenAI's take on what an agentic coding assistant should be, something that can run by itself and iterate until it completes its task in an intelligent and creative way, then OpenAI is doomed.
If you're using 5.2 high, with all due respect, this has to be a skill issue. If you're using 5.2 Codex high, switch to 5.2 high. gpt-5.2 is slow, yes (ok, keeping it real, it's excruciatingly slow), but it's not the moronic caricature you're making it out to be.
If you need it to be up to date with your version of a framework, ask it to use the Context7 MCP server. Expecting training data to be up to date is unreasonable for any LLM, and we now have useful solutions to the training-data problem.
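For anyone who hasn't wired that up: here's a minimal sketch of what registering Context7 in Codex's ~/.codex/config.toml can look like. I'm assuming the @upstash/context7-mcp npm package and the [mcp_servers] TOML section Codex uses for MCP servers; check the Context7 README for the current package name and flags.

    # Sketch: register the Context7 MCP server so the agent can pull
    # current framework docs instead of relying on stale training data.
    # Package name and args are assumptions from the Context7 README;
    # verify against the current docs before copying.
    [mcp_servers.context7]
    command = "npx"
    args = ["-y", "@upstash/context7-mcp"]

Then you can literally say "use context7" in the prompt whenever it needs framework docs.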
If you need it to use the latest version, don't say "latest"; give it the exact version number. Humans would interpret that word differently as well.
Claude is well known for its one-shotting skills. But that comes at the expense of strict instruction following and thinner context (it doesn't spend as much time gathering context in larger codebases).
If he was able to get Claude Code to do what he wanted in less time and with a better experience, then maybe that's not a skill he (or the rest of us) wants to develop.
Sure, that's fine. I wrote my comment for the people who don't get angry at AI agents after using them for the first time, within five hours of their release. For those who aren't interested in portending doom for OpenAI.
Some things aren't common sense yet, so I'm doing my part to make them so.
Ty for the tip on context7 mcp btw
Ok. You do you. I'll stick with the models that understand what the latest version of a framework means.
Agreed, I had the same experience. Codex feels lazy - I have to explicitly tell it to research the existing code before it stops giving hand-wavy answers. Doc lookup is particularly bad; I even gave it access to a Context7 MCP server for documentation and it barely made a difference. The personality also feels off-putting, even after tweaking the experimental flag settings to make it friendlier.
For people suggesting it's a skill issue: I've been using Claude Code for the past 6 months and I genuinely want to make Codex work - it was highly recommended by peers and friends. I've tried different model settings, explicitly instructed it to plan first and only execute after my approval, and tested it on both Python and TypeScript backend codebases. The results are consistently underwhelming compared to Claude Code.
Claude Code just works for me out of the box. My default workflow is plan mode - a few iterations to nail the approach, then Claude one-shots the implementation after I approve. I haven't been able to replicate anything close to that with Codex.
I'm not taking OpenAI's side here, but have you reviewed what Claude did?
I only use Claude through the chat UI because it's faster and gives me more control. I read most of what it produces, and the code is almost always better than what I would write, simply because lazy-ass me takes shortcuts way too often.