Comment by shepherdjerred

2 days ago

Are you having a positive experience with Codex compared to Claude Code? Codex in my brief experience was... not good w/ 5.1

Just to provide another datapoint: I tried Codex in September/October after seeing the glowing reviews here, and it was, all in all, a huge letdown.

It seems very efficient context-wise, but at the same time it makes precise context management much harder.

Opus 4.5 is a magnificent improvement over Sonnet 4.5 in CC, though.

Re TFA - I accidentally discovered the new LSP support 2 days ago on a side project in Rust, and it’s working very well.

  • I was lukewarm about Codex when I tried it 2-3 months ago, but just last week I ran it again head-to-head against Claude Code, both working from the same todo list to build a DocuSign-like web service. For the prompt I looped "Look at the todo list and implement the next set of tasks" (my actual prompt was ~3 sentences, but that was the gist; a sketch of the loop follows the list):

        - Codex required around 30 passes on that loop; Claude did it in ~5-7.
        - I thought Codex's result was "prettier", but both were functional.
        - I dug into Claude's result in more depth and had to fix ~5-10 things.
        - I didn't test Codex's output quite as deeply, but it seemed to need less fixing. Still not sure if that's just because I took a more superficial look.
        - Still a work in progress; I haven't completed a full document-signing workflow in either.
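
    Roughly, the loop was the equivalent of this minimal Python sketch - not my actual script. It assumes Claude Code's non-interactive claude -p ("print") mode, with codex exec as the Codex counterpart, and the stop condition is made up:

        import subprocess

        # Sketch of the loop, not the actual script. Assumes Claude Code's
        # non-interactive "print" mode (claude -p); swap the command for
        # ["codex", "exec", PROMPT] to run the same loop against the Codex CLI.
        PROMPT = "Look at the todo list and implement the next set of tasks."

        for i in range(30):  # Codex needed ~30 passes; Claude ~5-7
            result = subprocess.run(
                ["claude", "-p", PROMPT],
                capture_output=True,
                text=True,
            )
            print(f"--- pass {i + 1} ---")
            print(result.stdout)
            # Made-up stop condition: quit once the agent says the list is done.
            if "all tasks complete" in result.stdout.lower():
                break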

  • Similar experience and timeline with Codex, but I tried it again last week and it's gotten much better in the interim. Codex with 5.2 does a good job of catching (numerical) bugs that Opus misses. I've been comparing them and there's no clear winner: GPT 5.2 misses things Opus finds and vice versa. But claude-code is still a much better experience and keeps getting better; Codex is following, just a few months behind.

  • What are some of the use cases for Claude Code + LSP? What does LSP support let you do, or do better, that Claude Code couldn't do by itself?

  • Another anecdote/datapoint. Same experience. It seems to mask a lot of bad-model issues by not talking much and by overthinking stuff. The experience turns sour the more one works with it.

    And yes, +1 for Opus. Anthropic delivered a winner after fucking up the previous Opus 4.1 release.

  • I checked the Codex source code a few months ago, and the implementation was very basic compared to opencode.

It goes like this:

Codex is like an outsourcing company: you give it specs, it gives you results, with no communication in between. It's very good at larger analysis tasks (code coverage, health, etc.). Whatever it does, it does it sloooowwwllyyy.

Claude is like a pair programmer, you can follow what it's doing, interrupt and redirect it if it starts going off track. It's very much geared towards "get it done" rather than maximum code quality.

I’m basically only using the Codex CLI now. I switched around the GPT-5 timeframe because it was reliably solving some gnarly OpenTelemetry problems that Claude Code kept getting stuck on.

They feel like different coworker archetypes. Codex often does better end-to-end (plan + code in one pass). Claude Code can be less consistent on the planning step, but once you give it a solid plan it’s stellar at implementation.

I probably do better with Codex mostly due to familiarity; I’ve learned how it “thinks” and how to prompt it effectively. Opus 4.5 felt awkward to me for the same reason: I’m used to the GPT-5.x / Codex interaction style. My co-workers are the inverse: they adore Opus 4.5 and find Codex weird.

I've found it works wonderfully with 5.2. I think ChatGPT Plus is at the top of the weekly AI rolling wars. Most bang for the buck.