Comment by ifwinterco

4 hours ago

On benchmarks, GPT 5.2 was roughly equivalent to Opus 4.5, but most people who've used both for SWE stuff would say that Opus 4.5 is/was noticeably better.

7 comments

CraigJPerry 2 hours ago

There's an extended thinking mode for GPT 5.2; I forget the name of it right at this minute. It's super slow: a 3-minute Opus 4.5 prompt takes circa 12 minutes to complete in 5.2 on that super extended thinking mode. But it is not a close race in terms of results: GPT 5.2 wins by a handy margin in that mode. It's just too slow to be usable interactively, though.

ifwinterco 1 hour ago

Interesting, sounds like I definitely need to give the GPT models another proper go based on this discussion.

georgeven 3 hours ago

Interesting. Everyone in my circle said the opposite.

krzyk 3 hours ago

It probably depends on programming language and expectations.
- ifwinterco 2 hours ago
  
  This is mostly Python/TS for me... what Jonathan Blow would probably call not "real programming", but it pays the bills.
  They can both write fairly good idiomatic code, but in my experience Opus 4.5 is better at understanding overall project structure etc. without prompting. It just does things correctly first time more often than Codex. I still don't trust it, obviously, but out of all LLMs it's the closest to actually starting to earn my trust.

elAhmo 3 hours ago

I mostly used Sonnet/Opus 4.x over the past months, but 5.2 Codex seemed to be on par or better for my use case in the past month. I tried a few models here and there and always went back to Claude, but with 5.2 Codex I felt for the first time that it was very competitive, if not better.

Curious to see how things will be with 5.3 and 4.6.

SatvikBeri 1 hour ago

I pretty consistently heard people say Codex was much slower but produced better results, making it better for long-running work in the background, and worse for more interactive development.