Comment by jeremyjh

2 hours ago

Even the strongest frontier model they used - GPT 5.2 - I would consider barely usable for agentic programming.

I’m not really interested in analysis of the weaknesses of such models because in my experience many weaknesses disappear entirely as models get stronger and reasoning effort is turned up. Especially if you tell them what you want them to do.

Also, it’s not surprising to learn that when more acceptance criteria are added the failure rate increases.

1 comment

jeremyjh

sigbottle 1 hour ago

Wait isn't gpt 5.2 good? Or is it not thinking / not codex? 5.2 was what sparked the late 2025 openai agentic programming revolution.