Comment by tomashubelbauer
5 hours ago
I am using GPT-5.2 Codex with reasoning set to high via OpenCode and Codex. When I ask it to fix an E2E test, it tells me it fixed it and prints a command I can run to verify the changes, instead of actually checking whether the test passes and looping until it does. This is just one example of how lazy/stupid the model is. It _is_ a skill issue, on the model's part.
Codex runs in a stupidly tight sandbox, and because of that it refuses to run anything.
But the same model, used through pi for example, is super smart, because pi just doesn't have ANY safeguards :D
I refuse to defend the 5.2-codex models. They are awful.