← Back to context

Comment by throw219080123

6 days ago

> consider the fact that both Gemini and OpenAI got gold medal level performance

Yet ChatGPT 5 imagines API functions that are not there and cannot figure out basic solutions even when pointed to the original source code of libraries on GitHub.

Which is why you run it in a coding agent loop using something like Codex CLI - then it doesn't matter if it imagines a non-existent function because it will correct itself when it tries to run the code.

Can you expand on "cannot figure out basic solutions even when pointed to the original source code of libraries on GitHub"? I have it do that all the time and it works really well for me (at least with modern "reasoning" models like GPT-5 and Claude 4.)

As a human, I sometimes write code that does not compile first try. This does not mean that I am stupid, only that I can make mistakes. And the development process has guardrails against me making mistakes, namely, running the compiler.

Yes. I don't see why these have to be mutually exclusive.

  • I feel they are mutually inclusive! I don’t think you can meaningfully create new things if you must always be 100% factually correct, because you might not know what correct is for the new thing.