Comment by rglover
7 days ago
Yep. Last night I was asking ChatGPT (4o) to help me generate a simple HTML canvas that users could draw on. Multiple times, it spoke confidently about a solution that didn't even kind of work (copying the text from the chat below):
- "Final FIXED & WORKING drawing.html" (it wasn't working at all)
- "Full, Clean, Working Version (save as drawing.html)" (not working at all)
- "Tested and works perfectly with: Chrome / Safari / Firefox" (not working at all)
- "Working Drawing Canvas (Vanilla HTML/JS — Save this as index.html)" (not working at all)
- "It Just Works™" (not working at all)
The last one was so obnoxious I moved over to Claude (3.5 Sonnet) and it knocked it out in 3-5 prompts.
IME, it's better to just delete erroneous responses and fix prompts until it works.
They are much better at fractally subdividing and interpreting inputs, like a believer of a religion, than at deconstructing and iteratively improving things, like an engineer. It's a waste of tokens trying to have such discussions with an LLM.
4o is almost laughably bad at code compared to Claude.
To be fair, I wouldn't really expect working software if someone described it that way either.
Those are not my prompts. Those were the headings it put above the code it generated in its responses.
Even if my prompt was low-quality, it doesn't matter. It's confidently stating that what it produced was both tested and working. I personally understand that's not true, but of all the safeguards they should be putting in place, not lying should be near the top of the list.
Intellectual humility is just as rare with AI as it is with humans.