Comment by what-the-grump
15 days ago
I’ve seen ChatGPT generate bad English, and I’ve seen the logic/UI layer re-render the page; I think there is a simple spell checker that kicks in and tells the API to re-render and recheck.
I don’t believe for one second that LLMs reason, understand, or know anything.
There are plenty of times LLMs fail to generate correct sentences, and plenty of times they fail to generate correct words.
Around the time ChatGPT rolled out web search inside actions, you’d get really funky stuff back and could watch other code clearly trying to catch the runaway output.
o3 can be hot garbage if you ask it to expand a specific point inside a three-paragraph memo; the reasoning models perform very, very poorly when they are not summarizing.
There are times when the thing works like magic; other times, asking it to write me a PowerShell script that gets users by first and last name has it inventing commands and flags that don’t exist.
If the model ‘understood’ or ‘followed’ some sort of structure beyond parroting stuff it already knows about, it would be easy to spot that and guide it via prompts. That is not the case even with the most advanced models today.
It’s clear that LLMs work best at small, specific tasks that follow a well-established pattern defined in a strict language or API.
I’ve broken o3 trying to have it lift working Python code into formal Python code. How? The person who wrote the code didn’t exactly write it the way a developer would structure a program. 140 lines of basic grab-some-data, generate-a-table code broke the AI, even though it had the ‘informal’ solution right there in the prompt. So no, there is zero chance LLMs do more than predict.
And to be clear, it one-shotted a whole thing for me last night using the GitHub/Codex agent thing in VS Code, and probably saved me 30 minutes. But god forbid you start from a bad, edge-case, or poorly structured thing that doesn’t fit the mould.