Comment by majormajor

11 hours ago

Yeah, I've been able to get great Python functions out of everything since the GPT-4 API in early-to-mid 2023.

It takes far less manual prompting now to get consistent output, to work well with other languages, etc. But if you watch the "thinking" logs, it looks an awful lot like the "prompt engineering" you'd do by hand back then. And the output for tricky cases still sometimes goes sideways in obviously naive ways. The most telling thing in my experience is all the grepping, looping, refining - it's not "I loaded all twenty of these files into context and have such a perfect understanding of every line's place in the big picture that I can suggest a perfect-the-first-time maximally-elegant modification." It's targeted and tactical. It's getting really good at its tactics for that stuff, though!

I can get more done now than a year ago because taking me out of the annoying part of that loop is very helpful.

But there's still a very curious gap: the tool that can quickly and easily recognize certain types of bugs if you ask it directly will also happily spit out those same sorts of bugs while writing the code. "Making up fake functions" doesn't make it to the user much anymore, but "not going to be robust in production but technically satisfies the prompt" still does, despite it "knowing better" when you ask it about the code five seconds later.
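
To make the "technically satisfies the prompt" category concrete, here's a made-up Python sketch (not from any real session; the function names and the "name,latency" line format are hypothetical, just for illustration). The first version is the kind of thing that answers "average the latency column" and passes on clean input. The second is roughly what you get after asking "what could go wrong with this in production?"

```python
def mean_latency_naive(lines):
    # Hypothetical example: fine on clean input, but raises on an empty
    # list, a blank line, or a non-numeric latency field.
    values = [float(line.split(",")[1]) for line in lines]
    return sum(values) / len(values)


def mean_latency_robust(lines):
    # Same task, but skips blank or malformed rows and returns None
    # when there is nothing valid to average.
    values = []
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) < 2:
            continue
        try:
            values.append(float(parts[1]))
        except ValueError:
            continue
    return sum(values) / len(values) if values else None
```

Both agree on something like `["a,10", "b,20"]`, which is exactly why the naive one looks done. Feed it a blank line or `"b,oops"` and only the second survives.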