Comment by tptacek

15 hours ago

Median-quality code is extraordinarily valuable. It is most of the load-bearing code people actually ship. What's almost certainly happening here is that you and I have differing definitions of "median-quality" commercial code.

I'm pretty sure that if we triangle-tested (say) Go code from 'jerf against Gemini 2.5's Go output for the same substantial project (say, 2,000 lines) --- not whatever Gemini's initial spew is, but a final product where Gemini is the author of 80+% of the lines --- you would not be able to pick the human code out from the LLM code.
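
For readers unfamiliar with the term: a triangle test is the standard odd-one-out discrimination protocol. A judge sees three samples, two drawn from one source and one from the other, and tries to identify the odd one; if the judge can't beat the one-in-three chance rate over repeated trials, the sources are indistinguishable. The Go sketch below is purely illustrative of how such a blind trial could be assembled; every name in it is hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// triangleTrial assembles one blind trial: two samples from one source,
// one from the other, shuffled so position carries no information.
// (A real test would draw distinct samples and run many trials.)
func triangleTrial(human, llm []string, rng *rand.Rand) (samples [3]string, oddOneOut int) {
	// Randomly decide which source supplies the matching pair.
	pair, single := human, llm
	if rng.Intn(2) == 0 {
		pair, single = llm, human
	}
	samples[0] = pair[rng.Intn(len(pair))]
	samples[1] = pair[rng.Intn(len(pair))]
	samples[2] = single[rng.Intn(len(single))]

	// Shuffle the three samples while tracking where the odd one lands.
	oddOneOut = 2
	rng.Shuffle(3, func(i, j int) {
		samples[i], samples[j] = samples[j], samples[i]
		switch oddOneOut {
		case i:
			oddOneOut = j
		case j:
			oddOneOut = i
		}
	})
	return samples, oddOneOut
}

func main() {
	rng := rand.New(rand.NewSource(1))
	human := []string{"human snippet A", "human snippet B"}
	llm := []string{"llm snippet A", "llm snippet B"}
	s, odd := triangleTrial(human, llm, rng)
	fmt.Printf("judge sees: %v; odd one out is #%d\n", s, odd)
}
```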

> This is probably true. I'm using your "median-quality" label, but that would be a generous description of the code I'm getting from LLMs.
>
> I'm getting median-quality junior code. If you're getting median-quality commercial code, then you are speaking better LLMish than I.

  • A couple of prompt/edit cycles into a Cursor project, Gemini's initial output is better-than-junior code, but still not code I would merge. But you review that code, spot the things you don't like (missed idioms, too much repetition, weird organization), and call them out; Gemini goes and fixes them. The result of that process is code that I would merge (or that would pass a code review). (See the sketch below for what one such review fix looks like.)

    What I feel like I keep seeing is people who see that initial LLM code "proposal", don't accept it (reasonably!), and end the process right there. But that's not how coding with an LLM works.
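
To make that review cycle concrete, here is a hypothetical before-and-after in Go. Neither function comes from a real Gemini session: the first shows the kind of repetition a reviewer would call out, and the second shows the sort of revision that typically comes back after the note.

```go
package main

import (
	"errors"
	"fmt"
)

// User is a stand-in type for this illustration.
type User struct {
	Name, Email, Phone string
}

// validateUserDraft is the kind of first-pass output a model tends to
// produce: correct, but repetitive in a way a reviewer would flag.
func validateUserDraft(u *User) error {
	if u.Name == "" {
		return errors.New("name is empty")
	}
	if u.Email == "" {
		return errors.New("email is empty")
	}
	if u.Phone == "" {
		return errors.New("phone is empty")
	}
	return nil
}

// validateUser is what comes back after review notes like "collapse the
// repetition" and "give the errors some context".
func validateUser(u *User) error {
	fields := []struct{ name, value string }{
		{"name", u.Name},
		{"email", u.Email},
		{"phone", u.Phone},
	}
	for _, f := range fields {
		if f.value == "" {
			return fmt.Errorf("validate user: %s is empty", f.name)
		}
	}
	return nil
}

func main() {
	u := &User{Name: "Ada"}
	fmt.Println(validateUserDraft(u)) // email is empty
	fmt.Println(validateUser(u))      // validate user: email is empty
}
```

The point of the sketch is the process, not the snippet: the mergeable version is a product of the review loop, not of the first prompt.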