Comment by justinlivi

16 hours ago

I find myself spending on average more time in LLM review/resolution loops than it would take for me to write the code by hand. Partially because once I'm in the flow I write very quickly and the code pours out, sometimes faster than I can write. But also because the LLM code on the first few tries is generally really bad. What I find interesting though is that spending the time to personally review and direct the LLM through several iterations of review and revision on average results in higher quality code written in about the same time as I would have written it. This might be particular to me, but seeing several iterations of someone else's code helps me better understand holistically my objective as opposed to whatever happens to come out of my flow-state consciousness.

[flagged]

  • I manage a component of an internal compute product which serves ~a billion idempotent use-cases per quarter and I can confidently tell you that you're incorrect.

    What I haven't been able to teach AI is the full distributed nature of the system, how we progressively roll out each service (about 30 unique ones) when we push updates -- and how to read, write, and review my code while keeping all of this in-context (because believe me, if it's not in-context, it is useless to me). Don't get me started on all the containers, K8s configs, endpoint naming conventions...

    My entire stack covers bare metal, virtualisation infrastructure, storage infrastructure... I could go on. At a certain scale, what matters is not how fast you write something, but whether what you're writing is bulletproof.

  • What an arrogant comment. You have no idea what kind of software the parent commenter is working on. If you think all software can be handled by AI then I'm afraid you're the one who doesn't know what they're doing.

  • Or, if we consider the fact that an LLM’s performance depends on the task’s similarity to others in the training set, it could be that one person is doing a fairly novel task and another is doing something very well represented in online code.

If your AI is writing bad code then you need to change your AI. No current high-end AI should be producing bad code.

  • This sounds like a subjective assessment. I counter with the opinion that most LLMs write technically correct, but bad code. When I read it, it makes me want to gag or poke my eyes out. I spend a lot of time wondering what kind of person would write it like that, then I realize it’s an LLM.

  • The tool is important, but so is the way you use it. I've seen small LLMs produce good code and frontier LLMs produce poor quality code. It depends on context.

This feels like a comment from 2 years ago; by now the most modern models write much better code than humans can in much shorter time.

But if you're not used to code reviewing, it can certainly help to still write it yourself.

  • > models write much better code than humans can

    What? I think either this is exaggerating model capabilities or you haven't seen much good code from humans.

    My experience is that my colleagues who have bought into model-first development have regressed in the quality of the PRs they send out. LLMs are not better coders, in my experience. They lack holistic understanding and often need course correction for that reason. At least in medium to highly complicated systems.

    • Over my time in the industry I've become increasingly convinced most people haven't seen what good human programmers are like. Otherwise we wouldn't have the popularity of things like Scrum, Clean Code (the book, not the concept), etc.

      I was lucky enough to see some good teams when I was a student (both at Berkeley itself and by interning at Jane Street), and it totally changed my intuition for what good programming is like. It's gotten to the point where I'm convinced there are two incommensurable paradigms in programming, and we're constantly talking past each other.

      Like, if you have an ongoing project where the codebase has grown over time, do you expect it to get easier to do things or harder? I've worked on projects where it's obvious that things are always getting harder (old code is hard to change, you have to deal with lots of complexity and edge cases and workarounds). I've also worked in codebases where things got easier over time: you get better abstractions, more libraries, more capabilities. That can be a lot of fun; you think of a new thing to try, and you have the pieces to just do it.

      Or another point of comparison: do people think that writing good code slows you down (so it only makes sense to avoid bugs), or do people think that writing good code lets you move faster? I've talked to people for whom one or the other is totally and obviously true. (I'm solidly in the second camp myself.)

      But the surprising thing was how "obvious" the dynamic was in both cases, even though the two cases are exact opposites of each other! If you ask one group or the other they'd just tell you that, well, that's simply how programming works. Of course things get (easier|harder) over time. That's built into people's fundamental understanding of what programming is and how to do it. And that's exactly what I mean by incommensurable paradigms.

      Anyway, this is a bit of a tangent from the main discussion, but it's something I've been thinking about a bunch over the last few years, partly inspired by the advent of AI-powered programming, but largely thanks to experiencing some very different projects and teams...

    • Fair enough; I'm talking about relatively "small" snippets, where reasoning algorithms can quickly give you a better result than you would get if you gave a mediocre or even senior developer an hour.

      Managing a complete codebase, making architectural decisions, designing business logic; that is not something you should let your agent do.

      But I see that as a different task from "coding".