Comment by ekidd

7 hours ago

> Telling people “you must read all the code generated by an LLM” is definitely meaningful—but it is not at all moderate (so most people won’t do it).

I am honestly heartbroken to live in a world where reading the code is seen as an unreasonable ask by either students or by professional working programmers.

No one is complaining about having to read code. The complaints usually fall in one of these buckets:

- having your job responsibilities reduced to ONLY reviewing code.

- having to review code with unnecessarily high scrutiny because the LLM can hallucinate randomly and you as a human are responsible for the code even though you didn’t write it. In a traditional context when I review code, there’s a shared responsibility. Someone writes the code and another person reviews it. Now it’s entirely on the person who reviews it.

There may be other buckets; these are the ones that I hear often from other engineers.

What is heartbreaking about it? Code reviews were always the most sucky part of the job.

They are also among the more recent inventions; they are not "the traditional" programming at all. It is not like code review was the thing that attracted people to the profession or something that would be a more rewarding part of it.

Don't tell me you're reading all assembly generated by your local golang or javac compiler? And that you've read every line of code down the dependency tree for your node_modules?

I'm just upset that we are throwing away the original prompts for generated code in such a cavalier fashion.

  • The difference is that a compiler is a rigorous, (nearly) deterministic, heavily tested artifact built by expert humans. I have only encountered genuine code generation bugs in compilers twice in my career. And yes, those bugs I did trace to the assembly.

    An LLM prompt, even a huge one, is an incredibly vague document that leaves out most of the edge cases. And even Fable 5 happily ignores clear instructions in its prompt.

    Now, to be fair, I absolutely expect the buggy slop to win, and to drive out the people that either write their own code or at least read the output. This will, in turn, make customers less willing to spend money on software after they get burnt a few times by buggy garbage. I think this is pretty much inevitable once Fable returns. It's just too damn good at long time horizon tasks, generating far more mostly sorta working code than any human could reasonably read.

    • > The difference is that a compiler is a rigorous, (nearly) deterministic, heavily tested artifact built by expert humans.

      How do you know your compiler is rigorous and deterministic? Did you review all of its code?

  • I despise this retort that I see constantly. In no way, shape, or form is it remotely an accurate analogy. They are two completely different things, and it's dishonest to equate the two.

    • "A compiler is free to optimize...", on sufficiently basic prompting "make me a user address collection form that writes to a database table called 'registered_users'..."

      ...I agree it's not deterministic (neither are all your variations of C compilers, neither is Firefox v Safari v Chrome), but it probably Does Something(tm), and I might not want to peel back the covers and see how it used React, or Vue, VanillaJS, Qt, or GTK.

      It's upsetting that we are _committing the generated code_ rather than being able to use better and better optimizing compilers against the original prompt of: "make me a user registration form with database connection"

      ...I'm very with you on "it's not an accurate analogy", but I'm pointing out that there have been sea-changes already w.r.t. strict adherence to the generated code, or inclusion of left-pad v react libraries.

      ...and there have been corresponding productivity gains (debatable? ;-) when we've worked at these higher levels of abstraction.

      I'm personally still in the "blacksmith" stage of working with AI output (put it back in the fire and beat on it a bunch more times), and shudder in horror at the thought of maintaining (or paying to maintain) megabytes of hours of token generation that looks like source code.

      I'm hopeful that we'll eventually strip out some of the mud between the CPU and putting pixels on the screen (with the help of LLMs?), and that we'll still be able to understand and reason about the real "DAG" of what our programs are trying to do (e.g. declarative GUIs, kind of like we have declarative SQL), but there will always be a muddy middle part where the computer/compiler/LLM is doing something in between that _is_ sufficiently reliable for us to ignore those bits most of the time.