Comment by sarchertech

15 hours ago

Compilers take a formal language and translate it to another formal language. In most cases there is no ambiguity, it’s deterministic, and most importantly it’s not chaotic.

That is, changing one word in the source code doesn't tend to produce vastly different output, or changes to completely unrelated code.
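The determinism point can be made concrete with Python's own compiler: identical source always compiles to byte-identical code, and a one-token edit perturbs the output only locally. A minimal sketch (the `bytecode` helper and the sample sources are illustrative, not from the comment above):

```python
def bytecode(src: str) -> bytes:
    # Compile the module, then pull out the bytecode of the nested function.
    module = compile(src, "<example>", "exec")
    fn_code = next(c for c in module.co_consts if hasattr(c, "co_code"))
    return fn_code.co_code

add_src = "def f(a, b):\n    return a + b\n"
sub_src = "def f(a, b):\n    return a - b\n"

# Deterministic: compiling the same source twice is byte-identical.
assert bytecode(add_src) == bytecode(add_src)

# Not chaotic: swapping one token (+ -> -) changes the output, but only
# in the immediate vicinity of the arithmetic instruction.
a, b = bytecode(add_src), bytecode(sub_src)
assert a != b and len(a) == len(b)
```

An LLM re-run on a slightly edited prompt gives no analogous locality guarantee, which is the contrast being drawn.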

Because the LLM is working from informal language, it is by necessity making thousands of small (and not so small) decisions about how to translate the prompt into code. There are far more decisions here than can reasonably be fixed in tests/specs. So any change to the prompt/spec is likely to result in unintended changes to observable behavior that users will notice and be confused by.
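That unpinned-decision problem can be sketched in a few lines: a sampling-based generator making even one free choice per decision point need not map the same prompt to the same output. A toy sketch, with a hypothetical `generate` standing in for an LLM (the candidate list is invented for illustration):

```python
import random

# Toy stand-in for an LLM: each call samples one of several equally
# plausible translations of the same informal prompt.
def generate(prompt: str, rng: random.Random) -> str:
    candidates = ["for-loop", "list-comprehension", "map()"]  # hypothetical
    return rng.choice(candidates)

rng = random.Random()  # no fixed seed, like sampling at temperature > 0
outputs = {generate("sum the squares of a list", rng) for _ in range(50)}
assert len(outputs) > 1  # same prompt, several different "implementations"
```

Any decision not nailed down by a test or spec is free to flip on the next generation, which is exactly what users experience as unrelated behavior changing.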

You’re right that programmers regularly churn out unoptimized code. But that’s very different from churning out a bubbling morass where every little thing that isn’t bolted down is constantly changing.

The ambiguity in translation from prompt to code means that the code is still the spec and needs to be understood. Combine that with prompt instability and we’ll be stuck understanding code for the foreseeable future.

The problem you describe is real, but I think it can be addressed by improving tooling without any improvement in available LLM technology.

  • How? Are you thinking of adversarial AI reviewers, runtime tests (also by AI), or something else?

    Guess I just don't see how you can take the human out of the loop and replace them with non-deterministic AIs and informal prompts / specs.

    • Humans are also non-deterministic, though. Why does replacing one non-deterministic actor with another matter here?

      I'm not particularly swayed by arguments of consciousness, whether AI is currently capable of "thinking", etc. Those may matter right now... but how long will they continue to matter for the vast majority of use cases?

      Generally speaking, my feeling is that most code doesn't need to be carefully crafted. We have error budgets for a reason, and AI is just shifting how we allocate them. It's only in certain roles where small mistakes can end your company - think hedge funds, aerospace, etc. - that there's safety in the non-determinism argument. And I say this as someone who is not in one of those roles. I don't think my job is safe for more than a couple of years at this point.

    • > adversarial AI reviewers, runtime tests (also by AI), or something else?

      And spec management, change previews, feedback capture at runtime, skill libraries, project scaffolding, task scoping analysis, etc.

      Right now this stuff is all rudimentary, DIY, or non-existent. As the more effective ways to use LLMs become clearer, I expect we'll see far more polished, tightly-integrated tooling built to use LLMs in those ways.
