Comment by olooney

2 hours ago

> undeterministic abstraction

I've seen people argue that LLMs will just add another layer to the top of the compiler stack: instead of writing code, we'll use English, and run it through a pipeline:

    English -> Rust -> ASM -> Machine Code

What's one more layer, right?

But what the author says about agents being "undeterministic abstraction" shows why that will never work.

Compilers rely on a concept called observational equivalence[1] to define when two programs are basically the same; this allows them to make changes under the hood like unrolling a loop or targeting another machine. It turns out we know a lot about how to do this, and how not to, thanks to a logician named Frege, who worked out exactly which properties a "definition" needs in order to count as a definition rather than an axiom. In particular, it should be "eliminable" and "conservative"[2]. In plain language: a formal definition can always be eliminated by rote string substitution, and it doesn't smuggle in any extra assumptions. When we talk about things like syntactic sugar[3] or hygienic macros[4], we are basically applying Frege's two conditions to programming languages.
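To make the two conditions concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: a toy expression language with `add` and `mul` as the core, plus a hypothetical `square` form that is pure sugar. "Eliminable" means `desugar` can always rewrite `square` away by rote; "conservative" means core expressions pass through untouched, so the sugar adds no new facts about them.

```python
# Toy expression language (hypothetical, for illustration only).
# Core forms: ints, ("add", a, b), ("mul", a, b).
# Sugar:      ("square", a), defined as ("mul", a, a).

def desugar(expr):
    """Eliminate every 'square' by rote substitution."""
    if isinstance(expr, int):
        return expr
    op, *args = expr
    args = [desugar(a) for a in args]
    if op == "square":
        (a,) = args
        return ("mul", a, a)
    return (op, *args)

def evaluate(expr):
    """Evaluate a core expression; knows nothing about 'square'."""
    if isinstance(expr, int):
        return expr
    op, a, b = expr
    a, b = evaluate(a), evaluate(b)
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    raise ValueError(f"not a core form: {op}")

sugared = ("add", ("square", 3), 1)
core = desugar(sugared)          # ("add", ("mul", 3, 3), 1)
result = evaluate(core)          # 10
```

The expansion is a plain function: the same sugared input always yields the same core output, and you can throw the sugared version away without losing anything.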

LLMs are neither eliminable nor conservative. They cannot reliably or provably go from the prompts they are given to the source code they generate, and they make a ton of implicit assumptions when they do so. There can never be any equivalence between two "prompts" in the same way that two programs can be equivalent modulo some level of abstraction. The whole process of starting from prompts is wildly nondeterministic, which is why the only pattern that works is to generate the code, review it, and test it, and then check it in and use that as the starting point for the next prompt.
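For contrast, this is roughly what "equivalent modulo some level of abstraction" looks like for programs. The sketch below is my own toy example, not anything a real compiler emits: two integer-summing functions, one with the loop unrolled by four, that differ in their internals but cannot be told apart by any caller passing a list of ints. (I'm sticking to ints on purpose; with floats the regrouped additions could round differently.)

```python
def sum_rolled(xs):
    """Straightforward loop: one element per iteration."""
    total = 0
    for x in xs:
        total += x
    return total

def sum_unrolled(xs):
    """Same observable result, loop body unrolled by four."""
    total = 0
    n = len(xs)
    i = 0
    # Main loop handles four elements per iteration.
    while i + 4 <= n:
        total += xs[i] + xs[i + 1] + xs[i + 2] + xs[i + 3]
        i += 4
    # Tail loop handles the 0-3 leftover elements.
    while i < n:
        total += xs[i]
        i += 1
    return total

# Spot-check that both agree on lists of every length from 0 to 9.
agree = all(
    sum_rolled(list(range(n))) == sum_unrolled(list(range(n)))
    for n in range(10)
)
```

A spot-check like this is only evidence, not a proof, but for programs a proof is at least possible in principle. There is no analogous statement to even try to prove about two prompts.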

Which is not to say that LLMs aren't useful for code generation; they clearly are. But they don't provide an abstraction that lets us get away from the details of actual code, and thanks to Frege we can understand why they never will.

I can say all this with such confidence because I did once write a wild little Python library that used a bunch of introspection to actually do this[5]. And it absolutely did not work in practice beyond toy examples.

[1]: https://en.wikipedia.org/wiki/Observational_equivalence

[2]: https://plato.stanford.edu/entries/frege/#ProDef

[3]: https://en.wikipedia.org/wiki/Syntactic_sugar

[4]: https://en.wikipedia.org/wiki/Hygienic_macro

[5]: https://github.com/olooney/fourth_gen
