Comment by ekropotin

18 days ago

> IDK how everyone else feels about it, but a non-deterministic “compiler” is the last thing I need.

I may have bad news for you on how compilers typically work.

  • The difference is that what most languages compile to is much much more stable than what is produced by running a spec through an LLM.

    A language or a library might change the implementation of a sorting algorithm once in a few years. An LLM is likely to do it every time you regenerate the code.

    It’s not just a matter of non-determinism either, but about how chaotic LLMs are. Compilers can produce different machine code with slightly different inputs, but it’s nothing compared to how wildly different LLM output is with very small differences in input. Adding a single word to your spec file can cause the final code to be far more unrecognizably different than adding a new line to a C file.

    If you are only checking in the spec, which is the logical conclusion of “this is the new high level language”, then every time you regenerate your code, all of the thousands upon thousands of unspecified implementation details will change.

    Oops, I didn’t think I needed to specify what’s going to happen when a user tries to do C before A but after B. Yesterday it didn’t seem to do anything, but today it resets their account balance to $0. But after the deployment 5 minutes ago it seems to be fixed.

    Sometimes users dragging a box across the screen will see the box disappear behind other boxes. I can’t reproduce it though.

    I changed one word in my spec and now there’s an extra 500k LOC to implement a hidden asteroids game on the home page that uses 100% of every visitor’s CPU.

    This kind of stuff happens now, but the scale at which it will happen if you actually use LLMs as a high level language is unimaginable. The chaos of all the little unspecified implementation details constantly shifting is just insane to contemplate as a user or a maintainer.

    • > A language or a library might change the implementation of a sorting algorithm once in a few years.

      I think GP was referring to heuristics and PGO.


  • Deterministic compilation, aka reproducible builds, has been a basic software engineering concept and goal for 40+ years. Perhaps you could provide some examples of compilers that produce non-deterministic output along with your bad news.

  • Compilers aim to be fully deterministic. The biggest source of nondeterminism when building software isn't the compiler itself, but build systems invoking the compiler nondeterministically (because iterating the files in a directory isn't necessarily deterministic across different machines).
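A minimal sketch of that directory-iteration point, in Python (illustrative, not any particular build system): `os.listdir` returns entries in whatever order the filesystem yields them, which can differ across machines, so a build that feeds files to the compiler or linker in that order can produce different binaries; sorting the listing restores a reproducible input order.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as src:
    # Create some source files; creation order is not listing order.
    for name in ("zeta.c", "alpha.c", "mid.c"):
        open(os.path.join(src, name), "w").close()

    unordered = os.listdir(src)        # filesystem-dependent order
    ordered = sorted(os.listdir(src))  # reproducible order

    # Only the sorted listing is guaranteed to be identical everywhere.
    assert ordered == ["alpha.c", "mid.c", "zeta.c"]
```

Reproducible-builds tooling applies the same idea at every layer: archive entries, linker inputs, and generated file lists are sorted before use.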

  • If you are referring to timestamps, buildids, comptime environments, hardwired heuristics for optimization, or even bugs in compilers -- those are not the same kind of non-determinism as in LLMs. The former ones can be mitigated by long-standing practices of reproducible builds, while the latter is intrinsic to LLMs if they are meant to be more useful than a voice recorder.
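For the timestamp case specifically, the reproducible-builds community standardized the `SOURCE_DATE_EPOCH` environment variable: a tool that honors it embeds the same instant on every rebuild instead of the current wall-clock time. A minimal sketch in Python (the `build_timestamp` helper is hypothetical):

```python
import os
import time

def build_timestamp() -> int:
    # Honor the SOURCE_DATE_EPOCH convention from reproducible-builds.org:
    # when the variable is set, every rebuild embeds the same timestamp.
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    return int(epoch) if epoch is not None else int(time.time())

# With the variable pinned, two builds agree on the embedded time.
os.environ["SOURCE_DATE_EPOCH"] = "1700000000"
assert build_timestamp() == 1700000000
```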

  • You'll need to share with the class because compilers are pretty damn deterministic.

    • Only mostly, and only relatively recently. The first compiler is generally attributed to Grace Hopper in 1952. 2013 is when Debian kicked off their program to do bit-for-bit reproducible builds. Thirteen years later, NixOS can maybe produce bit-for-bit identical builds if you treat her really well. We don't look into the details because it just works and we trust it to work, but because computers are all distributed systems these days, getting a bit-for-bit identical build out of the compiler is actually freaking hard. We just trust compilers to work well enough (and they do), but they've had three quarters of a century to get there.

    • Not if they are dynamic compilers.

      Two runs of the same programme can produce different machine code from the JIT compiler, unless everything in the universe that happened in the first execution run gets replicated during the second execution.


  • Compilers are about 10 orders of magnitude more deterministic than LLMs, if not more.

    • Currently it’s about closing that gap.

      And 10 orders is an optimistic value: right now, LLMs are random, with some probability of solving the real problem (and I mean real systems, not a PoC landing page or a 2-3 model CRUD). Every month they are getting visibly better, of course.

      The “old” world may output different assembly or bytecode every time, but running it will result in the same outputs - maybe slower, maybe faster. LLMs now, for the same prompt, can generate a working solution, a non-working one, or one that fakes it.

      As always - what a time to be alive!

I have used them everywhere since the late 1990s; it is called a managed runtime.

  • That is a completely different category. I've never experienced a logic error due to a managed runtime and only once or twice ever due to a C++ compiler.

      I certainly have experienced crashes due to JIT miscompilations, even though it was a while back, on WebSphere with IBM's Java implementation.

      Also, it is almost impossible to guarantee that two runs of an application will trigger the same machine code output, unless the JIT is either very dumb in its heuristics and PGO analysis, or one got lucky enough to reproduce the same computation environment.


I think it's technically possible to achieve determinism with LLM output. LLM makers typically serve them non-deterministically by default, but non-determinism is not inherent to the models.
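A toy sketch of why (pure Python, not any vendor's API): with greedy decoding the highest-logit token wins every time, while temperature sampling draws from the softmax distribution, so repeated runs can differ unless the RNG is seeded. (In practice, real serving stacks add further nondeterminism from batching and floating-point reduction order, so "temperature 0" alone does not always suffice.)

```python
import math
import random

logits = {"sort": 2.1, "sorted": 1.9, "qsort": 0.3}  # toy next-token scores

def greedy(scores):
    # Deterministic: the highest-scoring token wins on every run.
    return max(scores, key=scores.get)

def sample(scores, temperature, rng):
    # Stochastic for temperature > 0: draw from the softmax distribution.
    weights = [math.exp(v / temperature) for v in scores.values()]
    return rng.choices(list(scores), weights=weights)[0]

# Greedy decoding never varies...
assert all(greedy(logits) == "sort" for _ in range(100))
# ...and even sampling is reproducible if the RNG seed is fixed.
assert sample(logits, 1.0, random.Random(7)) == sample(logits, 1.0, random.Random(7))
```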

A compiler that can turn cash into improved code without round-tripping a human is very cool, though. As those steps get longer and succeed more often in more difficult circumstances, what it means to be a software engineer changes a lot.

  • LLMs may occasionally turn bad code into better code but letting them loose on “good” or even “good enough” code is not always likely to make it “better”.