Comment by pjmlp
18 days ago
Not if they are dynamic compilers.
Two runs of the same programme can produce different machine code from the JIT compiler, unless everything in the universe that happened during the first execution run gets replicated during the second.
That’s 100% correct, but importantly JIT compilers are built with the goal of outputting semantically equivalent instructions.
And the vast, vast majority of the time, adding a new line to the source code will not result in an unrecognizably different output.
With an LLM, changing one word can and frequently does cause the output to be 100% different. Literally no lines are the same in a diff. That’s such a vastly different scope of problem that comparing them is pointless.
No, but it will certainly result in a completely different sequence of machine code instructions, or not, depending on what that line actually does, what dynamic types it uses, how often it actually gets executed, the existence of vector units, and so forth.
Likewise, as long as the agent delivers the same outcome, e.g. an email is sent with a specific subject and body, the observed behaviour remains.
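As a rough sketch of what "same observed behaviour" could mean here (Python; the `Email` type, the list used as an outbox, and both `send_*` functions are hypothetical stand-ins, not any real agent or mail API), the check compares what was sent rather than the code that sent it:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    # Hypothetical record of an outgoing message.
    to: str
    subject: str
    body: str


def send_v1(outbox: list) -> None:
    # One generated implementation.
    outbox.append(Email("a@example.com", "Report", "See attached."))


def send_v2(outbox: list) -> None:
    # A textually different implementation with the same observable outcome.
    fields = {"to": "a@example.com", "subject": "Report", "body": "See attached."}
    outbox.append(Email(**fields))


def outcome(send) -> list:
    # Run an implementation against a fresh outbox and return what it sent.
    outbox = []
    send(outbox)
    return outbox


if __name__ == "__main__":
    # Compare outcomes, not source text.
    print(outcome(send_v1) == outcome(send_v2))
```

This only covers the fields the check looks at; anything it does not inspect can still differ between the two versions.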
The reason this works for compilers is that machine code is so low level that it’s possible for compiler authors to easily prove semantic equivalence between different sets of instructions.
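To make the equivalence point concrete, here is a minimal sketch (Python, with function names invented for illustration; real compilers reason about this symbolically rather than by enumeration). It checks three low-level ways of doubling an 8-bit value against each other over the entire input domain:

```python
MASK = 0xFF  # model an 8-bit unsigned register


def double_mul(x: int) -> int:
    # Doubling via multiply.
    return (x * 2) & MASK


def double_shift(x: int) -> int:
    # Doubling via left shift.
    return (x << 1) & MASK


def double_add(x: int) -> int:
    # Doubling via self-addition.
    return (x + x) & MASK


def equivalent(f, g, domain=range(256)) -> bool:
    # Exhaustively compare f and g on every input in the domain.
    return all(f(x) == g(x) for x in domain)


if __name__ == "__main__":
    print(equivalent(double_mul, double_shift))
    print(equivalent(double_shift, double_add))
```

The domain is small and fully specified, which is what makes the check tractable. An English prompt has no comparable finite domain to enumerate.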
That is not true for an English language prompt like “send an email with this specific subject and body”. There are so many implicit decisions that have to be made in that statement that will be different every time you regenerate the code.
English language specs will always have this ambiguity.
Do these compilers sometimes give correct instructions and sometimes incorrect instructions for the same higher level code, and it's considered an intrinsic part of the compiler that you just have to deal with? Because otherwise this argument is bunk.
Possibly, hence the discussion about improving security in JavaScript runtimes by completely disabling JIT execution.
https://microsoftedge.github.io/edgevr/posts/Super-Duper-Sec...
Also, the exact sequence of generated machine instructions depends on various factors: the same source can have various outputs depending on code execution, the hardware present, and heuristics.
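A toy model of that idea (Python; this is not how any real JIT is implemented, and every name here is hypothetical): the variant that gets picked depends on what was observed at run time, while the result stays the same.

```python
def add_generic(a, b):
    # Fallback path that works for any operand types supporting +.
    return a + b


def add_int_fast(a, b):
    # Specialised path a JIT might emit after observing only ints.
    return int.__add__(a, b)


def pick_variant(observed_types: set):
    # Choose the "compiled" variant from the run-time type profile.
    return add_int_fast if observed_types == {int} else add_generic


if __name__ == "__main__":
    first_run = pick_variant({int})          # profile saw only ints
    second_run = pick_variant({int, float})  # profile saw mixed types
    # Different code selected, same answer for the same inputs.
    print(first_run is second_run, first_run(2, 3) == second_run(2, 3))
```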
Sure, but surely you expect a `func add(a, b) { return a + b; }` to actually produce a + b, in whatever way it finds best? And if it doesn't, you can reproduce the error, and file a bug? And then someone can fix that bug?
they in fact do have bugs, yes, inescapably so (no one provides formal proofs for production level compilers)
Ok, but we treat them as bugs that we can reproduce and assume that they are solvable? We don't just assume that it's intrinsic to the compiler that it must have bugs, and that they will occur in random, non-deterministic ways?