Comment by kaba0
2 years ago
Parallelism. There might be actions that are not order-independent, and the state of the CPU might result in slightly different binaries, but all are correct.
Why does this matter though? Why does order of compilation result in a different binary?
Just a random, made-up example: say you want to compile an OOP language that has interfaces and implementations of them. You discover reachable implementations through static analysis, which is multi-threaded. You might discover implementations A, B, C in any order, but their methods get placed in the jump table based on that order. This trivially results in semantically equivalent, but not binary-identical, executables.
Of course there are better designs for this toy example, but binary reproducibility has historically not been the highest priority in most compiler infrastructures, and in some cases fixing it might cause a relatively big performance regression, or simply be too big a refactor.
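To make the toy example concrete, here is a rough Python sketch. Everything in it is hypothetical: `build_jump_table` and `build_jump_table_sorted` are made-up names, and `random.sample` only stands in for a multi-threaded analysis finishing in an arbitrary order. It is not how any real compiler lays out its tables.

```python
import random

def build_jump_table(discovered):
    """Assign each implementation a slot in the order it was discovered."""
    return {name: slot for slot, name in enumerate(discovered)}

def build_jump_table_sorted(discovered):
    """Sort first, so the slot layout no longer depends on discovery order."""
    return {name: slot for slot, name in enumerate(sorted(discovered))}

impls = ["A", "B", "C"]

# Stand-in for a multi-threaded analysis finishing in an arbitrary order.
run1 = random.sample(impls, k=len(impls))
run2 = random.sample(impls, k=len(impls))

# Discovery-order tables may differ between runs; sorted tables never do.
print(build_jump_table(run1), build_jump_table(run2))
print(build_jump_table_sorted(run1) == build_jump_table_sorted(run2))
```

Both layouts dispatch to the same methods, so either binary would behave the same. The sorted variant is the "better design", at the cost of an extra pass over the discovered set.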
Because order of completion of the parallel tasks is not guaranteed, if all tasks write to the same file you might get a different result each time.
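A minimal sketch of that effect, assuming Python's `concurrent.futures`. `compile_unit` is a hypothetical stand-in for real work, and the two `build_*` helpers are names made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def compile_unit(name):
    """Hypothetical stand-in for compiling one translation unit."""
    return f"<code for {name}>"

def build_in_completion_order(units):
    """Collect each result as soon as its task finishes: order can vary per run."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(compile_unit, u) for u in units]
        return [f.result() for f in as_completed(futures)]

def build_in_input_order(units):
    """Executor.map yields results in input order, whatever the finish order was."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(compile_unit, units))

units = ["a", "b", "c", "d"]
print(build_in_completion_order(units))  # same pieces, order not guaranteed
print(build_in_input_order(units))       # same pieces, always in input order
```

The first version is the "everyone writes to the same file as they finish" case. The second still runs in parallel but fixes the output order, which is usually what a reproducible build needs.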
> There might be actions that are not order-independent, and the state of the CPU might result in slightly different binaries, but all are correct.
Well no: that's really the thing reproducible packages are showing: there's only one correct binary.
And it's the one that's 100% reproducible.
I'd even say that that's the whole point: there's only one correct binary.
I'll die on the hill that if different binaries are "all correct", then none are: for me they're all useless if they're not reproducible.
And it looks like people working on entire .iso being fully bit-for-bit reproducible are willing to die on that hill too.
"Correct" does not mean "reproducible" just because you think lowly of irreproducible builds.
A binary consisting of foo.o and bar.o is correct whether foo.o was linked before bar.o or vice versa, provided that both foo.o and bar.o were compiled correctly.
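As a toy illustration of that claim (not a real linker; `link`, `behaviour`, and the byte contents of `foo_o` and `bar_o` are all made up), the two link orders below give different images and hashes, while every symbol still maps to the same code:

```python
import hashlib

# Hypothetical object files: symbol name -> machine code bytes.
foo_o = {"foo": b"\x01\x02"}
bar_o = {"bar": b"\x03\x04"}

def link(objects):
    """Toy linker: lay out symbols in the order the objects are given."""
    image, symbols = b"", {}
    for obj in objects:
        for name, code in obj.items():
            symbols[name] = (len(image), code)  # (offset, contents)
            image += code
    return image, symbols

def behaviour(symbols):
    """What each symbol does, ignoring where it ended up in the image."""
    return {name: code for name, (_, code) in symbols.items()}

img1, sym1 = link([foo_o, bar_o])
img2, sym2 = link([bar_o, foo_o])

print(hashlib.sha256(img1).hexdigest() == hashlib.sha256(img2).hexdigest())  # False
print(behaviour(sym1) == behaviour(sym2))  # True
```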
See my reply to the sibling post — binary reproducibility is not the end goal. It is an important property, and I do agree that most compiler toolchains should strive for that, but e.g. it might not be a priority for, say, a JIT compiler.