Comment by burntoutgray
17 hours ago
YES! The session becomes the source code.
Back in the dark ages, you'd "cc -S hello.c" to check the assembler source. Over time we stopped doing that, and hello.c became the originating artefact. On the same basis, the session becomes the originating artefact.
I'm not sure this analogy holds, for two reasons. First, even in the best case, chain-of-thought transcripts don't reliably tell you what the agent is doing and why it's doing it. Second, if you're dealing with a malicious actor, the transcript may have no relation to the code they're submitting.
The reason you don't have to look at assembly is that the .c file is essentially a 100% reliable and unambiguous spec of what the assembly will look like, and you will be generating the assembly from that .c file as part of the build process anyway. I don't see how this works here. It adds a lengthy artifact without lessening the need for a code review. It may be useful for investigations in enterprise settings, but in the OSS ecosystem?...
Also, people using AI coding tools to submit patches to open-source projects are weirdly hesitant to disclose that.
This is only true if an LLM session produced deterministic output, which is not the case. This whole “LLMs are the new compiler” argument doesn’t hold water.
"Deterministic" is not the issue either; it's that small changes to the input will cause unknown changes in the output. You might theoretically achieve determinism and reproducibility for the exact same input (seeding the random number generators etc.), but even if you formulate your request just a little differently, by changing punctuation for example, you'll get an entirely different output.
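To make the distinction concrete, here is a minimal sketch of "reproducible but not stable". It is a toy stand-in, not a real model: `toy_generate`, `VOCAB`, and the prompts are all hypothetical, and the "sampling" is just a seeded pseudo-random generator keyed on the exact prompt text. The only point is that fixing the seed buys you repeatability for identical input, while a one-character change to the input gives you an unrelated output.

```python
import hashlib
import random

# Hypothetical vocabulary for the toy "model"; nothing here is a real LLM.
VOCAB = ["parse", "retry", "cache", "lock", "queue",
         "merge", "split", "flush", "index", "batch"]


def toy_generate(prompt: str, seed: int, length: int = 12) -> list[str]:
    """Return a pseudo-random word sequence determined by (seed, exact prompt text)."""
    # Hash the seed together with the prompt so the result depends on every character.
    digest = hashlib.sha256(f"{seed}:{prompt}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [rng.choice(VOCAB) for _ in range(length)]


a = toy_generate("Fix the parser bug.", seed=42)
b = toy_generate("Fix the parser bug.", seed=42)  # identical prompt and seed
c = toy_generate("Fix the parser bug!", seed=42)  # only the punctuation differs

print("same prompt, same seed  ->", a == b)
print("punctuation changed     ->", a == c)
```

The first comparison is always equal, which is the "determinism" you can get by pinning the seed. The second shows why that is not the property a compiler user actually relies on.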
With compilers, the rules are clear, e.g. if you replace variable names with different ones, the program will still do the same thing. If you add spaces in places where whitespace doesn't matter, like around operators, the resulting behavior will still be the same. If you change one function's definition, it doesn't affect another function's definition. (I'm sure you can nitpick this with some edge case, but that's not the point; it can overwhelmingly be relied upon in this way in day-to-day work.)
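Those invariants can be checked mechanically. The sketch below uses Python's own parser and compiler as a stand-in for cc (an assumption made for brevity; the same idea applies to C), and the `area` function and the `load` helper are made up for illustration. It checks that respacing the source leaves the syntax tree unchanged, and that renaming the parameters leaves the behaviour unchanged.

```python
import ast

# Three spellings of the same hypothetical function.
original = "def area(width, height):\n    return width * height\n"
respaced = "def area(width,height):\n    return width*height\n"
renamed = "def area(w, h):\n    return w * h\n"


def load(source: str):
    """Compile the source and return the `area` function it defines."""
    namespace = {}
    exec(compile(source, "<sketch>", "exec"), namespace)
    return namespace["area"]


# ast.dump omits line/column positions by default, so whitespace-only
# edits produce an identical dump.
same_tree = ast.dump(ast.parse(original)) == ast.dump(ast.parse(respaced))

# Renaming changes the tree (the names differ) but not what the code computes.
inputs = [(0, 0), (2, 3), (7, -4), (1.5, 2)]
same_behaviour = all(load(original)(*p) == load(renamed)(*p) for p in inputs)

print("whitespace change keeps the syntax tree:", same_tree)
print("renaming keeps the behaviour:", same_behaviour)
```

There is no comparable check you can run on a prompt: nothing guarantees that two phrasings of the same request yield equivalent code.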
cc was deterministic: you could be confident that the same code produced the same assembly each time you ran it.
That is very much not the case with LLMs.
LLMs are non-deterministic: you would end up with a different output even if you pasted the same conversation in, and even if the model were identical at the time you tried to reproduce it, which gets less likely as time passes.
Also, why would you need to reproduce it? You have the code. Almost any modification to said code would benefit from a fresh context and refined prompt.
Storing the actual full context of a thinking agent is asinine; it's full of busy work. At best, if you want to preserve the "reason" for the commit's contents, maybe you could summarise the context.
Other than that, I see no reason to store the whole context per commit.