Comment by visarga
2 days ago
If you implement a project, keep the specs and tests, and then re-implement it, the exact way it was coded should not matter as long as it was well tested. So you don't need deterministic LLMs.
I think work with LLMs should be centered on testing, since it is how the agent is fenced off in a safe space where it can move without risk. Tests are the skin, specs are the bones, and the agent is the muscle.
I think relying on reading the code as the sole defense against errors is a grave mistake; it is "vibe testing". LGTM is something you cannot reproduce. Reading all the code is like walking the motorcycle.
The first time you generate the code, it calls the method doFoo(), and the test calls that method. The second time you generate the code, it calls the method fooify(), and the test breaks.
How do you propose to get around this, without a human specifying every class layout in detail?