Comment by veunes

7 hours ago

The idea of "saving prompts for reproducibility" is dead on arrival. LLMs are non-deterministic by nature. In a year, they'll deprecate this model's API, and the new version will spit out completely different code with entirely new bugs for the exact same prompt. A prompt isn't source code, it's just a temporary crutch for stochastic generation. And if I have to read 50 pages of schizophrenic dialogue with an LLM just to understand why a specific function exists, that PR gets an instant reject. The artifact is and always will be readable code plus a sane commit message. Dumping a log of hallucinations will only make debugging a nightmare when this Frankenstein inevitably falls apart in prod tbh

This is something that should be possible in principle, since the machines underneath are deterministic, it’s just a limitation of the implementation.