Comment by tombert
12 hours ago
I actually think that might actually be a good path forward.
I hate self-promotion but I posted my opinions on this last night https://blog.tombert.com/Posts/Technical/2026/04-April/Stop-...
The tl;dr of this is that I don't think that the code itself is what needs to be preserved, the prompt and chat is the actual important and useful thing here. At some point I think it makes more sense to fine tune the prompts to get increasingly more specific and just regenerate the the code based on that spec, and store that in Git.
> At some point I think it makes more sense to fine tune the prompts to get increasingly more specific and just regenerate the the code based on that spec, and store that in Git.
Generating code using a non-deterministic code generator is a bold strategy. Just gotta hope that your next pull of the code slot machine doesn’t introduce a bug or ten.
We're already merging code that has generated bugs from the slot machine. People aren't actually reading through 10,000 line pull requests most of the time, and people aren't really reviewing every line of code.
Given that, we should instead tune the prompts well enough to not leave things to chance. Write automated tests to make sure that inputs and outputs are ok, write your specs so specifically that there's no room for ambiguity. Test these things multiple times locally to make sure you're getting consistent results.
> Write automated tests to make sure that inputs and outputs are ok
Write them by hand or generate them and check them in? You can’t escape the non-determinism inherent in LLMs. Eventually something has to be locked in place, be it the application code or the test code. So you can’t just have the LLM generate tests from a spec dynamically either.
> write your specs so specifically that there's no room for ambiguity
Using English prose, well known for its lack of ambiguity. Even extremely detailed RFCs have historically left lots of room for debate about meaning and intention. That’s the problem with not using actual code to “encode” how the system functions.
I get where you’re coming from but I think it’s a flawed idea. Less flawed than checking in vibe-coded feature changes, but still flawed.
1 reply →
Infrastructure-as-code went through this exact cycle. Declarative specs were supposed to replace manual config, but Terraform still needs state files because specs drift from reality. Prompts have it worse since you can't even diff what changed between two generation runs.
This is actually a pretty good callout.
Observability into how a foundation model generated product arrived to that state is significantly more important than the underlying codebase, as it's the prompt context that is the architecture.
Yeah, I'm just a little tired of seeing these pull requests of multi-thousand-line pull requests where no one has actually looked at the code.
The solution people are coming up with now is using AI for code reviews and I have to ask "why involve Git at all then?". If AI is writing the code, testing the code, reviewing the code, and merging the code, then it seems to me that we can just remove these steps and simply PR the prompts themselves.
> why involve Git at all then?
I made a similar point 3 weeks ago. It wasn't very well received.
https://news.ycombinator.com/item?id=47411693
You don't actually need source control to be able to roll back to any particular version that was in use. A series of tarballs will let you do that.
The entire purpose of source control is to let you reason about change sets to help you make decisions about the direction that development (including bug fixes) will take.
If people are still using git but not really using it, are they doing so simply to take advantage of free resources such as github and test runners, or are they still using it because they don't want to admit to themselves that they've completely lost control?
4 replies →
Yep.
Also, the approach you described is what a number of AI for Code Review products are using under-the-hood, but human-in-the-loop is still recognized as critical.
It's the same way how written design docs and comments are significantly more valuable than uncommented and undocumented source.