Comment by trimethylpurine
17 days ago
Couldn't we slowly add guardrails that eventually lead to code generation becoming more and more deterministic over time?
In my experience, Claude has gotten better with every version at producing uniform code output, especially where the architecture is clear and documented. Even more so in languages with built-in uniformity (Go, HTMX, SQL), where there are intentionally only one or two ways of doing things. In such environments, the output is nearly deterministic.
I've thought about this before and found that n-shot examples have a stronger influence on LLMs. In other words, in a repo with good code quality and architecture (which supplies good n-shot examples), and on a task with clear instructions and goals, an LLM's output seems reliable enough, which matches your view. And n-shot examples always beat relying on instruction following, which the article presents (as "specifications") as an approach to LLM productivity. So imo the idea you suggested is another possibility to weigh against the article's approach.