Comment by canadaduane
3 hours ago
I'm curious what you think of as "the mean". I consider the input training set for an LLM to contain its mean. My hypothesis would be: an LLM alone cannot consistently produce code above the mean quality of the code it was trained on.
The training input doesn't matter much. Besides, it's already skewed toward code that was submitted after much trial and error by a dev locally, and possibly reviewed. It's also heavily biased toward open source projects, not the crap internal tools no LLM has ever seen.
There's more to the quality of the output: the prompts, the quality of the codebase (from which the LLMs learn), the documentation/harnessing, the feedback an engineer provides while reviewing multiple times (in the chat, in the diff, in the PR), and so on.