Comment by TeriyakiBomb
8 hours ago
I don’t think it’s solvable. And I think Anthropic etc. know it. LLMs can only reconstitute things in their training data, and they are so hungry for data that they can’t do a good job in a long-lived codebase full of complexity and novelty. There’s never going to be enough similar code on the open internet.
> LLMs can only reconstitute things in their training data
Such as a 4D raytracing engine in Metal? Or integrating APIs for features first released months after their knowledge cut-off date?
LLMs have shown an ability to transfer "knowledge" and capabilities across domains, languages, and use-cases outside their training data.
Case in point: GPT-2 "learning" to translate English to French and vice versa despite non-English examples having been deliberately (and almost entirely) removed from the dataset.
Was this in the GPT-2 paper?
In "Language Models are Unsupervised Multitask Learners"[0]. Not sure whether it’s "the" GPT-2 paper.
3.7 Translation
> Performance on this task was surprising to us, since we deliberately removed non-English webpages from WebText as a filtering step. In order to confirm this, we ran a byte-level language detector on WebText which detected only 10MB of data in the French language […]
[0]: https://cdn.openai.com/better-language-models/language_model...