Comment by thephyber
2 days ago
LLMs are terrible at generating code for “less commonly used languages”. They require LOTS of data for high accuracy.
I describe it this way: they are good at interpolating within the data they were trained on, but terrible at extrapolating beyond it. I agree with the parent that the LLM-generated content isn’t novel, it’s just a rehash of two things it was trained on.
I have wasted quite a number of hours trying to use LLMs to write things for less common languages. Sure they can one-shot some impressive stuff in C#, Python, and JavaScript… but try working in Object Pascal: it’s non-obvious hallucination after non-obvious hallucination, presented confidently enough to make it difficult to see it’s complete garbage, so you waste a ton of time trying to polish a turd.
yet i’ve written a language using an LLM, of which there can be no prior knowledge since it’s new, and it can write that code just fine.
it’s all about context.
Creating a new paradigm is a problem with a lot more groundwork laid than working in an existing little-known paradigm. One is creating patterns which only have to be good to be correct. The other has to be correct to be good. They are completely different problems.