Comment by thephyber

2 days ago

LLMs are terrible at generating code for “less commonly used languages”. They require LOTS of data for high accuracy.

I describe it this way: they are good at interpolating from what data they were trained on, but terrible at extrapolating. I agree with the parent that the LLM-generated content isn’t novel, it’s just a rehash of two things it was trained on.

I have wasted quite a number of hours trying to use LLMs to write things for less common languages. Sure they can one-shot some impressive stuff in C#, Python, and JavaScript… but try working in Object Pascal: it’s non-obvious hallucination after non-obvious hallucination, presented confidently enough to make it difficult to see it’s complete garbage, so you waste a ton of time trying to polish a turd.

  • yet i’ve written a language using an LLM, of which there can be no prior knowledge since it’s new, and it can write that code just fine.

    it’s all about context.

    • Creating a new paradigm is a problem with a lot more groundwork laid than working in an existing little-known paradigm. One is creating patterns which only have to be good to be correct. The other has to be correct to be good. They are completely different problems.