Comment by nickpsecurity

6 days ago

They mostly copy and mix patterns in the training data. Lots of repetition with variations on them is helpful for their generalization. Languages like Python and Go have a ton of code in Github, etc like that. I saw that using Python with GPT 3.5/4.

If it's a rarer language, the math doesn't do as good of a job on piles of random code. There's just not enough for it to learn from. I cant speak for Rust since I dont know the numbers but imagine it's much less than Python or Go.

I have seen some evidence, though, that harder languages are harder for them to code in. GPT 3.5 used to struggle with C++ for something that it could easily produce in Python. It could actually produce things in C more easily than C++. It makes sense, though, because there's both more context needed for correctness and more behavioral patterns to write it.

My solution, which I only prototyped in GPT due to leaving AI, was to use AI's to write code in languages like Python which non-AI tools transpiled to high-performance code in C++ or Rust. Think the Python to C++ compiler or maybe Nikita. Later, with hallucinations mitigated enough, add LLM's to those transpilers.

As a side benefit, it let's you sell a product accelerating or increasing predictability of applications in that language. That's a non-AI investment. There's a few companies doing that, too. So, one could sell to the AI crowd, the "language X in business" crowd, or both.