Comment by spacebanana7

5 days ago

> Presumably there is a lot more public info about, and code in Javascript and Python, hence this "preference"

This likely plays a major - probably dominant - role.

It's interesting to think about other factors too, though. The relatively concise syntax of those languages might make them easier for LLMs to work with. If resources are in any way token-limited, then reading and writing Spring Boot apps is going to be burdensome.

Those languages also have a lot of single-file applications, which might make them easier for LLMs to learn from. So much of iOS development, for example, is split across many files, and I wonder if that affects the quality of the training data.

Also worth considering: there's a wider range of "acceptable" output programs when dealing with such forgiving scripting languages. If asked to output C, there are loads of finicky bits it could mess up: pointer accesses, writing past the end of an array, using uninitialized memory, using a value it already freed, missing a free, etc. These are all things the language runtime handles in Python or JS, so with C there's a higher cognitive load the model needs to take on.
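
To make that last point concrete, here's a rough sketch of what "the runtime handles it" looks like on the Python side. The function names are made up for illustration. The mistakes that would be undefined behavior in C turn into ordinary, catchable exceptions here:

```python
def write_past_end(size: int) -> str:
    """Write one slot past the end of a list and report what happens."""
    buf = [0] * size
    try:
        buf[size] = 1  # the equivalent write in C is undefined behavior
    except IndexError as exc:
        return type(exc).__name__
    return "no error"


def read_uninitialized(assign_first: bool) -> str:
    """Read a local variable that may never have been assigned."""
    if assign_first:
        total = 0
    try:
        return str(total)  # in C this would read whatever garbage is in memory
    except UnboundLocalError as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(write_past_end(4))
    print(read_uninitialized(False))
    # No free() anywhere: buf is reclaimed by the runtime once it is
    # unreachable, so double-free and missing-free bugs can't be written.
```

So a sloppy Python program tends to fail loudly at the bad line, while the same slip in C can compile, run, and quietly corrupt memory.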