Comment by wongarsu

3 months ago

There are likely some languages that are genuinely easier or more difficult for LLMs.

For example, consider Pascal or C89, which require variables to be declared up front: in a var section before the function body in Pascal, and at the start of a block in C89. That makes it much harder to generate code in a linear fashion. In Python you can just make up a variable the moment you decide you need it. In Pascal or C89 you would have to go back and change code you already wrote, which an LLM generating token by token can't easily do.

Something similar likely applies to strict static typing. Types make it easier to reason about existing code, but they make it harder to write new code if you can't go back and change your mind on a type choice.

Both could be helped if we selected tokens with a beam search: keep several candidate continuations alive and pick the path with the highest combined token probability, instead of greedily committing to one token at a time. But that's much more expensive, and I'm not sure anyone still does that with large-scale LLMs.
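
To make the greedy vs. beam search difference concrete, here is a minimal Python sketch. The TOY_MODEL table and its probabilities are made up for illustration and stand in for a real model's next-token distribution; nothing here reflects how any particular LLM stack decodes.

```python
import math

# Hypothetical toy "model": maps a prefix (tuple of tokens) to
# next-token probabilities. The numbers are invented for illustration.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def next_token_probs(prefix):
    """Look up the next-token distribution for a prefix."""
    return TOY_MODEL[tuple(prefix)]


def greedy_decode(steps):
    """Commit to the single most likely token at every step."""
    seq, logp = [], 0.0
    for _ in range(steps):
        probs = next_token_probs(seq)
        tok = max(probs, key=probs.get)
        seq.append(tok)
        logp += math.log(probs[tok])
    return seq, logp


def beam_search(steps, beam_width):
    """Keep the beam_width best partial sequences by total log-probability."""
    beams = [([], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, logp in beams:
            for tok, p in next_token_probs(seq).items():
                candidates.append((seq + [tok], logp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


if __name__ == "__main__":
    for label, (seq, logp) in [
        ("greedy", greedy_decode(2)),
        ("beam", beam_search(2, beam_width=2)),
    ]:
        print(label, seq, round(math.exp(logp), 4))
```

In this toy setup greedy decoding locks in "a" because it looks best locally, while the beam keeps "b" around long enough to discover that "b x" has the higher combined probability. The cost is that every step scores beam_width prefixes instead of one.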

pavlov 3 months ago

You could ask the LLM to first work out the solution in pseudocode, then translate to Pascal (or whatever). That way the variables are known after the initial pseudocode pass.

Human programmers also did this more often in those days than is probably the case now.
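
A rough Python sketch of that two-pass idea, assuming the pseudocode pass has already been reduced to a list of (variable, type, expression) steps. That intermediate format, the PSEUDOCODE sample, and the to_pascal helper are all hypothetical, just to show that the var block can be written once every variable is known:

```python
# Hypothetical intermediate form: each pseudocode step is
# (variable, pascal_type, expression). Invented for illustration.
PSEUDOCODE = [
    ("total", "Integer", "0"),
    ("count", "Integer", "10"),
    ("total", "Integer", "total + count"),
    ("mean", "Real", "total / count"),
]


def to_pascal(name, steps):
    """Emit a Pascal program with all declarations ahead of the body."""
    # Pass 1: collect every variable once, in first-use order.
    declared = {}
    for var, typ, _ in steps:
        declared.setdefault(var, typ)

    lines = [f"program {name};", "var"]
    lines += [f"  {var}: {typ};" for var, typ in declared.items()]
    lines.append("begin")
    # Pass 2: emit the statements in their original order.
    lines += [f"  {var} := {expr};" for var, _, expr in steps]
    lines.append("end.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_pascal("Demo", PSEUDOCODE))
```

The declarations end up at the top without anyone having to go back and edit earlier output, because the full variable list exists before the first line of Pascal is written.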