Comment by simonw

3 months ago

Code is one of the few applications of LLMs where they DO have a mechanism for verifying if what they produced is correct: they can write code, run that code, look at the output and iterate in a loop until it does what it's supposed to do.

But that requires code that is runnable and testable in isolation otherwise there are all sorts issues with that approach (aside from the obvious one of scalability)

It also assumes they "understand" enough to be able to extract the correct output to test against.