Comment by mindwok
6 days ago
Not necessarily. If the RL objective is passing tests, then in the context of LLMs it means "correct", or at least "correct based on the tests".
Unfortunately, that doesn't solve the problem in any way. We don't have an oracle machine for testing software.
If we did, we could autogenerate code even without an LLM.
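To make the gap between "correct" and "correct based on the tests" concrete, here is a minimal sketch. The function, the test list, and the `passes_tests` helper are all made up for illustration; the test list just stands in for whatever finite suite the reward is computed from.

```python
def is_leap_year(year: int) -> bool:
    # Deliberately incomplete: ignores the century rules (divisible by 100 / 400).
    return year % 4 == 0


# Hypothetical finite test suite standing in for the RL reward signal.
TESTS = [(2024, True), (2023, False), (2000, True), (2019, False)]


def passes_tests(fn) -> bool:
    # "Correct based on the tests": agrees with every case in the suite.
    return all(fn(year) == expected for year, expected in TESTS)


print(passes_tests(is_leap_year))  # True: full reward
print(is_leap_year(1900))          # True, yet 1900 was not a leap year
```

The suite accepts the function, but the function is still wrong on an input the suite never checks. Closing that gap in general would take a checker that knows the right answer for every input, which is exactly the oracle we don't have.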