Comment by HarHarVeryFunny
1 year ago
The thesis of the article is that the code keeps getting better because the model keeps getting told to do better, i.e. that it just needs more motivation/criticism. A logical conclusion of this, if it were true, is that the model would generate its best version on the first attempt if only we could motivate it to do so! I'm not sure what motivations/threats work best with LLMs; there was a time when offering to pay the LLM was popular, but "my grandma will die if you don't" was another popular genre of prompt.
If it's not clear, I disagree with the idea that ANY motivational prompt (we can disagree over which would be best to try) could get the model to produce a solution of the same quality as it produces when allowed to iterate on it a few times and make incremental improvements. I think it's being allowed to iterate that improves the solution, not the motivation to "do better!".
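For concreteness, here's a minimal sketch of the iterate-and-critique loop I mean. `call_llm` is a hypothetical stand-in for whatever chat-completion API you're using; none of these names come from the article.

    # Minimal sketch of an iterate-and-critique loop (hypothetical API).
    def call_llm(prompt: str) -> str:
        raise NotImplementedError("plug in your model API here")

    def iterate_on_code(task: str, rounds: int = 3) -> str:
        code = call_llm(f"Write code for: {task}")
        for _ in range(rounds):
            # The "do better" step: feed the previous attempt back in
            # so the model can make incremental improvements.
            code = call_llm(
                f"Here is an attempt at: {task}\n\n{code}\n\n"
                "Critique it and produce an improved version."
            )
        return code

Each round hands the model its own previous output to work against, which is information a single motivational prompt, however threatening, can't provide.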
> If it's not clear, I disagree with the idea that ANY motivational prompt (we can disagree over which would be best to try) could get the model to produce a solution of the same quality as it produces when allowed to iterate on it a few times and make incremental improvements.
OK, I agree, but wouldn't this be the case with people as well? If you can't iterate, the quality of your response will be limited no matter how motivated you are.
"Solve the Riemann hypothesis or your mother dies, but you can't write anything down on paper." Even if such a person could solve it, it's not happening under those conditions.
Iteration is probably the bulk of the improvement, but I think there's a "motivation" aspect as well.