Comment by slashdave
1 day ago
RL is more than facts. Synthetic feedback is an obvious approach. Does the model suggest code that compiles and performs well?
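To make that concrete, here is a minimal sketch of what such a synthetic reward might look like. Everything in it is an assumption for illustration: the name `synthetic_reward`, the scoring weights, and the time budget are hypothetical and not taken from any particular RL pipeline.

```python
import time

def synthetic_reward(source, func_name, test_cases, time_budget_s=1.0):
    """Hypothetical reward: 0.0 if the candidate fails to compile, crashes,
    or returns a wrong answer; otherwise between 0.5 and 1.0, higher when faster."""
    # Step 1: does it compile at all?
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return 0.0

    # Step 2: does it run and produce the expected results?
    # NOTE: exec on untrusted model output is unsafe; a real setup would sandbox this.
    namespace = {}
    try:
        exec(code, namespace)
        func = namespace[func_name]
        start = time.perf_counter()
        for args, expected in test_cases:
            if func(*args) != expected:
                return 0.0
        elapsed = time.perf_counter() - start
    except Exception:
        return 0.0

    # Step 3: does it perform well? Scale a bonus by the unused share of the time budget.
    speed_bonus = max(0.0, 1.0 - elapsed / time_budget_s)
    return 0.5 + 0.5 * speed_bonus
```

No human labels are involved: the compiler, the test cases, and the clock supply the feedback signal.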