Comment by d0mine
1 day ago
As I understand it, RL makes foundation models stupider (less capable, not more) but better at following instructions.