Comment by MagicMoonlight
5 days ago
They’re definitely RL training the models on the pelican test. They patch any kind of test that shows them performing poorly by hardcoding some answers into the model.