Comment by river_otter
3 days ago
One thing from the podcast that jumped out at me was the statement that in pretraining "you don't have to think closely about the data". I guess the success of pretraining supports the point somewhat, but it feels slightly at odds with Karpathy talking about how large a percentage of pretraining data is complete garbage. I would hope that more work on cleaning the pretraining data would result in stronger and more coherent base models.