Comment by Insanity
20 hours ago
The results from a smaller model are still viable if the paradigm is identical. Unless you believe that larger volumes of data lead to more (unexplained) emergent properties in the AI, i.e., that a larger volume of training data somehow means the model develops actual reasoning skills beyond normal next-token prediction.
I do think larger models will perform better, but not because they fundamentally work differently from smaller models, so the idea behind TFA still stands (in my opinion).