Comment by CuriouslyC
4 hours ago
If you think scaling is all that matters, you need to learn more about ML.
Read about the No Free Lunch Theorem. Basically, the reason we need to "scale" so hard is that we're building models we want to be good at everything. We could build models that are as good as LLMs at a narrow fraction of the tasks we ask of them, at probably 1/10th the parameters.
Are reranker models an example of this? Do they still underperform compared to LLMs?
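For context, rerankers are usually small cross-encoders that only score query-document pairs for relevance rather than generate text, which is why they can get away with far fewer parameters. A minimal sketch using the sentence-transformers CrossEncoder API; the model choice and example texts here are just illustrative:

    # Sketch: a small cross-encoder reranker scoring query-document pairs.
    from sentence_transformers import CrossEncoder

    # MiniLM-based reranker, roughly tens of millions of parameters,
    # tiny compared to a general-purpose LLM
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

    query = "how do reranker models work?"
    docs = [
        "Rerankers score query-document pairs and reorder retrieval results.",
        "LLMs generate free-form text conditioned on a prompt.",
    ]

    # One relevance score per (query, doc) pair; higher = more relevant
    scores = model.predict([(query, d) for d in docs])
    ranked = sorted(zip(scores, docs), reverse=True)
    print(ranked)

On the narrow task of ordering retrieved passages, models like this are competitive with much larger generative models, but they can't do anything outside that task.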