Yeah. I know the bitter lesson.
For neural networks, a larger size generally means a higher performance ceiling. On the other hand, you have to find ways to actually realize those advantages over smaller models, or the extra size just becomes a burden.
However, I'm talking about running LLMs locally rather than in production, and local usage is severely constrained by GPUs with limited VRAM: you simply cannot run a model beyond a certain size.
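To make that VRAM limit concrete, here's a minimal back-of-the-envelope sketch. It assumes memory is dominated by the weights, with a rough multiplier for KV cache and runtime overhead; the function name and the 1.2 factor are illustrative assumptions, not taken from any library.

```python
# Rough VRAM estimate for running an LLM locally.
# Assumption: memory ~ weight bytes * small overhead factor (KV cache, buffers).
# Numbers are illustrative only.

def estimate_vram_gb(params_billion: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Approximate VRAM (in GB) needed to load a model of the given size."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

if __name__ == "__main__":
    # e.g. a 7B model at 4-bit quantization fits on an 8 GB card; a 70B model does not.
    for size in (7, 13, 70):
        for bits in (16, 4):
            print(f"{size}B @ {bits}-bit: ~{estimate_vram_gb(size, bits):.1f} GB")
```

Under these assumptions, even aggressive 4-bit quantization leaves a 70B model well beyond what a typical consumer GPU can hold, which is the hard ceiling I mean.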