Comment by overfeed

6 months ago

> We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels

We do know what happens at higher abstraction levels; the design of efficient networks and the steady beat of SOTA improvements both depend on understanding how LLMs work internally. The choice of network dimensions, feature extraction, attention and attention heads, caching, the peculiarities of high-dimensional spaces, and how to avoid overfitting are all well understood by practitioners. Anthropomorphization is only necessary in pop-science articles that use a limited vocabulary.
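
To be concrete about two of those items: attention and the caching used during decoding (commonly the key/value cache) are short, fully specified computations rather than black boxes. Here is a minimal single-head NumPy sketch; the names are illustrative, not any particular library's API:

```python
import numpy as np

def attention(q, K, V):
    """Scaled dot-product attention for a single query vector.

    q: (d,) query for the current position; K, V: (t, d) keys/values
    cached for the t positions generated so far.
    """
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)       # similarity of the query to each past position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()          # softmax over past positions
    return weights @ V                # weighted mix of the cached values

# Toy decode loop: the "cache" is just the growing K and V matrices, so each
# new token adds one row instead of recomputing the whole history.
rng = np.random.default_rng(0)
d_model = 8
K_cache = np.empty((0, d_model))
V_cache = np.empty((0, d_model))

for step in range(4):
    x = rng.normal(size=d_model)      # stand-in for the current token's hidden state
    q, k, v = x, x, x                 # real models use learned projections W_q, W_k, W_v
    K_cache = np.vstack([K_cache, k])
    V_cache = np.vstack([V_cache, v])
    out = attention(q, K_cache, V_cache)
```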

IMO, there is very little mystery but a lot of deliberate mysticism, especially about future LLMs - the usual hype-cycle extrapolation.