Comment by svachalek
9 days ago
This is an interesting paper, it postulates that the ability of an LLM to perform tasks correlates mostly to the number of layers it has, and that reasoning creates virtual layers in the context space. https://arxiv.org/abs/2412.02975
No comments yet
Contribute on Hacker News ↗