Comment by epiccoleman
4 days ago
That's not entirely true, though. "Attention Is All You Need", the paper that introduced the transformer architecture that would go on to drive all the popular LLMs of today, came out in 2017. From there, advancement has largely been about scaling that central idea up (though there are "sidequest" tech level-ups too, like RAG, training for tool use, the agent loop, etc.). It seems like we really hit a stride around GPT-3, especially with the RLHF post-training stuff.
So there was at least some technical advancement mixed in with all the VC money between 2011 and today - it's not all just tossing dollars around. (Though of course we can't ignore that all this scaling of transformers did cost a ton of money.)