Comment by embedding-shape
16 hours ago
> I understand that deep learning is accelerated by GPUs but the concept of a transformer could have been used on much slower hardware much earlier
But they don't give the same results at those smaller scales. People could imagine it, but no one could have put it into practice because the hardware wasn't there yet. Simplified, LLMs are basically Transformers plus the additional idea of "and a shitton of data to learn from", and to make training feasible with that amount of data, you do need capable hardware.