Comment by zaptrem
1 year ago
This is the bitter lesson/just put it in the model. They're trying to figure out more ways of converting compute to intelligence now that they're running out of text data: https://images.ctfassets.net/kftzwdyauwt9/7rMY55vLbGTlTiP9Gd...
A cynical way to look at it is that we're pretty close to the ultimate limits of what LLMs can do, and now the stakeholders are looking for novel ways to use what they have instead of pouring everything into novel models. We're several years into the AI revolution (some call it a bubble) and Nvidia is still pretty much the only company that makes bank on it. Other than that, it's all investment-driven "growth". And at some point investors are going to start asking questions...
That is indeed cynical haha.
A very simple observation: our brains are vastly more efficient, obtaining vastly better outcomes from far less input. That suggests there's plenty of room for improvement without going looking for more data. Short-term gain versus long-term gain, like you say — shareholder return.
More efficiency means more practical, useful applications and lower cost, whereas a bigger model means less usefulness (longer inference times) and higher cost (data synthesis and training).
That’s assuming that LLMs act like brains at all.
They don’t.
Especially not with transformers.