Comment by aaronblohowiak
2 days ago
We are either limited by compute, available training data, or algorithms. You seem to believe we are limited by compute. I've seen other people argue that we are limited by training data. It is my totally inexpert belief that we are substantially limited by algorithms at this point.
I think algorithms are a unique limit because they change how much data or compute you need. For instance, we probably already have the algorithms we need to brute-force more problems today, but they require infeasible amounts of compute or data. We can almost certainly train a new 10T parameter mixture of experts that continues to make progress on benchmarks, but it would cost a fortune to train and be completely undeployable with today’s chips, data, and algorithms.
So I think the truth is likely that we are both compute limited and in need of better algorithms.
There are a few "hints" that suggest to me algorithms will bear a lot more fruit than compute (in terms of flops):
1) there already exist very efficient algorithms for rigorous problems that LLMs perform terribly at!
2) learning is too slow and is largely offline
3) "llms aren't world models"
General intelligence exists in this world, so the inability to transfer it to a machine does seem like an algorithm problem. When it’s here, we don’t even know if it will be an LLM; no one knows the compute requirements.
We are limited by both compute and available training data.
If all we wanted was to train bigger and bigger models, we would have more than enough compute to last us for years.
Where we lack compute is in scaling AI to consumers. Current models take too much power and specialized hardware to be profitable. If AI improved your productivity by 20-30% but cost you even 10% of your monthly salary, no one would use it. I have used up $10 worth of credits using Claude Code in an hour multiple times. Assuming I use it continuously for 8 hours a day, 24 days a month: 10 * 8 * 24 = $1920. So it's not that far off the current costs of running the models. If the size of the models scales faster than the speed of the inference hardware, the problem is only going to get worse.
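For anyone who wants to plug in their own numbers, the back-of-the-envelope estimate above can be sketched in a few lines of Python. The function names are made up for illustration. The $10/hour, 8 hours, 24 days, and 10% figures are the ones from the comment. The break-even salary is just what those figures imply, not a measured number.

```python
def monthly_cost(dollars_per_hour: int, hours_per_day: int, days_per_month: int) -> int:
    """Monthly spend if usage is continuous at a flat hourly rate (an assumption)."""
    return dollars_per_hour * hours_per_day * days_per_month


def break_even_salary(cost: int, max_percent_of_salary: int) -> float:
    """Monthly salary at which `cost` equals `max_percent_of_salary` percent of pay."""
    return cost * 100 / max_percent_of_salary


# Figures taken from the comment: $10/hour, 8 hours a day, 24 days a month.
cost = monthly_cost(10, 8, 24)
print(cost)  # 1920

# Salary at which that spend is exactly 10% of monthly pay.
print(break_even_salary(cost, 10))  # 19200.0
```

In other words, under these assumptions the spend stays below the 10% threshold only for someone earning more than $19,200 a month.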
I too believe that we will eventually discover an algorithm that gives us AGI. The problem is that we cannot will a breakthrough. We can make one more likely by investing more and more into AI, but breakthroughs, and research in general, are by their nature unpredictable.
I think investing in new individual ideas is very important and gives us a lot of good returns. Investing in a field in general, hoping to see a breakthrough, is a fool's errand in my opinion.
If the LLM is multimodal would more video and images improve the quality of the textual output? There’s a ton of that and it’s always easy to get more.
I think we might also be limited by energy.