Comment by dcanelhas · 8 days ago

> Once the new datacenters are up and running, they’ll be able to train a model with 10^28 FLOP—a thousand times more than GPT-4.

Is there some theoretical substance or empirical evidence to suggest that the story doesn't just end here? Perhaps OpenBrain sees no significant gains over the previous iteration and implodes under the financial pressure of exorbitant compute costs. I'm not rooting for an AI winter 2.0, but I fail to understand how people can seem so sure of the outcome of experiments that haven't even been performed yet. Help, am I missing something here?

See https://gwern.net/scaling-hypothesis: exponential scaling has been holding up for more than a decade now, since AlexNet.

And when the first murmurings came that maybe we're finally hitting a wall, the labs published ways to harness inference-time compute to get better results, which can then be fed back into more training.
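
To make that mechanism concrete, here's a toy sketch (my own illustration, not any lab's published recipe): if you have a verifier that can check candidate answers, sampling n independent attempts turns a per-attempt success rate p into 1 - (1 - p)^n, and the verified wins can then be recycled as training data. The p = 0.2 below is an arbitrary assumption.

```python
# Toy model of inference-time compute, assuming independent samples
# and a perfect (oracle) verifier: the probability that at least one
# of n sampled answers is correct.
def best_of_n(p_single: float, n: int) -> float:
    return 1.0 - (1.0 - p_single) ** n

for n in (1, 4, 16, 64):
    print(f"n={n:>2}: success probability = {best_of_n(0.2, n):.3f}")
```

Real setups are messier (verifiers are imperfect, samples are correlated), but it shows why spending more compute at inference time can partly substitute for a bigger model.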

  • I sincerely appreciate the reply, but are you talking about Moore's law? AlexNet ran on commercially available GPUs back in 2012, but that wasn't the peak compute platform being used for deep learning at the time, so it distorts the progress a bit. It's like me saying I was running a neural net on a Raspberry Pi yesterday for handwritten character recognition on MNIST, and today I'm crunching Stable Diffusion on an RTX 3090. Behold, a trillion-fold leap in just a day (never mind the unrelated applications). The singularity is definitely gonna happen tomorrow!

    But let's take for granted that we are putting that exponential scaling of compute to good use. On actual benchmarks, it looks like we are seeing sublinear performance improvements [1]. Either way, it seems optimistic at best to conclude that 1000x more compute would yield even 10x better results in most domains; the back-of-envelope sketch below the footnote illustrates why.

    [1] Fig. 1, "AI performance relative to human baseline", Stanford HAI 2025 AI Index Report: https://hai.stanford.edu/ai-index/2025-ai-index-report
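
    To put rough numbers on the 1000x claim: under a Chinchilla-style power law, loss ∝ C^(-alpha), the returns to compute are real but heavily diminishing. This back-of-envelope uses alpha = 0.05 as an illustrative exponent (an assumption in the ballpark of published scaling-law fits, not a value taken from the report above):

    ```python
    # Back-of-envelope under an assumed power law loss ~ C**(-alpha);
    # alpha = 0.05 is an illustrative exponent, not a fitted value.
    alpha = 0.05
    for mult in (10, 100, 1000):
        print(f"{mult:>4}x compute -> loss falls to {mult ** -alpha:.2f} of baseline")
    ```

    Even granting that the power law keeps holding, whether a ~29% loss reduction at 1000x compute cashes out as a large capability gain is exactly the open question.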