Comment by me551ah
2 days ago
Or maybe not. Scaling AI will require an exponential increase in compute and processing power, and even current LLMs take up a lot of resources. We are already at the limit of how small we can scale chips, and Moore's law is already dead.
So newer chips will not be exponentially better, only incrementally better, and unless the price of electricity comes down exponentially we might never see AGI at a price point that's cheaper than hiring a human.
Most companies are already running AI models at a loss, and scaling the models to be bigger (like GPT-4.5) only makes them more expensive to run.
The reason the internet, smartphones, and computers saw exponential growth from the 90s onward is the underlying increase in computing power. I personally used a 50 MHz 486 in the 90s and now use an 8c/16t 5 GHz CPU. I highly doubt we will see the same kind of increase in the next 40 years.
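As a back-of-envelope on that comparison (a quick sketch; the dates are approximate and the clock speeds are the figures above):

```python
import math

# Implied growth rate of single-core clock speed, using the figures
# above: 50 MHz (a ~1993 486) to 5 GHz (a ~2023 desktop CPU).
f0, f1 = 50e6, 5e9
years = 30

growth = (f1 / f0) ** (1 / years)                # annual multiplier
doubling_years = math.log(2) / math.log(growth)  # implied doubling time

print(f"{growth:.3f}x per year, doubling every {doubling_years:.1f} years")
# ~1.166x per year, i.e. a doubling roughly every 4.5 years; and most of
# that happened before ~2005, when clock scaling largely stalled.
```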
> Scaling AI will require an exponential increase in compute and processing power,
A small quibble... I'd say that's true only if you accept as an axiom that current approaches to AI are "the" approach and reject the possibility of radical algorithmic advances that completely change the game. For my part, I have a strongly held belief that such an algorithmic advance is "out there" waiting to be discovered, one that would enable AI at current "intelligence" levels, if not outright Strong AI / AGI, without the absurd demands on computational resources and energy. I can't prove that, of course, but I take the human brain as an existence proof that some kind of machine can provide human-level intelligence without needing gigawatts of power and massive datacenters filled with racks of GPUs.
DeepMind were experimenting with this a few years ago: https://github.com/google-deepmind/lab
Having AI agents learn to see, navigate, and complete tasks in a 3D environment felt like it had more potential than LLMs to become an AGI (if that is possible).
They haven't touched it in a long time though. But Genie 3 makes me think they haven't completely dropped it.
If we suppose that ANNs are more or less accurate models of real neural networks, the reason they're so inefficient is not algorithmic but purely architectural. They're just software. We have these huge tables of numbers and we're trying to squeeze them as hard as possible through a relatively small number of multipliers and adders. Meanwhile, a brain can perform a trillion fundamental operations simultaneously, because every neuron is a complete processing element independent of every other one. To bring that back into concrete terms: if we took an arbitrary model and turned it into a bespoke piece of hardware, it would certainly be at least one or two orders of magnitude faster and more efficient, with the downside that, being dead silicon, it could not be changed and iterated on.
If you account for the fact that biological neurons operate at a much lower frequency than silicon processors, the raw performance gets much closer. From what I can find, the neuron membrane time constant is around 10 ms [1], meaning 10 billion neurons could produce 1 trillion activations per second, which is in the realm of modern hardware.
The people cited in [2] have done a calculation resembling this one from a more informed position than mine, and reach numbers like 10^17 FLOPS.
[1] https://spectrum.ieee.org/fast-efficient-neural-networks-cop...
[2] https://aiimpacts.org/brain-performance-in-flops/
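For anyone who wants to redo that arithmetic, a minimal sketch (the neuron count and time constant are the figures quoted above, not precise measurements):

```python
# Back-of-envelope from the comment above: a ~10 ms membrane time
# constant treated as ~100 potential activations per second per neuron.
neurons = 10e9            # 10 billion neurons (the figure used above)
time_constant_s = 10e-3   # ~10 ms membrane time constant [1]

activations_per_sec = neurons * (1 / time_constant_s)
print(f"{activations_per_sec:.0e} activations/s")  # ~1e+12, i.e. 1 trillion

# The estimates collected in [2] land around 1e17 FLOPS, i.e. roughly
# 1e5 floating-point ops per "activation" in this crude accounting.
```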
The energy inefficiency of ANNs vs. our brain is mostly because our brain operates in an async dataflow mode, with each neuron mostly consuming energy only when it fires. If a neuron's inputs haven't changed, it doesn't redundantly "recalculate its output" like an ANN; it just does nothing.
You could certainly implement an async dataflow design in software, though maybe not as power-efficiently as with custom silicon. Per-node throughput would suffer, however, given the need to aggregate the neurons needing updates into a group that can be fed into one of the large matrix multiplies today's hardware is optimized for, although sparse operations are also a possibility. OTOH, conceivably you could save enough FLOPs that it would still be a win in terms of how fast an input is processed through the entire net.
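To make the dataflow idea concrete, here is a toy sketch (entirely hypothetical wiring and names, not any real framework): keep a queue of neurons whose inputs actually changed and recompute only those.

```python
from collections import deque

# Toy event-driven network: a neuron's output is recomputed only when
# one of its inputs changed, instead of re-running every layer each step.
weights = {"b": {"a": 0.5}, "c": {"a": 0.5, "b": 1.0}}  # hypothetical wiring
fanout = {"a": ["b", "c"], "b": ["c"]}                   # who listens to whom
value = {"a": 0.0, "b": 0.0, "c": 0.0}

def relu(x):
    return max(0.0, x)

def set_input(name, x, eps=1e-6):
    """Inject a new input value and propagate only genuine changes."""
    value[name] = x
    dirty = deque(fanout.get(name, []))
    while dirty:
        n = dirty.popleft()
        new = relu(sum(w * value[src] for src, w in weights[n].items()))
        if abs(new - value[n]) > eps:        # "fire" only on a real change
            value[n] = new
            dirty.extend(fanout.get(n, []))  # wake downstream neurons
        # otherwise: inputs settled, do nothing, like a quiet neuron

set_input("a", 1.0)
print(value)  # {'a': 1.0, 'b': 0.5, 'c': 1.0}
```

On real hardware the win depends on sparsity: if most neurons are quiet most of the time, most of a dense matrix multiply is wasted work.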
> the reason why they're so inefficient is not algorithmic, but purely architectural.
I would agree with that, with the caveat that in my mind "the architecture" and "the algorithm" are sort of bound up with each other. That is, one implies the other -- to some extent.
And yes, fair point that building dedicated hardware might just be part of the solution to making something that runs much more efficiently.
The only other thing I would add, relative to what I said in the post above, is that when I talk about "algorithmic advances" I consider everything to be potentially on the table, including maybe something different from ANNs altogether.
> If we suppose that ANNs are more or less accurate models of real neural networks
I believe the problem is that we don't understand actual neurons, let alone actual networks of neurons, well enough to know whether any model is accurate. The AI folks cleverly named their data structures "neuron" and "neural network" to make it seem like we do.
> If we suppose that ANNs are more or less accurate models of real neural networks [..]
ANNs were inspired by biological neural structures, and that's it. They are not representative models at all, even of the "less" variety. Dedicated hardware will certainly help, but no insight into how much it can help will come from this sort of comparison.
> Scaling AI will require an exponential increase in compute and processing power,
I think there is something more going on with AI scaling: the incremental cost per user is far higher. Compare with the big early internet companies: add one server and you could handle thousands more users; the incremental cost was very low, not to mention the revenue captured through whatever adtech means. Not so with AI workloads: they are so much more expensive than ad revenue could cover that it's hard to break even, even with an actual paid subscription.
I don't even fully get why; inference costs are way lower than training costs, no?
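To make the per-user economics concrete, a back-of-envelope with loudly hypothetical numbers (none of these figures come from any provider's actual costs or pricing):

```python
# All figures below are invented for illustration only.
ad_revenue_per_user = 2.00   # hypothetical ad-supported web service, $/month
web_cost_per_user = 0.05     # one server amortized across thousands of users

subscription = 20.00         # hypothetical AI subscription, $/month
tokens_per_user = 3_000_000  # hypothetical heavy user's monthly volume
cost_per_mtok = 5.00         # hypothetical loaded inference cost, $/1M tokens

web_margin = ad_revenue_per_user - web_cost_per_user
ai_margin = subscription - (tokens_per_user / 1e6) * cost_per_mtok

print(f"web margin/user: ${web_margin:.2f}")  # $1.95
print(f"AI margin/user:  ${ai_margin:.2f}")   # $5.00; at 5M tokens it's negative
```

The point is the shape, not the numbers: web serving cost per user is near zero, while inference cost scales with usage, so heavy users can erase the margin.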
We know for a fact that human-level general intelligence can be achieved on a relatively modest power budget. A human brain runs on somewhere around 20-100 W, depending on how much of the rest of the body's metabolism you attribute to supporting it.
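Putting rough numbers on that (the brain figure is the ~10^17 FLOPS estimate from [2] upthread; the GPU figure is an order-of-magnitude approximation of a current datacenter part):

```python
# Crude FLOPS-per-watt comparison; both rows are order-of-magnitude only.
brain_flops, brain_watts = 1e17, 20  # estimate from [2]; ~20 W power budget
gpu_flops, gpu_watts = 1e15, 700     # very roughly a modern datacenter GPU

print(f"brain: {brain_flops / brain_watts:.1e} FLOPS/W")  # ~5e+15
print(f"gpu:   {gpu_flops / gpu_watts:.1e} FLOPS/W")      # ~1.4e+12
# Three to four orders of magnitude apart on this crude accounting.
```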
The fact that the human brain (heck, all brains) is so much more efficient than "state of the art" neural nets, in terms of architecture, power consumption, training cost, what have you, while also being far more versatile and robust, is what convinces me that this is not the path that leads to AGI.
> We are already at the limit of how small we can scale chips
I strongly suspect this is not true for LLMs. Once progress stabilizes, doing things like embedding the weights of some model directly as part of the chip will suddenly become economical, and that's going to cut costs down dramatically.
Then there's distillation, which basically makes smaller models get better as bigger models get better. You don't necessarily need to run a big model all of the time to reap its benefits.
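For the curious, a minimal sketch of what distillation looks like in code (runnable PyTorch, but the models, temperature, and data here are all placeholders):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target loss: push the student's output distribution toward
    the temperature-softened teacher distribution (Hinton et al. style)."""
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * T * T

# Placeholder models: any big/small pair with matching output sizes works.
teacher = torch.nn.Linear(16, 10)  # stands in for the big model
student = torch.nn.Linear(16, 10)  # stands in for the small model

x = torch.randn(8, 16)
with torch.no_grad():              # the teacher is frozen; only outputs matter
    t_logits = teacher(x)

loss = distillation_loss(student(x), t_logits)
loss.backward()                    # the student improves as the teacher improves
```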
> so unless the price of electricity comes down exponentially
This is more likely than you think. AI is extremely bandwidth-efficient and not too latency-sensitive (unlike e.g. Netflix et al), so it's pretty trivial to offload AI work to places where electricity is abundant and power generation is lightly regulated.
> Most companies are already running AI models at a loss, and scaling the models to be bigger (like GPT-4.5) only makes them more expensive to run.
"We're profitable on inference. If we didn't pay for training, we'd be a very profitable company." Sam Altman, OpenAI CEO[1].
[1] https://www.axios.com/2025/08/15/sam-altman-gpt5-launch-chat...
> doing things like embedding the weights of some model directly as part of the chip will suddenly become economical, and that's going to cut costs down dramatically.
An implementation of inference for some specific ANN in fixed-function analog hardware could probably quite easily beat a commodity GPU by a couple of orders of magnitude in perf per watt, too.
> "We're profitable on inference. If we didn't pay for training, we'd be a very profitable company."
That's OpenAI (though I'd be curious if that statement holds for subscriptions as opposed to API use). What about the downstream companies that use OpenAI models? I'm not sure the picture is as rosy for them.