Comment by throwup238
1 year ago
> The 3D map covers a volume of about one cubic millimetre, one-millionth of a whole brain, and contains roughly 57,000 cells and 150 million synapses — the connections between neurons.
This is great and provides a hard data point for some napkin math on how big a neural network model would have to be to emulate the human brain. 150 million synapses / 57,000 cells is an average of ~2,632 synapses per neuron. The adult human brain has 100 (±20) billion, or 1e11, neurons, so assuming that synapse/neuron ratio holds, that's 2.6e14 total synapses.
Assuming 1 parameter per synapse, that'd make the minimum viable model roughly 150 times larger than the state-of-the-art GPT4 (going by the rumored 1.8e12 parameters). I don't think that's granular enough, though: if we assume 10-100 ion channels per synapse and at least 10 parameters per ion channel, the number lands closer to 2.6e16+ parameters, or 4+ orders of magnitude bigger than GPT4.
There are other problems of course like implementing neuroplasticity, but it's a fun ball park calculation. Computing power should get there around 2048: https://news.ycombinator.com/item?id=38919548
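The napkin math above can be sketched in a few lines; the GPT4 parameter count is the rumored figure, not an official one, and the ion-channel multipliers are the low end of the guesses above:

```python
# Back-of-the-envelope: scale the sample's synapse/neuron ratio up to a whole brain.
sample_neurons = 57_000       # cells in the ~1 mm^3 sample
sample_synapses = 150e6       # synapses in the sample
brain_neurons = 1e11          # ~100 (±20) billion neurons

synapses_per_neuron = sample_synapses / sample_neurons   # ~2,632
total_synapses = brain_neurons * synapses_per_neuron     # ~2.6e14

gpt4_params = 1.8e12          # rumored, unconfirmed
print(total_synapses / gpt4_params)   # ~146x at 1 parameter per synapse

# Low end of the finer-grained guess: 10 ion channels per synapse,
# 10 parameters per ion channel.
detailed_params = total_synapses * 10 * 10               # ~2.6e16
```

With the high-end guess of 100 ion channels per synapse, the estimate climbs another order of magnitude still.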
Or you can subscribe to Geoffrey Hinton's view that artificial neural networks are actually much more efficient than real ones, more or less the opposite of what we've believed for decades, namely that artificial neurons were just a poor model of the real thing.
Quote:
"Large language models are made from massive neural networks with vast numbers of connections. But they are tiny compared with the brain. “Our brains have 100 trillion connections,” says Hinton. “Large language models have up to half a trillion, a trillion at most. Yet GPT-4 knows hundreds of times more than any one person does. So maybe it’s actually got a much better learning algorithm than us.”"
GPT-4's connections at the density of this brain sample would occupy a volume of 5 cubic centimeters; that is, 1% of a human cortex. And yet GPT-4 is able to speak more or less fluently about 80 languages, translate, write code, imitate the writing styles of hundreds, maybe thousands of authors, converse about stuff ranging from philosophy to cooking, to science, to the law.
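As a rough sanity check on that volume figure, assume the sample's density of 150 million synapses per cubic millimetre, ~0.75 trillion connections (a point inside Hinton's half-trillion-to-a-trillion range), and a human cortex volume of about 500 cm³; the last two numbers are round-number assumptions:

```python
# How much cortex would GPT-4's connections occupy at the sample's synapse density?
synapse_density = 150e6    # synapses per mm^3, from the sample
connections = 0.75e12      # assumed; within Hinton's 0.5-1 trillion range
cortex_cm3 = 500           # assumed rough human cortex volume

volume_mm3 = connections / synapse_density
volume_cm3 = volume_mm3 / 1000
print(volume_cm3)                # 5.0 cm^3
print(volume_cm3 / cortex_cm3)   # 0.01, i.e. ~1% of cortex
```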
"Efficient" and "better" are very different descriptors of a learning algorithm.
The human brain does what it does using about 20W. LLM power usage is somewhat unfavourable compared to that.
You mean energy-efficient; Hinton's claim is about being neuron- or synapse-efficient.
It is using about 20W and then a person takes a single airplane ride between the coasts. And watches a movie on the way.
I mean, Hinton’s premises are, if not quite clearly wrong, entirely speculative (which doesn't invalidate the conclusions about efficiency that they are offered to support, but does leave them without support). GPT-4 can produce convincing written text about a wider array of topics than any one person can, because it's a model optimized for taking in and producing convincing written text, trained extensively on written text.
Humans know a lot of things that are not revealed by inputs and outputs of written text (or imagery), and GPT-4 doesn't have any indication of this physical, performance-revealed knowledge, so even if we view what GPT-4 talks convincingly about as “knowledge”, trying to compare its knowledge in the domains it operates in with any human’s knowledge which is far more multimodal is... well, there's no good metric for it.
Try asking an LLM about something which is semantically patently ridiculous, but lexically superficially similar to something in its training set, like "the benefits of laser eye removal surgery" or "a climbing trip to the Mid-Atlantic Mountain Range".
Ironically, I suppose part of the apparent "intelligence" of LLMs comes from reflecting the intelligence of human users back at us. As a human, the prompts you provide an LLM likely "make sense" on some level, so the statistically generated continuations of your prompts are likelier to "make sense" as well. But if you don't provide an ongoing anchor to reality within your own prompts, then the outputs make it more apparent that the LLM is simply regurgitating words which it does not/cannot understand.
On your point of human knowledge being far more multimodal than LLM interfaces, I'll add that humans also have special neurological structures to handle self-awareness, sensory inputs, social awareness, memory, persistent intention, motor control, neuroplasticity/learning: any number of such traits, which are easy to take for granted but are indisputably fundamental parts of human intelligence. These abilities aren't just emergent properties of the total number of neurons; they live in special hardware like mirror neurons, special brain regions, and spindle neurons. A brain cell in your cerebellum is not generally interchangeable with a cell in your visual or frontal cortices.
So when a human "converse[s] about stuff ranging from philosophy to cooking" in an honest way, we (ideally) do that as an expression of our entire internal state. But GPT-4 structurally does not have those parts, despite being able to output words as if it might, so as you say, it "generates" convincing text only because it's optimized for producing convincing text.
I think LLMs may well be some kind of an adversarial attack on our own language faculties. We use words to express ourselves, and we take for granted that our words usually reflect an intelligent internal state, so we instinctively assume that anything else which is able to assemble words must also be "intelligent". But that's not necessarily the case. You can have extremely complex external behaviors that appear intelligent or intentioned without actually internally being so.
> Humans know a lot of things that are not revealed by inputs and outputs of written text (or imagery), and GPT-4 doesn't have any indication of this physical, performance-revealed knowledge, so even if we view what GPT-4 talks convincingly about as “knowledge”, trying to compare its knowledge in the domains it operates in with any human’s knowledge which is far more multimodal is... well, there's no good metric for it.
Exactly this.
Anyone who has spent significant time golfing can think of an enormous amount of detail related to the swing and body dynamics, and the million different ways the swing can go wrong.
I wonder how big the model would need to be to duplicate an average golfers score if playing X times per year and the ability to adapt to all of the different environmental conditions encountered.
Hinton is way off, IMO. The number of examples needed to teach language to an LLM is many orders of magnitude more than humans require, not to mention power consumption and inelasticity.
I think what Hinton is saying is that, in his opinion, if you fed 1/100th of a human cortex the amount of data used to train LLMs, you wouldn't get a thing that can speak in 80 different languages about a gigantic number of subjects, but (I'm interpreting here..) about ten grams of fried, fuming organic matter.
This doesn't mean that an entire human brain doesn't surpass LLMs in many different ways, only that artificial neural networks appear to be able to absorb and process more information per neuron than we do.
An LLM does not know math as well as a professor, judging from the large number of false functional analysis proofs I have had one generate while trying to learn functional analysis. The thing it seems to lack is a sense of what makes a proof true versus fallacious, along with a tendency to answer ill-posed questions: “How would you prove this incorrectly transcribed problem” will get fourteen steps, with steps 8 and 12 obviously (to a student) wrong, while the professor would step back and ask what I am actually trying to prove.
LLMs do not know math at all. Not to sound like one myself, but they are stochastic parrots: they output stuff similar to their training data, but they have no understanding of the meaning of things beyond vector encodings. This is also why ChatGPT plays chess in hilarious ways.
An LLM cannot possibly have any concept of even what a proof is, much less whether it is true, even outside of math. The smaller amount of training data, the largely field-specific tokens math uses, and the fact that a single-token error is fatal to truth in math all mean that even output which resembles the training data is unlikely to be close to factual.
> "So maybe it’s actually got a much better learning algorithm than us.”
And yet somehow it's also infinitely less useful than a normal person is.
GPT4 has been a lot more useful to me than most normal people I interact with.
Except you’d be missing that a neuron is not just a node holding a number, but a computational system in itself.
Computation is really integrated through every scale of cellular systems. Individual proteins are capable of basic computations, which are then integrated into regulatory circuits, epigenetics, and cellular behavior.
Pdf: “Protein molecules as computational elements in living cells - Dennis Bray” https://www.cs.jhu.edu/~basu/Papers/Bray-Protein%20Computing...
I think you are missing the point.
The calculation is intentionally underestimating the neurons, and even with that the brain ends up having more parameters than the current largest models by orders of magnitude.
Yes, the estimation intentionally models the neurons as simpler than they are likely to be. No, it is not “missing” anything.
The point is to make a ballpark estimate, or at least to estimate the order of magnitude.
From the sibling comment:
> Individual proteins are capable of basic computation which are then integrated into regulatory circuits, epigenetics, and cellular behavior.
If this is true, then there may be many orders of magnitude unaccounted for.
Imagine if our intelligent thought actually depends irreducibly on the complex interactions of proteins bumping into each other in solution. It would mean computers would never be able to play the same game.
That may or may not still be too simple a model. Cells are full of complex nanoscale machinery, and not only is it plausible that some of it is involved in the processes of cognition, I'm aware of at least one study which identified nanoscale structures directly involved in how memory works in neurones. Not to mention that a lot of what's happening has a fairly analogue dimension.
I remember an interview with a neurologist who said humanity has for centuries compared the functioning of the brain to the most complex technology yet devised. First it was compared to mechanical devices, then pipes and steam, then electrical circuits, then electronics, and now finally computers. But, he pointed out, the brain works like none of these things, so we have to be aware of the limitations of our models.
> That may or may not still be too simple a model
Based on the stuff I've read, it's almost for sure too simple a model.
One example is that single dendrites detect patterns of synaptic activity (sequences over time) which results in calcium signaling within the neuron and altered spiking.
There's a lot of in-neuron complexity, I'm sure there is some cross-synapse signaling (I mean, how can it not exist? There's nothing stopping it.), and I don't think the synapse behavior can be modeled as just more signals.
On the other hand, a significant amount of neural circuitry seems to be dedicated to "housekeeping" needs, and to functions such as locomotion.
So we might need significantly less brain matter for general intelligence.
Or perhaps the housekeeping of existing in the physical world is a key aspect of general intelligence.
Isn't that kinda obvious? A baby that grows up in a sensory deprivation tank does not… develop, as most intelligent persons do.
Yes and no on the order of magnitude required for decent AI: there is still (that I know of) very little hard data on info density in the human brain. What there is points at entire sections that can sometimes be destroyed or actively removed while preserving "general intelligence".
Rather than "humbling", I think the result is very encouraging: it points at major imaging/modeling progress, and it gives hard numbers on an intelligence implementation that is very efficient (power-wise, and in overall size) yet inefficient (at cable management, and probably redundancy, permanence, etc.). The numbers are large but might be pretty solid.
Don't know about upload though...
> Computing power should get there around 2048
We may not get there. Doing some more back of the envelope calculations, let's see how much further we can take silicon.
Currently, TSMC has a 3nm process. Let's halve that until we get to the atomic radius of silicon, 0.132 nm. That's not a rigorous cutoff, since we're not considering crystal lattice distances, Heisenberg uncertainty, etc., but it sets a lower bound: 3nm -> 1.5nm -> 0.75nm -> 0.375nm -> 0.1875nm. There is no way we get more than three or four more generations out of silicon. That's a maximum of roughly 4.5 more years of Moore's law we're going to be able to squeeze out, which means we will not make it past 2030 with these kinds of improvements.
I'd love to be shown how wrong I am about this, but I think we're entering the horizontal portion of the sigmoidal curve of exponential computational growth.
3nm doesn’t mean the transistor is 3nm, it’s just a marketing naming system at this point. The actual transistor is about 20-30nm or so.
Thanks for the comment. I looked more into this, and it seems like not only are we in the era of diminished returns for computational abilities, costs have also started matching the increased compute, i.e. 2x performance leads to 2x cost. Moore's law has already run its course and we're living in a new era of compute. We may get increased performance, but it will always be more expensive.
Artificial thinking doesn't require an artificial brain, just as our car's locomotion system doesn't mimic our own walking system: the engine, transmission, and wheels require no muscles or nerves.