Comment by ACCount37
3 days ago
That's what I'm saying: there is no "direct comparison grounded in neurobiology" for most things, and for many things, there simply can't be one. For the same reason you can't compare gears and springs to silicon circuits 1:1. The low level components diverge too much.
Despite all that, the calculator and the arithmometer do the same things. If you can't go up an abstraction level and look past low level implementation details, then you'll remain blind to that fact forever.
Which papers depends on what you're interested in. There's a lot of research - ranging from weird LLM capabilities to the exact operation of reverse-engineered circuits.
There is no level of abstraction to go up sans context. Again, let me repeat myself as well: the calculator and the arithmometer do the same things -- from the point of view of the clerk who needs to add and subtract quickly. Otherwise they are simply two completely different objects. And we will have a hard time making correct inferences about how one works based only on how we know the other works, or, e.g. how calculating machines work.
What I'm interested in is evidence that supports that "The more you try to look into the LLM internals, the more similarities you find". Some pointers to specific books and papers will be very helpful.
> Otherwise they are simply two completely different objects.
That's where you're wrong. Both objects reflect the same mathematical operations in their structure.
Even if those were inscrutable alien artifacts to you, even if you knew nothing about who constructed them, how or why, you would still be able to see the similarities laid bare if you studied them.
Their inputs align, their outputs align. And if you dug deep enough? You would find that there are components in them that correspond to the same mathematical operations - even if the two are nothing alike in how exactly they implement them.
LLMs and human brains are "inscrutable alien artifacts" to us. Both are created by inhuman optimization pressures. Both have to be studied to find out how they function. It's obvious, though, that their inputs align, and their outputs align. And the more you dig into internals?
I recommend taking a look at Anthropic's papers on SAEs - sparse autoencoders - a method that essentially takes the population coding hypothesis and runs with it. It attempts to crack the neural coding the LLM uses internally, in order to pry interpretable features out of it. There are no "grandmother neurons" there - so you need elaborate methods to examine what kind of representations an LLM can learn to recognize and use in its functioning.
Anthropic's work is notable because they have not only managed to extract features that map to some amazingly high level concepts, but also proved causality - interfering with the neuron populations mapped out by the SAE changes the LLM's behavior in predictable ways.
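For readers who haven't met the technique, here is a minimal sketch of what a sparse autoencoder computes, assuming the commonly described form: a ReLU encoder, a linear decoder, and a reconstruction loss with an L1 sparsity penalty. The weights are hand-set toy values rather than trained ones, `ToySAE` and `steer` are hypothetical names, and none of this is Anthropic's code. In the published work the edited activation is patched back into the model's forward pass; here it is only decoded.

```python
def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class ToySAE:
    """Toy sparse autoencoder over a 2-d 'activation' with 3 features."""

    def __init__(self, w_enc, b_enc, w_dec, b_dec):
        self.w_enc = w_enc  # shape (n_features, d_model)
        self.b_enc = b_enc  # shape (n_features,)
        self.w_dec = w_dec  # shape (d_model, n_features)
        self.b_dec = b_dec  # shape (d_model,)

    def encode(self, x):
        # features = ReLU(W_enc @ (x - b_dec) + b_enc)
        centered = [xi - bi for xi, bi in zip(x, self.b_dec)]
        pre = matvec(self.w_enc, centered)
        return [max(0.0, p + b) for p, b in zip(pre, self.b_enc)]

    def decode(self, features):
        # x_hat = W_dec @ features + b_dec
        out = matvec(self.w_dec, features)
        return [o + b for o, b in zip(out, self.b_dec)]

    def loss(self, x, l1=0.1):
        # mean squared reconstruction error plus L1 penalty on feature activity
        features = self.encode(x)
        x_hat = self.decode(features)
        mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        return mse + l1 * sum(abs(f) for f in features)


def steer(sae, x, feature, value):
    """Clamp one feature to a chosen value and decode the edited activation."""
    features = sae.encode(x)
    features[feature] = value
    return sae.decode(features)


# Hand-set toy weights: feature 0 fires on +x, feature 1 on +y, feature 2 on -x.
sae = ToySAE(
    w_enc=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    b_enc=[0.0, 0.0, 0.0],
    w_dec=[[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]],
    b_dec=[0.0, 0.0],
)

activation = [2.0, 0.0]
features = sae.encode(activation)        # only feature 0 is active
reconstructed = sae.decode(features)     # matches the input for these weights
steered = steer(sae, activation, 1, 3.0) # force feature 1 on, then decode
```

The point of the toy is only the shape of the argument: features are read out of a population of units, and editing a feature moves the activation along that feature's decoder direction.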
You are making the false assumption that if output can be inferred from structure, the converse is true as well. Similarity in behaviour does not in any way, shape or form imply structural similarity. The boy scout, the migrating swallow, the foraging bee, and the mobile robot are good at orienteering. Do they achieve this goal in a similar manner? Not really.
Re: "I'm baffled that someone in CS, a field ruled by applied abstraction, has to be explained over and over again that abstraction is a thing that exists". Computer science deals with models of computation. You are making a classic mistake in confusing models for the real things they are capable of modelling.
> That's where you're wrong. Both objects reflect the same mathematical operations in their structure.
This is missing the point by a country mile, I think.
All navel-gazing aside, understanding every bit of how an arithmometer works - hell, even being able to build one yourself - tells you absolutely nothing about how the Z80 chip in a TI-83 calculator actually works. Even if you take it down to individual components, there is zero real similarity between how a Leibniz wheel works and how a (full) adder circuit works. They are in fact fundamentally different machines that operate via fundamentally different principles.
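To make the two mechanisms concrete, here is a rough sketch of both styles of addition. It is a toy model under stated assumptions: `ripple_carry_add` chains one-bit full adders built from XOR/AND/OR, and `digit_wheel_add` advances decimal wheels one tooth at a time with a mechanical-style carry. Neither is a faithful simulation of a Z80 or of a real Leibniz wheel, and all function names are made up for illustration.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder built only from XOR, AND and OR."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out


def ripple_carry_add(x, y, width=16):
    """Add two non-negative ints by chaining full adders, lowest bit first."""
    carry, result = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result  # a carry out of the top bit is dropped (overflow)


def digit_wheel_add(x, y, digits=5):
    """Add y to an accumulator x held on decimal wheels, one tooth at a time."""
    wheels = [(x // 10**i) % 10 for i in range(digits)]
    for i in range(digits):
        teeth = (y // 10**i) % 10  # how far the drum turns wheel i
        for _ in range(teeth):
            j = i
            while j < digits:
                wheels[j] = (wheels[j] + 1) % 10
                if wheels[j] != 0:
                    break  # no wrap-around, so no carry to the next wheel
                j += 1     # wheel rolled over from 9 to 0: nudge the next one
    return sum(d * 10**i for i, d in enumerate(wheels))


binary_result = ripple_carry_add(1234, 5678)
decimal_result = digit_wheel_add(1234, 5678)
```

Both functions return the same sums within their ranges, which is the point both sides of this thread agree on. What they disagree about is whether that shared input/output behavior says anything about the internals, which here are bitwise logic in one case and repeated unit steps with rollover in the other.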
The idea that similar functions must mean that they share significant similarities under the hood is senseless; you might as well argue that there are similarities to be found between a nuclear chain reaction and the flow of a river because they are both harnessed to spin turbines to generate electricity. It is a profoundly and quite frankly disturbingly incurious way for anyone who considers themself an "engineer" to approach the world.