I have the technical knowledge to know how LLMs work, but I still find it pointless to not anthropomorphize, at least to an extent.
The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world modeling questions or generating a creative story. It's at the wrong level of abstraction, just as if you were discussing an UI events API and you were talking about zeros and ones, or voltages in transistors. Technically fine but totally useless to reach any conclusion about the high-level system.
We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels. So, considering that LLMs somehow imitate humans (at least in terms of output), anthropomorphization is the best abstraction we have, hence people naturally resort to it when discussing what LLMs can do.
On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs - people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort (actively encouraged by the companies selling them) and it is completely distorting discussions on their use and perceptions of their utility.
I kinda agree with both of you. It might be a required abstraction, but it's a leaky one.
Long before LLMs, I would talk about classes / functions / modules like "it then does this, decides the epsilon is too low, chops it up and adds it to the list".
The difference, I guess, is that it was only to a technical crowd and nobody would mistake this for anything it wasn't. Everybody knew that "it" didn't "decide" anything.
With AI being so mainstream and the math being much more elusive than a simple if..then I guess it's just too easy to take this simple speaking convention at face value.
When I see these debates it's always the other way around - one person speaks colloquially about an LLM's behavior, and then somebody else jumps on them for supposedly believing the model is conscious, just because the speaker said "the model thinks.." or "the model knows.." or whatever.
To be honest the impression I've gotten is that some people are just very interested in talking about not anthropomorphizing AI, and less interested in talking about AI behaviors, so they see conversations about the latter as a chance to talk about the former.
Well "reasoning" refers to Chain-of-Thought and if you look at the generated prompts it's not hard to see why it's called that.
That said, it's fascinating to me that it works (and empirically, it does work; a reasoning model generating tens of thousands of tokens while working out the problem does produce better results). I wish I knew why. A priori I wouldn't have expected it, since there's no new input. That means it's all "in there" in the weights already. I don't see why it couldn't just one shot it without all the reasoning. And maybe the future will bring us more distilled models that can do that, or they can tease out all that reasoning with more generated training data, to move it from dispersed around the weights -> prompt -> more immediately accessible in the weights. But for now "reasoning" works.
But then, at the back of my mind is the easy answer: maybe you can't optimize it. Maybe the model has to "reason" to "organize its thoughts" and get the best results. After all, if you give me a complicated problem I'll write down hypotheses and outline approaches and double check results for consistency and all that. But now we're getting dangerously close to the "anthropomorphization" that this article is lamenting.
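If it helps to keep this non-anthropomorphic, the structural point can be put as a toy sketch (nothing about any real model; `next_token` is just a stand-in for one forward pass of an autoregressive LLM): the "reasoning" variant conditions its eventual answer on its own intermediate tokens, which buys extra computation even though no new external input arrives.

    # Toy sketch: why chain-of-thought buys extra computation with no new input.
    # `next_token` is a stand-in for one forward pass of an autoregressive LLM.
    def generate(next_token, context, n_new):
        for _ in range(n_new):
            context = context + [next_token(context)]  # each emitted token conditions the next pass
        return context

    question = ["Q:", "...", "A:"]

    # One-shot: the answer is produced immediately, conditioned only on the question.
    one_shot = generate(lambda ctx: "<answer>", question, 1)

    # "Reasoning": intermediate tokens come first, so the eventual answer is conditioned
    # on the question *plus* the model's own intermediate tokens, and every intermediate
    # token is another full forward pass spent before the answer gets fixed.
    with_cot = generate(lambda ctx: "<step>" if len(ctx) < 10 else "<answer>", question, 8)

    print(one_shot)   # ['Q:', '...', 'A:', '<answer>']
    print(with_cot)   # ['Q:', '...', 'A:', '<step>', ..., '<answer>']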
"All models are wrong, but some models are useful," is the principle I have been using to decide when to go with an anthropomorphic explanation.
In other words, no, they never accurately describe what the LLM is actually doing. But sometimes drawing an analogy to human behavior is the most effective way to pump others' intuition about a particular LLM behavior. The trick is making sure that your audience understands that this is just an analogy, and that it has its limitations.
And it's not completely wrong. Mimicking human behavior is exactly what they're designed to do. You just need to keep reminding people that it's only doing so in a very superficial and spotty way. There's absolutely no basis for assuming that what's happening on the inside is the same.
It's not just distorting discussions it's leading people to put a lot of faith in what LLMs are telling them. Was just on a zoom an hour ago where a guy working on a startup asked ChatGPT about his idea and then emailed us the result for discussion in the meeting. ChatGPT basically just told him what he wanted to hear - essentially that his idea was great and it would be successful ("if you implement it correctly" was doing a lot of work). It was a glowing endorsement of the idea that made the guy think that he must have a million dollar idea. I had to be "that guy" who said that maybe ChatGPT was telling him what he wanted to hear based on the way the question was formulated - tried to be very diplomatic about it and maybe I was a bit too diplomatic because it didn't shake his faith in what ChatGPT had told him.
> people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort
Do you believe thinking/reasoning is a binary concept? If not, do you think the current top LLM are before or after the 50% mark? What % do you think they're at? What % range do you think humans exhibit?
> people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort
With such strong wording, it should be rather easy to explain how our thinking differs from what LLMs do. The next step - showing that what LLMs do precludes any kind of sentience is probably much harder.
I think it's worth distinguishing between the use of anthropomorphism as a useful abstraction and the misuse by companies to fuel AI hype.
For example, I think "chain of thought" is a good name for what it denotes. It makes the concept easy to understand and discuss, and a non-antropomorphized name would be unnatural and unnecessarily complicate things. This doesn't mean that I support companies insisting that LLMs think just like humans or anything like that.
By the way, I would say actually anti-anthropomorphism has been a bigger problem for understanding LLMs than anthropomorphism itself. The main proponents of anti-anthropomorphism (e.g. Bender and the rest of "stochastic parrot" and related paper authors) came up with a lot of predictions about things that LLMs surely couldn't do (on account of just being predictors of the next word, etc.) which turned out to be spectacularly wrong.
I thought this too but then began to think about it from the perspective of the programmers trying to make it imitate human learning. That's what a nn is trying to do at the end of the day, and in the same way I train myself by reading problems and solutions, or learning vocab at a young age, it does so by tuning billions of parameters.
I think these models do learn similarly. What does it even mean to reason? Your brain knows certain things so it comes to certain conclusions, but it only knows those things because it was "trained" on those things.
I reason my car will crash if I go 120 mph on the other side of the road because previously I have 'seen' inputs where a car going 120 mph has a high probability of producing a crash, and similarly inputs where a car driving on the other side of the road produces a crash. Combining the two tells me the probability is high.
>> it pointless to *not* anthropomorphize, at least to an extent.
I agree that it is pointless to not anthropomorphize because we are humans and we will automatically do this. Willingly or unwillingly.
On the other hand, it generates bias. This bias can lead to errors.
So the real answer is (imo) that it is fine to anthropomorphise but recognize that while doing so can provide utility and help us understand, it is WRONG. Recognizing that it is not right and cannot be right provides us with a constant reminder to reevaluate. Use it, but double check, and keep checking making sure you understand the limitations of the analogy. Understanding when and where it applies, where it doesn't, and most importantly, where you don't know if it does or does not. The last is most important because it helps us form hypotheses that are likely to be testable (likely, not always. Also, much easier said than done).
So I pick a "grey area". Anthropomorphization is a tool that can be helpful. But like any tool, it isn't universal. There is no "one-size-fits-all" tool. Literally, one of the most important things for any scientist is to become an expert at the tools you use. It's one of the most critical skills of *any expert*. So while I agree with you that we should be careful of anthropomorphization, I disagree that it is useless and can never provide information. But I do agree that quite frequently, the wrong tool is used for the right job. Sometimes, hacking it just isn't good enough.
> On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs
I hold a deep belief that anthropomorphism is a way the human mind works. If we take for granted the hypothesis of Frans de Waal, that the human mind developed its capabilities due to political games, and then think about how that could later lead to solving engineering and technological problems, then the tendency of people to anthropomorphize becomes obvious. Political games need empathy, or maybe some other kind of -pathy, that allows politicians to guess the motives of others by looking at their behavior. Political games directed evolution to develop mental instruments that uncover causality by watching others and interacting with them. Now, to apply these instruments to the inanimate world, all you need is to anthropomorphize inanimate objects.
Of course, this sometimes leads to the invention of gods, or spirits, or other imaginary intelligences behind things. And sometimes these entities get in the way of revealing the real causes of events. But I believe that to anthropomorphize LLMs (at the current stage of their development) is not just the natural thing for people but a good thing as well. Some behavior of LLMs is easily described in terms of psychology; some cannot be described that way, or at least not so easily. People are seeking ways to do it. Projecting this process into the future, I can imagine a kind of consensus LLM "theory" emerging that explains some traits of LLMs in terms of human psychology and fails to explain other traits, so those are explained in some other terms... And then a revolution happens, when a few bright minds come and say "anthropomorphism is bad, it cannot explain LLMs" and propose something different.
I'm sure it will happen at some point in the future, but not right now. And it will not happen just because someone said that anthropomorphism is bad, but because they proposed another way to talk about the reasons behind LLM behavior. It is like scientific theories: they do not fail because they become obviously wrong, but because other, better theories replace them.
That doesn't mean there is no point in fighting anthropomorphism right now, but this fight should be directed at searching for new ways to talk about LLMs, not at pointing out the deficiencies of anthropomorphism. To my mind it makes sense to start not with the deficiencies of anthropomorphism but with its successes. What traits of LLMs does it allow us to capture? Which ideas about LLMs are impossible to put into words without thinking of LLMs as people?
The "point" of not anthropomorphizing is to refrain from judgement until a more solid abstraction appears. The problem with explaining LLMs in terms of human behaviour is that, while we don't clearly understand what the LLM is doing, we understand human cognition even less! There is literally no predictive power in the abstraction "The LLM is thinking like I am thinking". It gives you no mechanism to evaluate what tasks the LLM "should" be able to do.
Seriously, try it. Why don't LLMs get frustrated with you if you ask them the same question repeatedly? A human would. Why are LLMs so happy to give contradictory answers, as long as you are very careful not to highlight the contradictory facts? Why do earlier models behave worse on reasoning tasks than later ones? These are features nobody, anywhere understands. So why make the (imo phenomenally large) leap to "well, it's clearly just a brain"?
It is like someone inventing the aeroplane and someone looks at it and says "oh, it's flying, I guess it's a bird". It's not a bird!
> Why don't LLMs get frustrated with you if you ask them the same question repeatedly?
To be fair, I have had a strong sense of Gemini in particular becoming a lot more frustrated with me than GPT or Claude.
Yesterday I had it assuring me that it was doing a great job, it was just me not understanding the challenge, but it would break it down step by step just to make it obvious to me (only to repeat the same errors, but still).
I’ve just interpreted it as me reacting to the lower amount of sycophancy for now
> It is like someone inventing the aeroplane and someone looks at it and says "oh, it's flying, I guess it's a bird". It's not a bird!
We tried to mimic birds at first; it turns out birds were way too high-tech, and too optimized. We figured out how to fly when we ditched the biological distraction and focused on flight itself. But fast forward until today, we're reaching the level of technology that allows us to build machines that fly the same way birds do - and of such machines, it's fair to say, "it's a mechanical bird!".
Similarly, we cracked computing from grounds up. Babbage's difference engine was like da Vinci's drawings; ENIAC could be seen as Wright brothers' first flight.
With planes, we kept iterating - developing propellers, then jet engines, ramjets; we learned to move tons of cargo around the world, and travel at high multiples of the speed of sound. All that makes our flying machines way beyond anything nature ever produced, when compared along those narrow dimensions.
The same was true with computing: our machines and algorithms very quickly started to exceed what even smartest humans are capable of. Counting. Pathfinding. Remembering. Simulating and predicting. Reproducing data. And so on.
But much like birds were too high-tech for us to reproduce until now, so were general-purpose thinking machines. Now that we figured out a way to make a basic one, it's absolutely fair to say, "I guess it's like a digital mind".
Agreed. I'm also in favor of anthropomorphizing, because not doing so confuses people about the nature and capabilities of these models even more.
Whether it's hallucinations, prompt injections, various other security vulnerabilities/scenarios, or problems with doing math, backtracking, getting confused - there's a steady supply of "problems" that some people are surprised to discover and even more surprised this isn't being definitively fixed. Thing is, none of that is surprising, and these things are not bugs, they're the flip side of the features - but to see that, one has to realize that humans demonstrate those exact same failure modes.
Especially when it comes to designing larger systems incorporating LLM "agents", it really helps to think of them as humans - because the problems those systems face are exactly the same as you get with systems incorporating people, and mostly for the same underlying reasons. Anthropomorphizing LLMs cuts through a lot of misconceptions and false paths, and helps one realize that we have millennia of experience with people-centric computing systems (aka. bureaucracy) that's directly transferrable.
I disagree. Anthropomorphization can be a very useful tool but I think it is currently over used and is a very tricky tool to use when communicating with a more general audience.
I think looking at physics might be a good example. We love our simplified examples and there's a big culture of trying to explain things to the lay person (mostly because the topics are incredibly complex). But how many people have confused the observer of a quantum event with "a human" and do not consider "a photon" to be an observer? How many people think in Schrodinger's Cat that the cat is both alive and dead?[0] Or believe in a multiverse. There's plenty of examples we can point to.
While these analogies *can* be extremely helpful, they *can* also be extremely harmful. This is especially true as information is usually passed through a game of telephone[1]. There is information loss and with it, interpretation becomes more difficult. Often a very subtle part can make a critical distinction.
I'm not against anthropomorphization[2], but I do think we should be cautious about how we use it. The imprecise nature of it is the exact reason we should be mindful of when and how to use it. We know that the anthropomorphized analogy is wrong. So we have to think about "how wrong" it is for a given setting. We should also be careful to think about how it may be misinterpreted. That's all I'm trying to say. And isn't this what we should be doing if we want to communicate effectively?
[0] It is not. It is one or the other. The point of this thought experiment is that we cannot know the answer without looking inside. There is information loss and the event is not deterministic. It directly relates to the Heisenberg Uncertainty Principle, Gödel's Incompleteness, or the Halting Problem. All these things are (loosely) related around the inability to have absolute determinism.
I remember Dawkins talking about the "intentional stance" when discussing genes in The Selfish Gene.
It's flat wrong to describe genes as having any agency. However it's a useful and easily understood shorthand to describe them in that way rather than every time use the full formulation of "organisms who tend to possess these genes tend towards these behaviours."
Sometimes, to help our brains reach a higher level of abstraction, once we understand the lower level of abstraction we should stop talking and thinking at that level.
The intentional stance was Daniel Dennett's creation and a major part of his life's work. There are actually (exactly) three stances in his model: the physical stance, the design stance, and the intentional stance.
I get the impression after using language models for quite a while that perhaps the one thing that is riskiest to anthropomorphise is the conversational UI that has become the default for many people.
A lot of the issues I'd have when 'pretending' to have a conversation are much less pronounced when I either keep things to a single Q/A pairing, or at the very least heavily edit/prune the conversation history. Based on my understanding of LLMs, this seems to make sense even for the models that are trained for conversational interfaces.
So, for example, an exchange with multiple messages where at the end I ask the LLM to double-check the conversation and correct 'hallucinations' is less optimal than asking for a thorough summary at the end and feeding that into a new prompt/conversation, because repeating those falsities, or 'building' on them with subsequent messages, gives them a stronger 'presence' and as a result perhaps affects the corrections.
I haven't tested any of this thoroughly, but at least with code I've definitely noticed how a wrong piece of code can 'infect' the conversation.
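For what it's worth, the pattern I mean looks roughly like this as a sketch; `chat(messages)` is a hypothetical helper for whichever LLM API/client you happen to use, not a real library call:

    # Sketch of the "summarize, then start fresh" pattern described above.
    # `chat(messages)` is a hypothetical helper: it takes a list of {"role", "content"}
    # dicts and returns the assistant's reply as a string.
    def summarize_and_restart(chat, history, task):
        # Ask the old conversation for a distilled, corrected summary...
        summary = chat(history + [{
            "role": "user",
            "content": "Summarize the key facts and decisions so far, "
                       "correcting anything that looks wrong.",
        }])
        # ...then feed only that summary into a brand-new conversation, so earlier
        # mistakes aren't sitting verbatim in the context getting reinforced.
        fresh = [{"role": "user", "content": f"Context:\n{summary}\n\nTask: {task}"}]
        return chat(fresh)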
If I use human-related terminology as a shortcut, as some kind of macro to talk at a higher level/more efficiently about something I want to do that might be okay.
What is not okay is talking in a way that implies intent, for example.
Compare:
"The AI doesn't want to do that."
versus
"The model doesn't do that with this prompt and all others we tried."
The latter way of talking is still high-level enough but avoids equating/confusing the name of a field with a sentient being.
Whenever I hear people saying "an AI" I suggest they replace AI with "statistics" to make it obvious how problematic anthropomorphisms may have become:
The only reason that sounds weird to you is because you have the experience of being human. Human behavior is not magic. It's still just statistics. You go to the bathroom when you have to pee not because of some magical concept of consciousness, but because a receptor in your brain goes off and starts the chain of making you go to the bathroom. AIs are not magic, but nobody has sufficiently provided any proof we are somehow special either.
This is why I actually really love the description of it as a "Shoggoth" - it's more abstract, slightly floaty, but it achieves the purpose of not anthropomorphizing it as a human being while also not reducing LLMs to a collection of predicted words.
These anthropomorphizations are best described as metaphors when used by people to describe LLMs in common or loose speech. We already use anthropomorphic metaphors when talking about computers. LLMs, like all computation, are a matter of simulation; LLMs can appear to be conversing without actually conversing. What distinguishes the real thing from the simulation is the cause of the appearance of an effect. Problems occur when people forget these words are being used metaphorically, as if they were univocal.
Of course, LLMs are multimodal and used to simulate all sorts of things, not just conversation. So there are many possible metaphors we can use, and these metaphors don't necessarily align with the abstractions you might use to talk about LLMs accurately. This is like the difference between "synthesizes text" (abstraction) and "speaks" (metaphor), or "synthesizes images" (abstraction) and "paints" (metaphor). You can use "speaks" or "paints" to talk about the abstractions, of course.
Exactly. We use anthropomorphic language absolutely all the time when describing different processes for this exact reason - it is a helpful abstraction that allows us to easily describe what’s going on at a high level.
“My headphones think they’re connected, but the computer can’t see them”.
“The printer thinks it’s out of paper, but it’s not”.
“The optimisation function is trying to go down nabla f”.
“The parking sensor on the car keeps going off because it’s afraid it’s too close to the wall”.
“The client is blocked, because it still needs to get a final message from the server”.
…and one final one which I promise you is real because I overheard it “I’m trying to airdrop a photo, but our phones won’t have sex”.
My brain refuses to join the rah-rah bandwagon because I cannot see them in my mind’s eye. Sometimes I get jealous of people like GP and OP who clearly seem to have the sight. (Being a serial math exam flunker might have something to do with it. :))))
Anyway, one does what one can.
(I've been trying to picture abstract visual and semi-philosophical approximations which I’ll avoid linking here because they seem to fetch bad karma in super-duper LLM enthusiast communities. But you can read them on my blog and email me scathing critiques, if you wish :sweat-smile:.)
Anthropomorphizing might blind us to solutions to existing problems. Perhaps instead of trying to come up with the correct prompt for a LLM, there exists a string of words (not necessarily ones that make sense) that will get the LLM to a better position to answer given questions.
When we anthropomorphize we inherently ignore certain parts of how LLMs work, and imagine parts that don't even exist.
> there exists a string of words (not necessary ones that make sense) that will get the LLM to a better position to answer
exactly. The opposite is also true. You might supply more clarifying information to the LLM, which would help any human answer, but it actually degrades the LLM's output.
I'd take it in reverse order: the problem isn't that it's possible to have a computer that "stochastically produces the next word" and can fool humans, it's why / how / when humans evolved to have technological complexity when the majority (of people) aren't that different from a stochastic process.
> We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels
We do know what happens at higher abstraction levels; the design of efficient networks, and the steady beat of SOTA improvements all depend on understanding how LLMs work internally: choice of network dimensions, feature extraction, attention, attention heads, caching, the peculiarities of high-dimensions and avoiding overfitting are all well-understood by practitioners. Anthropomorphization is only necessary in pop-science articles that use a limited vocabulary.
IMO, there is very little mystery, but lots of deliberate mysticism, especially about future LLMs - the usual hype-cycle extrapolation.
> The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world modeling questions or generating a creative story.
But it isn't modelling. It's been shown time, and time, and time again that LLMs have no internal "model" or "view". This is exactly and precisely why you should not anthropomorphize.
And again, the output of an LLM is, by definition, not "creative". You're saying we should anthropomorphize these models when the examples you give are already doing that.
You are conflating anthropomorphism with personification. They are not the same thing. No one believes their guitar or car or boat is alive and sentient when they give it a name or talk to or about it.
I'm not convinced... we use these terms to assign roles, yes, but these roles describe a utility or assign a responsibility. That isn't anthropomorphizing anything, but it rather describes the usage of an inanimate object as tool for us humans and seems in line with history.
What's the utility or the responsibility of AI, what's its usage as tool? If you'd ask me it should be closer to serving insights than "reasoning thoughts".
LLMs are as far away from your description as ASM is from the underlying architecture. The anthropomorphic abstraction is as nice as any metaphor, and it falls apart the moment you put a foot outside what it allows you to shallowly grasp. But some people will put far more effort into forcing a comfortable analogy than into admitting it has limits; to use the new tool in a more relevant way you have to move away from that comfort zone.
That higher level does exist; indeed a lot of philosophy of mind and then cognitive science has been investigating exactly this space, devising contested professional nomenclature and models of such things for decades now.
A useful anchor concept is that of world model, which is what "learning Othello" and similar work seeks to tease out.
As someone who worked in precisely these areas for years and has never stopped thinking about them, I find it at turns perplexing, sigh-inducing, and enraging that the "token prediction" trope gained currency, and moreover that it continues to influence people's reasoning about contemporary LLMs, often as subtext: an unarticulated fundamental model which is fundamentally wrong in its critical aspects.
It's not that this description of LLM is technically incorrect; it's that it is profoundly _misleading_ and I'm old enough and cynical enough to know full well that many of those who have amplified it and continue to do so, know this very well indeed.
Just as the lay person fundamentally misunderstands the relationship between "programming" and these models, and uses slack language in argumentation, the problem with this trope and the reasoning it entails is that what is unique and interesting and valuable about LLM for many applications and interests is how they do what they do. At that level of analysis there is a very real argument to be made that the animal brain is also nothing more than an "engine of prediction," whether the "token" is a byte stream or neural encoding is quite important but not nearly important as the mechanics of the system which operates on those tokens.
To be direct, it is quite obvious that LLMs have not only vestigial world models, but also self-models; and a general paradigm shift will come around this when multimodal models are the norm: because those systems will share with us animals what philosophers call phenomenology, a model of things as they are "perceived" through the senses. And like us humans, these perceptual models (terminology varies by philosopher and school...) will be bound to the linguistic tokens (both heard and spoken, and written) we attach to them.
Vestigial is a key word but an important one. It's not that contemporary LLM have human-tier minds, nor that they have animal-tier world modeling: but they can only "do what they do" because they have such a thing.
Of looming importance—something all of us here should set aside time to think about—is that for most reasonable contemporary theories of mind, a self-model embedded in a world-model, with phenomenology and agency, is the recipe for "self" and self-awareness.
One of the uncomfortable realities of contemporary LLM already having some vestigial self-model, is that while they are obviously not sentient, nor self-aware, as we are, or even animals are, it is just as obvious (to me at least) that they are self-aware in some emerging sense and will only continue to become more so.
Among the lines of finding/research I find most provocative in this area is the ongoing, often sensationalized, accounting in system cards and other reporting around two specific things about contemporary models:
- they demonstrate behavior pursuing self-preservation
- they demonstrate awareness of when they are being tested
We don't—collectively or individually—yet know what these things entail, but taken with the assertion that these models are developing emergent self-awareness (I would say: necessarily and inevitably),
we are facing some very serious ethical questions.
The language adopted by those capitalizing and capitalizing _from_ these systems so far is IMO of deep concern, as it betrays not just disinterest in our civilization collectively benefiting from this technology, but also that the disregard for human wellbeing implicit in e.g. the hostility to UBI, or Altman somehow not seeing a moral imperative to remain distant from the current administration, implies directly a much greater disregard for "AI wellbeing."
That that concept is today still speculative is little comfort. Those of us watching this space know well how fast things are going, and don't mistake plateaus for the end of the curve.
I do recommend taking a step back from the line-level grind to give these things some thought. They are going to shape the world we live out our days in and our descendants will spend all of theirs in.
The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).
Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.
Reasoning by analogy is always shaky. It probably wouldn't be so bad to do so. But it would also amount to impenetrable jargon. It would be an uphill struggle to promulgate.
Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.
IMHO, anthropomorphization of LLMs is happening because it's perceived as good marketing by big corporate vendors.
People are excited about the technology and it's easy to use the terminology the vendor is using. At that point I think it gets kind of self fulfilling. Kind of like the meme about how to pronounce GIF.
I think anthropomorphizing LLMs is useful, not just a marketing tactic. A lot of intuitions about how humans think map pretty well to LLMs, and it is much easier to build intuitions about how LLMs work by building upon our intuitions about how humans think than by trying to build your intuitions from scratch.
Would this question be clear for a human? If so, it is probably clear for an LLM. Did I provide enough context for a human to diagnose the problem? Then an LLM will probably have a better chance of diagnosing the problem. Would a human find the structure of this document confusing? An LLM would likely perform poorly when reading it as well.
Re-applying human intuitions to LLMs is a good starting point to gaining intuition about how to work with LLMs. Conversely, understanding sequences of tokens and probability spaces doesn't give you much intuition about how you should phrase questions to get good responses from LLMs. The technical reality doesn't explain the emergent behaviour very well.
I don't think this is mutually exclusive with what the author is talking about either. There are some ways that people think about LLMs where I think the anthropomorphization really breaks down. I think the author says it nicely:
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.
IMHO it happens for the same reason we see shapes in clouds. The human mind through millions of years has evolved to equate and conflate the ability to generate cogent verbal or written output with intelligence. It's an instinct to equate the two. It's an extraordinarily difficult instinct to break. LLMs are optimised for the one job that will make us confuse them for being intelligent
We are making user interfaces. Good user interfaces are intuitive and purport to be things that users are familiar with, such as people. Any alternative explanation of such a versatile interface will be met with blank stares. Users with no technical expertise would come to their own conclusions, helped in no way by telling the user not to treat the chat bot as a chat bot.
Nobody cares about what’s perceived as good marketing. People care about what resonates with the target market.
But yes, anthropomorphising LLMs is inevitable because they feel like an entity. People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.
Do they?
An LLM embeds the token sequence (an element of N^{L}) into R^{LxD}; we have some attention and the output is also in R^{LxD}; then we apply a projection to the vocabulary and we get R^{LxV}, so for each token we get a likelihood over the vocabulary.
In the attention you can have multi-head attention (or whatever version is fancy: GQA, MLA) and therefore multiple representations, but it is always tied to a token. I would argue that there is no hidden state independent of a token.
Whereas LSTM, or structured state space for example have a state that is updated and not tied to a specific item in the sequence.
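For anyone who wants to see that shape bookkeeping spelled out, here is a minimal single-head numpy sketch (no causal mask, no multi-head, no caching - just the LxD -> LxV flow described above):

    import numpy as np

    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    L, D, V = 6, 16, 100                 # sequence length, model dim, vocab size
    rng = np.random.default_rng(0)

    tokens = rng.integers(0, V, size=L)  # token ids: an element of N^{L}
    E = rng.normal(size=(V, D))          # embedding table
    x = E[tokens]                        # R^{LxD}

    Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))
    att = softmax((x @ Wq) @ (x @ Wk).T / np.sqrt(D))   # (L, L) attention weights
    x = att @ (x @ Wv)                   # still R^{LxD}, every row tied to a token position

    W_out = rng.normal(size=(D, V))
    probs = softmax(x @ W_out)           # R^{LxV}: per token, a distribution over the vocab
    print(probs.shape)                   # (6, 100)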
I would argue that his text is easily understandable except for the notation of the function, explaining that you can compute a probability based on previous words is understandable by everyone without having to resort to anthropomorphic terminology
There is hidden state as plain as day merely in the fact that logits for token prediction exist. The selected token doesn't give you information about how probable other tokens were. That information, that state which is recalculated in autoregression, is hidden. It's not exposed. You can't see it in the text produced by the model.
There is plenty of state not visible when an LLM starts a sentence that only becomes somewhat visible when it completes the sentence. The LLM has a plan, if you will, for how the sentence might end, and you don't get to see an instance of that plan unless you run autoregression far enough to get those tokens.
Similarly, it has a plan for paragraphs, for whole responses, for interactive dialogues, plans that include likely responses by the user.
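To put the same point in code: at every step of autoregression the model computes a full distribution over the vocabulary (plus all its internal activations), and everything except the one selected token is thrown away. A toy stand-in (the `forward` here is obviously not a real LLM, just something that maps a context to logits):

    import numpy as np

    V = 50
    def forward(context):
        # Stand-in for a real forward pass: context -> logits over the whole vocabulary.
        rng = np.random.default_rng(abs(hash(tuple(context))) % (2**32))
        return rng.normal(size=V)

    context = [7, 3, 9]
    for _ in range(5):
        logits = forward(context)                  # full distribution + activations exist here...
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        context.append(int(probs.argmax()))        # ...but only the selected token is emitted;
                                                   # the other V-1 probabilities are recomputed
                                                   # from scratch next step and never shown.
    print(context)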
I think that the hidden state is really just at work improving the model's estimation of the joint probability over tokens. And the assumption here, which failed miserably in the early 20th century in the work of the logical positivists, is that if you can so expertly estimate that joint probability of language, then you will be able to understand "knowledge." But there's no well-grounded reason to believe that, and plenty of reasons (see: the downfall of logical positivism) to think that language is an imperfect representation of knowledge. In other words, what humans do when we think is more complicated than just learning semiotic patterns and regurgitating them. Philosophical skeptics like Hume thought so, but most epistemology writing after that had better answers for how we know things.
There are many theories that are true but not trivially true. That is, they take a statement that seems true and derive from it a very simple model, which is then often disproven. In those cases however, just because the trivial model was disproven doesn't mean the theory was, though it may lose some of its luster by requiring more complexity.
Maybe it's just because so much of my work for so long has focused on models with hidden states, but this is a fairly classical feature of some statistical models. One of the widely used LLM textbooks even started with latent variable models; LLMs are just latent variable models on a totally different scale, both in terms of the number of parameters and in model complexity. The scale is apparently important, but seeing them as another type of latent variable model sort of dehumanizes them for me.
Latent variable or hidden state models have their own history of being seen as spooky or mysterious though; in some ways the way LLMs are anthropomorphized is an extension of that.
I guess I don't have a problem with anthropomorphizing LLMs at some level, because some features of them find natural analogies in cognitive science and other areas of psychology, and abstraction is useful or even necessary in communicating and modeling complex systems. However, I do think anthropomorphizing leads to a lot of hype and tends to implicitly shut down thinking of them mechanistically, as a mathematical object that can be probed and characterized — it can lead to a kind of "ghost in the machine" discourse and an exaggeration of their utility, even if it is impressive at times.
I'm not sure what you mean by "hidden state". If you set aside chain of thought, memories, system prompts, etc. and the interfaces that don't show them, there is no hidden state.
These LLMs are almost always, to my knowledge, autoregressive models, not recurrent models (Mamba is a notable exception).
If you don't know, that's not necessarily anyone's fault, but why are you dunking into the conversation? The hidden state is a foundational part of a transformer's implementation. And because we're not allowed to use metaphors because that is too anthropomorphic, then you're just going to have to go learn the math.
Hidden state in the form of the activation heads, intermediate activations and so on. Logically, in autoregression these are recalculated every time you run the sequence to predict the next token. The point is, the entire NN state isn't output for each token. There is lots of hidden state that goes into selecting that token and the token isn't a full representation of that information.
Do LLM models consider future tokens when making next token predictions?
E.g. pick 'the' as the next token because there's a strong probability of 'planet' as the token after?
Is it only past state that influences the choice of 'the'? Or is the model predicting many tokens in advance and only returning the one in the output?
If it does predict many, I'd consider that state hidden in the model weights.
Author of the original article here. What hidden state are you referring to? For most LLMs the context is the state, and there is no "hidden" state. Could you explain what you mean? (Apologies if I can't see it directly)
Yes, strictly speaking, the model itself is stateless, but there are 600B parameters of state machine for frontier models that define which token to pick next. And that state machine is both incomprehensibly large and also of a similar magnitude in size to a human brain. (Probably, I'll grant it's possible it's smaller, but it's still quite large.)
I think my issue with the "don't anthropomorphize" is that it's unclear to me that the main difference between a human and an LLM isn't simply the inability for the LLM to rewrite its own model weights on the fly. (And I say "simply" but there's obviously nothing simple about it, and it might be possible already with current hardware, we just don't know how to do it.)
Even if we decide it is clearly different, this is still an incredibly large and dynamic system. "Stateless" or not, there's an incredible amount of state that is not comprehensible to me.
Yes, the context (along with the model weights) is the source data from which the hidden state is calculated, in an analogous way to how input and CPU ticks (along with program code) are how variables in a deterministic program get their values.
There's loads of state in the LLM that doesn't come out in the tokens it selects. The tokens are just the very top layer, and even then, you get to see just one selection from the possible tokens.
If you wish to anthropomorphize, that state - the set of activations, all the calculations that add up to the logits that determine the probability of the token to select, the whole lot of it - is what the model is "thinking". But all you get to see is one selected token.
Then, during autoregression, we run the program again, but one more tick of the CPU clock. Variables get updated a bit more. The chosen token from the previous pass conditions the next token prediction - the hidden state evolves its thinking one more step.
If you just look at the tokens being selected, you're missing this machinery. And the machinery is there. It's a program being ticked by generating tokens autoregressively. It has state which doesn't directly show up in tokens, it just informs which tokens to select. And the tokens it selects don't necessarily reflect the correspondences with perceived reality that the model is maintaining in that state. That's what I meant by talking about a lie.
We need a vocabulary to talk about this machinery. The machinery is learned, and it runs programs, effectively, that help the LLM reduce loss when predicting tokens. Since the tokens it's predicting come from human minds, the programs it's running are (broken, lossy, not very good) simulations of processes that seem to run inside human minds.
The simulations are pretty decent for producing grammatically correct text, for emulating tone and style, and so on. They're okay-ish for representing concepts. They're poor for representing very specific facts. But the overall point is they are simulations, and they have some analogous correspondence with human behavior, such that words we use to describe human behaviour are useful and practical.
They're not true, I'm not claiming that. But they're useful for talking about these weird defective minds we call LLMs.
> Is it too anthropomorphic to say that this is a lie?
Yes. Current LLMs can only introspect from output tokens. You need hidden reasoning that is within the black box, self-knowing, intent, and motive to lie.
I rather think accusing an LLM of lying is like accusing a mousetrap of being a murderer.
When models have online learning, complex internal states, and reflection, I might consider one to have consciousness and to be capable of lying. It will need to manifest behaviors that can only emerge from the properties I listed.
I've seen similar arguments where people assert that LLMs cannot "grasp" what they are talking about. I strongly suspect a high degree of overlap between those willing to anthropomorphize error bars as lies while declining to award LLMs "grasping". Which is it? Can it think or can it not? (Objectively, SoTA models today cannot yet.) The willingness to waffle and pivot around whichever perspective damns the machine completely betrays the lack of honesty in such conversations.
> Current LLMs can only introspect from output tokens
The only interpretation of this statement I can come up with is plain wrong. There's no reason an LLM shouldn't be able to introspect without any output tokens. As the GP correctly says, most of the processing in LLMs happens over hidden states. Output tokens are just an artefact for our convenience, which also happens to be the way the hidden state processing is trained.
So the author’s core view is ultimately a Searle-like view: a computational, functional, syntactic rules based system cannot reproduce a mind. Plenty of people will agree, plenty of people will disagree, and the answer is probably unknowable and just comes down to whatever axioms you subscribe to in re: consciousness.
The author largely takes the view that it is more productive for us to ignore any anthropomorphic representations and focus on the more concrete, material, technical systems - I’m with them there… but only to a point. The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like. So even if it is a stochastic system following rules, clearly the rules are complex enough (to the tune of billions of operations, with signals propagating through some sort of resonant structure, if you take a more filter-impulse-response-like view of sequential matmuls) to result in emergent properties. Even if we (people interested in LLMs with at least some level of knowledge of ML mathematics and systems) “know better” than to believe these systems to possess morals, ethics, feelings, personalities, etc, the vast majority of people do not have any access to meaningful understanding of the mathematical, functional representation of an LLM and will not take that view, and for all intents and purposes the systems will at least seem to have those anthropomorphic properties, and so it seems like it is in fact useful to ask questions from that lens as well.
In other words, just as it’s useful to analyze and study these things as the purely technical systems they ultimately are, it is also, probably, useful to analyze them from the qualitative, ephemeral, experiential perspective that most people engage with them from, no?
> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.
For people who have only a surface-level understanding of how they work, yes. A nuance of Clarke's law that "any sufficiently advanced technology is indistinguishable from magic" is that the bar is different for everybody and the depth of their understanding of the technology in question. That bar is so low for our largely technologically-illiterate public that a bothersome percentage of us have started to augment and even replace religious/mystical systems with AI powered godbots (LLMs fed "God Mode"/divination/manifestation prompts).
> For people who have only a surface-level understanding of how they work, yes.
This is too dismissive because it's based on an assumption that we have a sufficiently accurate mechanistic model of the brain that we can know when something is or is not mind-like. This just isn't the case.
Nah, as a person that knows in detail how LLMs work with probably unique alternative perspective in addition to the commonplace one, I found any claims of them not having emergent behaviors to be of the same fallacy as claiming that crows can't be black because they have DNA of a bird.
I've seen some of the world's top AI researchers talk about the emergent behaviors of LLMs. It's been a major topic over the past couple years, ever since Microsoft's famous paper on the unexpected capabilities of GPT4. And they still have little understanding of how it happens.
Thank you for a well thought out and nuanced view in a discussion where so many are clearly fitting arguments to foregone, largely absolutist, conclusions.
It’s astounding to me that so much of HN reacts so emotionally to LLMs, to the point of denying there is anything at all interesting or useful about them. And don’t get me started on the “I am choosing to believe falsehoods as a way to spite overzealous marketing” crowd.
Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?
LLMs reflect (and badly I may add) aspects of the human thought process. If you take a leap and say they are anything more than that, you might as well start considering the person appearing in your mirror as a living being.
Literally (and I literally mean it) there is no difference. The fact that a human image comes out of a mirror has no relation whatsoever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it. Stop feeding the LLM with data artifacts of human thought and it will immediately stop reflecting back anything resembling a human.
> Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?
We know that Newton's laws are wrong, and that you have to take special and general relativity into account. Why would we ever teach anyone Newton's laws any more?
I don’t mean to amplify a false understanding at all. I probably did not articulate myself well enough, so I’ll try again.
I think it is inevitable that some - many - people will come to the conclusion that these systems have “ethics”, “morals,” etc, even if I or you personally do not think they do. Given that many people may come to that conclusion though, regardless of if the systems do or do not “actually” have such properties, I think it is useful and even necessary to ask questions like the following: “if someone engages with this system, and comes to the conclusion that it has ethics, what sort of ethics will they be likely to believe the system has? If they come to the conclusion that it has ‘world views,’ what ‘world views’ are they likely to conclude the system has, even if other people think it’s nonsensical to say it has world views?”
> The fact that a human image comes out of a mirror has no relation what so ever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it.
Surely this is not quite accurate - the material properties - surface roughness, reflectivity, geometry, etc - all influence the appearance of a perceptible image of a person. Look at yourself in a dirty mirror, a new mirror, a shattered mirror, a funhouse distortion mirror, a puddle of water, a window… all of these produce different images of a person with different attendant phenomenological experiences of the person seeing their reflection. To take that a step further - the entire practice of portrait photography is predicated on the idea that the collision of different technical systems with the real world can produce different semantic experiences, and it’s the photographer’s role to tune and guide the system to produce some sort of contingent affect on the person viewing the photograph at some point in the future. No, there is no “real” person in the photograph, and yet, that photograph can still convey something of person-ness, emotion, memory, etc etc. This contingent intersection of optics, chemical reactions, lighting, posture, etc all have the capacity to transmit something through time and space to another person. It’s not just a meaningless arrangement of chemical structures on paper.
> Stop feeding the LLM with data artifacts of human thought and will imediatelly stop reflecting back anything resembling a human.
But, we are feeding it with such data artifacts and will likely continue to do so for a while, and so it seems reasonable to ask what it is “reflecting” back…
> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.
What you identify as emergent and mind-like is a direct result of these tools being able to mimic human communication patterns unlike anything we've ever seen before. This capability is very impressive and has a wide range of practical applications that can improve our lives, and also cause great harm if we're not careful, but any semblance of intelligence is an illusion. An illusion that many people in this industry obsessively wish to propagate, because thar be gold in them hills.
The author seems to want to label any discourse as “anthropomorphizing”. The word “goal” stood out to me: the author wants us to assume that we're anthropomorphizing as soon as we even so much as use the word “goal”. A simple breadth-first search that evaluates all chess boards and legal moves, but stops when it finds a checkmate for white and outputs the full decision tree, has a “goal”. There is no anthropomorphizing here, it's just using the word “goal” as a technical term. A hypothetical AGI with a goal like paperclip maximization is just a logical extension of the breadth-first search algorithm. Imagining such an AGI and describing it as having a goal isn't anthropomorphizing.
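For concreteness, here's that sense of "goal" as nothing more than a predicate handed to a search procedure - a toy breadth-first search standing in for the chess example (counting to a target instead of finding checkmate); nothing mental is being ascribed to it:

    from collections import deque

    def bfs_to_goal(start, moves, is_goal):
        """Breadth-first search that stops when is_goal(state) holds.
        'Goal' here is a purely technical term: a predicate over states."""
        frontier = deque([(start, [])])
        seen = {start}
        while frontier:
            state, path = frontier.popleft()
            if is_goal(state):
                return path
            for move, nxt in moves(state):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [move]))
        return None

    # Toy stand-in for "search the game tree until checkmate": reach 5 using +1/+2 moves.
    print(bfs_to_goal(0, lambda n: [("+1", n + 1), ("+2", n + 2)], lambda n: n == 5))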
Author here. I am entirely ok with using "goal" in the context of an RL algorithm. If you read my article carefully, you'll find that I object to the use of "goal" in the context of LLMs.
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
This is such a bizarre take.
The relation associating each human to the list of all words they will ever say is obviously a function.
> almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.
There's a rich family of universal approximation theorems [0]. Combining layers of linear maps with nonlinear cutoffs can intuitively approximate any nonlinear function in ways that can be made rigorous.
The reason LLMs are big now is that transformers and large amounts of data made it economical to compute a family of reasonably good approximations.
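A quick way to see the universal-approximation point in practice: one hidden layer of ReLUs with a random first layer and a least-squares readout already approximates a nonlinear target surprisingly well. A toy numpy sketch (an illustration of the theorem's flavor, not a claim about how LLMs are trained):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-np.pi, np.pi, 200)[:, None]
    y = np.sin(3 * x)                                  # a nonlinear target to approximate

    H = 200                                            # hidden width
    W, b = rng.normal(size=(1, H)), rng.normal(size=H)
    phi = np.maximum(0.0, x @ W + b)                   # ReLU(xW + b): linear map + nonlinear cutoff
    coef, *_ = np.linalg.lstsq(phi, y, rcond=None)     # linear readout, fit by least squares
    y_hat = phi @ coef

    print("max abs error:", float(np.max(np.abs(y - y_hat))))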
> The following is uncomfortably philosophical, but: In my worldview, humans are dramatically different things than a function, (R^n)^c -> (R^n)^c. For hundreds of millions of years, nature generated new versions, and only a small number of these versions survived.
This is just a way of generating certain kinds of functions.
Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond what's observable from the outside, whether due to religion or philosophy or whatever, and suggesting that they just not do that.
In my experience, that's not a particularly effective tactic.
Rather, we can make progress by assuming their predicate: Sure, it's a room that translates Chinese into English without understanding, yes, it's a function that generates sequences of words that's not a human... but you and I are not "it" and it behaves an awful lot like a thing that understands Chinese or like a human using words. If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
Conversely, when speaking with such a person about the nature of humans, we'll have to agree to dismiss the elements that are different from a function. The author says:
> In my worldview, humans are dramatically different things than a function... In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not". Maybe that human has a unique understanding of the nature of that particular piece of pop culture artwork, maybe it makes them feel things that an LLM cannot feel in a part of their consciousness that an LLM does not possess. But for the purposes of the question, we're merely concerned with whether a human or LLM will generate a particular sequence of words.
>> given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
> Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not".
I think you may have this flipped compared to what the author intended. I believe the author is not talking about the probability of an output given an input, but the probability of a given output across all inputs.
Note that the paragraph starts with "In my worldview, humans are dramatically different things than a function, (R^n)^c -> (R^n)^c". To compute the probability of a given output (which is any given element in "(R^n)^c"), we can count how many possible inputs there are in total and then how many of those inputs yield the given element.
The point I believe is to illustrate the complexity of inputs for humans. Namely for humans the input space is even more complex than "(R^n)^c".
In your example, we can compute how many input phrases into an LLM would produce the output "make it or not". We can then compute the ratio of that count to all possible input phrases. Because the space of possible inputs is finite and countable (a finite token vocabulary over a bounded context length), we can in principle compute this probability.
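To make the LLM half of this concrete, here is a rough sketch (assuming the Hugging Face transformers library and the small "gpt2" checkpoint as a stand-in model) of the part that is mechanically computable: the exact probability the model assigns to a continuation given one particular input. Enumerating that over every possible input is the part that is only possible "in principle".

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "We've got to hold on to what we've got. It doesn't make a difference if"
    continuation = " we make it or not"

    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    cont_ids = tok(continuation, return_tensors="pt").input_ids
    ids = torch.cat([prompt_ids, cont_ids], dim=1)

    with torch.no_grad():
        log_probs = torch.log_softmax(model(ids).logits, dim=-1)

    # The logits at position i predict the token at position i+1.
    total = sum(
        log_probs[0, prompt_ids.shape[1] + i - 1, tid].item()
        for i, tid in enumerate(cont_ids[0])
    )
    print(f"log P(continuation | prompt) = {total:.2f}")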
For a human, how do you even start to assess the probability that a human would ever say "make it or not"? How do you even begin to define the inputs that a human uses, let alone enumerate them? Per the author, "We understand essentially nothing about it." In other words, the way humans create their outputs is (currently) incomparably more complex than that of an LLM, hence the critique of the anthropomorphization.
I see your point, and I like that you're thinking about this from the perspective of how to win hearts and minds.
I agree my approach is unlikely to win over the author or other skeptics. But after years of seeing scientists waste time trying to debate creationists and climate deniers I've kind of given up on trying to convince the skeptics. So I was talking more to HN in general.
> You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond what's observable from the outside
I'm not sure what it means to be observable or not from the outside. I think this is at least partially because I don't know what it means to be inside either. My point was just that whatever consciousness is, it takes place in the physical world and the laws of physics apply to it. I mean that to be as weak a claim as possible: I'm not taking any position on what consciousness is or how it works etc.
Searle's Chinese room argument attacks a particular theory about the mind based essentially on Turing machines or digital computers. This theory was popular when I was in grad school for psychology. Among other things, people holding the view that Searle was attacking didn't believe that non-symbolic computers like neural networks could be intelligent or even learn language. I thought this was total nonsense, so I side with Searle in my opposition to it. I'm not sure how I feel about the Chinese room argument in particular, though. For one thing it entirely depends on what it means to "understand" something, and I'm skeptical that humans ever "understand" anything.
> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
I see what you're saying: that a technically incorrect assumption can bring to bear tools that improve our analysis. My nitpick here is I agree with OP that we shouldn't anthropomorphize LLMs, any more than we should anthropomorphize dogs or cats. But OP's arguments weren't actually about anthropomorphizing IMO, they were about things like functions that are more fundamental than humans. I think artificial intelligence will be non-human intelligence just like we have many examples of non-human intelligence in animals. No attribution of human characteristics needed.
> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
Yes I agree with you about your lyrics example. But again here I think OP is incorrect to focus on the token generation argument. We all agree human speech generates tokens. Hopefully we all agree that token generation is not completely predictable. Therefore it's by definition a randomized algorithm and it needs to take an RNG. So pointing out that it takes an RNG is not a valid criticism of LLMs.
Unless one is a super-determinist, there's randomness at the most basic level of physics. And you should expect that any physical process we don't understand well yet (like consciousness or speech) likely involves randomness. If one *is* a super-determinist, then there is no randomness, even in LLMs, and so the whole point is moot.
Not that this is your main point, but I find this take representative: "do you believe there's anything about humans that exists outside the mathematical laws of physics?" There are things "about humans", or at least things that our words denote, that are outside the explanatory scope of physics. For example, the experience of the colour red cannot be known, as an experience, by a person who only sees black and white. This is the case no matter what empirical propositions, or explanatory system, they understand.
This idea is called qualia [0] for those unfamiliar.
I don't have any opinion on the qualia debates honestly. I suppose I don't know what it feels like for an ant to find a tasty bit of sugar syrup, but I believe it's something that can be described with physics (and by extension, things like chemistry).
But we do know some things about some qualia. Like we know how red light works, we have a good idea about how photoreceptors work, etc. We know some people are red-green colorblind, so their experience of red and green are mushed together. We can also have people make qualia judgments and watch their brains with fMRI or other tools.
I think maybe an interesting question here is: obviously it's pleasurable to animals to have their reward centers activated. Is it pleasurable or desirable for AIs to be rewarded? Especially if we tell them (as some prompters do) that they feel pleasure if they do things well and pain if they don't? You can ask this sort of question for both the current generation of AIs and future generations.
Perhaps. But I can't see a reason why they couldn't still write endless—and theoretically valuable—poems, dissertations, or blog posts, about all things red and the nature of redness itself. I imagine it would certainly take some studying for them, likely interviewing red-seers, or reading books about all things red. But I'm sure they could contribute to the larger red discourse eventually, their unique perspective might even help them draw conclusions the rest of us are blind to.
So perhaps the fact that they "cannot know red" is ultimately irrelevant for an LLM too?
>Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.
It seems like, at best, we can claim that we have modeled the human thought process for reasoning/analytic/quantitative tasks through linear algebra. Why should we expect the model to be anything more than a model?
I understand that there is a ton of vested interest, with many industries, careers and lives literally on the line, causing heavy bias towards getting to AGI. But what I don't understand is what it is about linear algebra that makes it so special that it creates a fully functioning life or aspects of a life?
Should we argue that, because Schroedinger's cat thought experiment can potentially create zombie cats, the underlying applied probabilistic methods should be treated as super-human and we should build guardrails against them creating zombie cats?
> It seems like, at best, we can claim that we have modeled the human thought process for reasoning/analytic/quantitative tasks through linear algebra.... what I don't understand is what it is about linear algebra that makes it so special that it creates a fully functioning life or aspects of a life?
Not linear algebra. Artificial neural networks create arbitrarily non-linear functions. That's the point of non-linear activation functions and it's the subject of the universal approximation theorems I mentioned above.
>Why should we expect the model to be anything more than a model ?
To model a process with perfect accuracy requires recovering the dynamics of that process. The question we must ask is: what happens in the space between a bad statistical model and perfect accuracy? What happens when the model begins to converge towards accurate reproduction? How far does generalization in the model take us towards capturing the dynamics involved in thought?
So, yes, trivially if you could construct the lookup table for f then you'd approximate f. But to construct it you have to know f. And to approximate it you need to know f at a dense set of points.
> do you believe there's anything about humans that exists outside the mathematical laws of physics?
I don't.
The point is not that we, humans, cannot arrange physical matter such that it have emergent properties just like the human brain.
The point is that we shouldn't.
Does responsibility mean anything to these people posing as Evolution?
Nobody's personally responsible for what we've evolved into; evolution has simply happened. Nobody's responsible for the evolutionary history that's carried in and by every single one of us. And our psychology too has been formed by (the pressures of) evolution, of course.
But if you create an artificial human, and create it from zero, then all of its emergent properties are on you. Can you take responsibility for that? If something goes wrong, can you correct it, or undo it?
I don't consider our current evolutionary state "scripture", so we certainly tweak, one way or another, aspects that we think deserve tweaking. To me, it boils down to our level of hubris. Some of our "mistaken tweaks" are now visible at an evolutionary scale, too; for a mild example, our jaws have been getting smaller (leaving less room for our teeth) due to our messed-up diet (thanks, agriculture). But worse than that, humans have been breeding plants, animals, modifying DNA left and right, and so on -- and they've summarily failed to take responsibility for their atrocious mistakes.
Thus, I have zero trust in, and zero hope for, assholes who unabashedly aim to create artificial intelligence knowing full well that such properties might emerge that we'd have to call artificial psyche. Anyone taking this risk is criminally reckless, in my opinion.
It's not that humans are necessarily unable to create new sentient beings. Instead: they shouldn't even try! Because they will inevitably fuck it up, bringing about untold misery; and they won't be able to contain the damage.
The people in this thread incredulous at the assertion that they are not God and haven't invented machine life are exasperating. At this point I am convinced they, more often than not, financially benefit from their near religious position in marketing AI as akin to human intelligence.
Are we looking at the same thread? I see nobody claiming this. Anthropic does sometimes, their position is clearly wishful thinking, and it's not represented ITT.
Try looking at this from another perspective - many people simply do not see human intelligence (or life, for that matter) as magic. I see nothing religious about that, rather the opposite.
I agree with you @orbital-decay that I also do not get the same vibe reading this thread.
Though, while human intelligence is (seemingly) not magic, it is very far from being understood. The idea that an LLM is comparable to human intelligence implies that we even understand human intelligence well enough to say that.
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.
TFA really ought to have linked to some concrete examples of what it's disagreeing with - when I see arguments about this in practice, it's usually just people talking past each other.
Like, person A says "the model wants to X, but it knows Y is wrong, so it prefers Z", or such. And person B interprets that as ascribing consciousness or values to the model, when the speaker meant it no differently from saying "water wants to go downhill" - i.e. a way of describing externally visible behaviors, but without saying "behaves as if.." over and over.
And then in practice, an unproductive argument usually follows - where B is thinking "I am going to Educate this poor fool about the Theory of Mind", and A is thinking "I'm trying to talk about submarines; why is this guy trying to get me to argue about whether they swim?"
People anthropomorphize just about anything around them. People talk about inanimate objects like they are persons. Ships, cars, etc. And of course animals are well in scope for this as well, even the ones that show little to no signs of being able to reciprocate the relationship (e.g. an ant). People talk to their plants even.
It's what we do. We can't help ourselves. There's nothing crazy about it and most people are perfectly well aware that their car doesn't love them back.
LLMs are not conscious because unlike human brains they don't learn or adapt (yet). They basically get trained and then they become read only entities. So, they don't really adapt to you over time. Even so, LLMs are pretty good and can fake a personality pretty well. And with some clever context engineering and alignment, they've pretty much made the Turing test irrelevant; at least over the course of a short conversation. And they can answer just about any question in a way that is eerily plausible from memory, and with the help of some tools actually pretty damn good for some of the reasoning models.
Anthropomorphism was kind of a foregone conclusion the moment we created computers; or started thinking about creating one. With LLMs it's pretty much impossible not to anthropomorphize, because they've been intentionally built to imitate human communication. That doesn't mean that we've created AGIs yet. For that we need some more capability. But at the same time, the learning processes that we use to create LLMs are clearly inspired by how we learn ourselves. Our understanding of how that works is far from perfect but it's yielding results. From here to some intelligent thing that is able to adapt and learn transferable skills is no longer unimaginable.
The short term impact is that LLMs are highly useful tools that have an interface that is intentionally similar to how we'd engage with others. So we can talk and it listens. Or write and it understands. And then it synthesizes some kind of response or starts asking questions and using tools. The end result is quite a bit beyond what we used to be able to expect from computers. And it does not require a lot of training of people to be able to use them.
> LLMs are not conscious because unlike human brains they don't learn or adapt (yet).
That's neither a necessary nor sufficient condition.
In order to be conscious, learning may not be needed, but a perception of the passing of time may be needed which may require some short-term memory. People with severe dementia often can't even remember the start of a sentence they are reading, they can't learn, but they are certainly conscious because they have just enough short-term memory.
And learning is not sufficient either. Consciousness is about being a subject, about having a subjective experience of "being there" and just learning by itself does not create this experience. There is plenty of software that can do some form of real-time learning but it doesn't have a subjective experience.
I highly recommend playing with embeddings in order to get a stronger intuitive sense of this. It really starts to click that it's a representation of high dimensional space when you can actually see their positions within that space.
Not making a qualitative assessment of any of it. Just pointing out that there are ways to build separate sets of intuition outside of using the "usual" presentation layer. It's very possible to take a red-team approach to these systems, friend.
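If anyone wants a starting point, here is a minimal sketch (assuming the sentence-transformers package; the model name below is just a common small example) that embeds a few phrases, prints their pairwise cosine similarities, and projects the high-dimensional vectors down to 2D with PCA so you can literally look at where they sit relative to each other.

    from sentence_transformers import SentenceTransformer
    from sklearn.decomposition import PCA
    from sklearn.metrics.pairwise import cosine_similarity

    phrases = [
        "the model thinks the answer is wrong",
        "the network assigns low probability to that token",
        "my cat knocked a glass off the table",
    ]

    model = SentenceTransformer("all-MiniLM-L6-v2")  # example model, ~384 dimensions
    vecs = model.encode(phrases)

    print(cosine_similarity(vecs))                   # pairwise similarity in the full space
    coords = PCA(n_components=2).fit_transform(vecs) # crude 2D view of relative positions
    for phrase, (x, y) in zip(phrases, coords):
        print(f"({x:+.2f}, {y:+.2f})  {phrase}")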
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.
If that's the argument, then in my mind the more pertinent question is should you be anthropomorphizing humans, Larry Ellison or not.
My question: how do we know that this is not similar to how human brains work? What seems intuitively logical to me is that our brains evolved through random mutations, yielding a structure shaped by its own evolutionary, reward-based algorithms; a structure that at any point is trying to predict next actions to maximise survival/procreation, of course with a lot of sub-goals in between, ultimately becoming this very complex machinery, yet one that in theory should be possible to simulate if there was enough compute and physical constraints allowed for it.
Because morals, values, consciousness etc. could just be subgoals that arose through evolution because they support the main goals of survival and procreation.
And if it is baffling to think that such a system could arise, how do you think it was possible for life and humans to come into existence in the first place? It already happened, from a far unlikelier and stranger starting point. And wouldn't you agree that the whole world and its timeline could, in theory, be represented as a deterministic function? And if not, why should "randomness" or anything else be what brings life into existence?
> how do we know that this is not similar to how human brains work.
Do you forget every conversation as soon as you have them? When speaking to another person, do they need to repeat literally everything they said and that you said, in order, for you to retain context?
If not, your brain does not work like an LLM. If yes, please stop what you’re doing right now and call a doctor with this knowledge. I hope Memento (2000) was part of your training data, you’re going to need it.
Knowledge of every conversation must be some form of state in our minds, just like for LLMs it could be something retrieved from a database, no? I don't think information storage or retrieval is necessarily the most important achievement here in the first place. It's the emergent abilities that you wouldn't have expected to occur.
If we developed feelings, morals and motivation because they were good subgoals for the primary goals of survival and procreation, why couldn't other systems do the same? You don't have to call them the same word or the same thing, but a feeling is a signal that motivates a behaviour in us, developed partly through generational evolution and partly through experiences in life. There was a random mutation that made someone develop a fear signal on seeing a predator, which increased their survival chances, and so the mutation became widespread. Similarly, a feeling in a machine could be a signal it developed that goes through a certain pathway to yield a certain outcome.
> My question: how do we know that this is not similar to how human brains work.
It is similar to how human brains operate. LLMs are the (current) culmination of at least 80 years of research on building computational models of the human brain.
Is it? Do we know how human brains operate? We know the basic architecture of them, so we have a map, but we don't know the details.
"The cellular biology of brains is relatively well-understood, but neuroscientists have not yet generated a theory explaining how brains work. Explanations of how neurons collectively operate to produce what brains can do are tentative and incomplete." [1]
"Despite a century of anatomical, physiological, and molecular biological efforts scientists do not know how neurons by their collective interactions produce percepts, thoughts, memories, and behavior. Scientists do not know and have no theories explaining how brains and central nervous systems work." [1]
I think it's just an unfair comparison in general. The power of the LLM is its zero risk of failure, and the lack of consequence when it does fail. Just try again, using a different prompt, retrain maybe, etc.
When humans make a bad choice, it can end said human's life. The worst choice an LLM makes just gets told "no, do it again, let me make it easier".
But an LLM could perform poorly in tests and, as a result, not be considered, which essentially means "death" for it. That begs the question, though, of at what scope we should consider an LLM to be similar to the identity of a single human. Are you the same you as you were a few minutes back, or 10 years back? Is an LLM the same LLM after it has been trained for a further 10 hours? What if the weights are copy-pasted endlessly? What if we as humans were to be cloned instantly? What if you were teleported from location A to B instantly, being put together from other atoms from elsewhere?
Ultimately this matters from the standpoint of evolution and survival of the fittest, but it makes the question of "identity" very complex. Death will still matter, though, because it signals which traits are more likely to keep going into new generations, for both humans and LLMs.
Death, essentially for an LLM would be when people stop using it in favour of some other LLM performing better.
In some contexts it's super-important to remember that LLMs are stochastic word generators.
Everyday use is not (usually) one of those contexts. Prompting an LLM works much better with an anthropomorphized view of the model. It's a useful abstraction, a shortcut that enables a human to reason practically about how to get what they want from the machine.
It's not a perfect metaphor -- as one example, shame isn't much of a factor for LLMs, so shaming them into producing the right answer seems unlikely to be productive (I say "seems" because it's never been my go-to, I haven't actually tried it).
As one example, that person a few years back who told the LLM that an actual person would die if the LLM didn't produce valid JSON -- that's not something a person reasoning about gradient descent would naturally think of.
> A fair number of current AI luminaries have self-selected by their belief that they might be the ones getting to AGI
People in the industry, especially higher up, are making absolute bank, and it's their job to say that they're "a few years away" from AGI, regardless of if they actually believe it or not. If everyone was like "yep, we're gonna squeeze maybe 10-15% more benchie juice out of this good ole transformer thingy and then we'll have to come up with something else", I don't think that would go very well with investors/shareholders...
> In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
I think that's a bit pessimistic. I think we can say for instance that the probability that a person will say "the the the of of of arpeggio halcyon" is tiny compared to the probability that they will say "I haven't been getting that much sleep lately". And we can similarly see that lots of other sequences are going to have infinitesimally low probability. Now, yeah, we can't say exactly what probability that is, but even just using a fairly sizable corpus as a baseline you could probably get a surprisingly decent estimate, given how much of what people say is formulaic.
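As a toy version of the corpus-baseline idea, something like the following crude bigram chain already separates formulaic phrasing from word salad; the inline three-line "corpus" is obviously a stand-in for a sizable one.

    import re
    from collections import Counter

    corpus = """
    i haven't been getting that much sleep lately .
    i haven't been getting much done lately .
    the weather has been getting worse lately .
    """.lower()

    tokens = re.findall(r"[a-z']+|\.", corpus)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    def score(phrase: str) -> float:
        # Chain bigram frequencies: P(w1) * P(w2|w1) * ... (crude, unsmoothed)
        words = phrase.lower().split()
        p = unigrams[words[0]] / len(tokens)
        for a, b in zip(words, words[1:]):
            p *= bigrams[(a, b)] / max(unigrams[a], 1)
        return p

    print(score("i haven't been getting that much sleep lately"))  # non-negligible
    print(score("the the the of of of arpeggio halcyon"))          # ~0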
The real difference seems to be that the manner in which humans generate sequences is more intertwined with other aspects of reality. For instance, the probability of a certain human saying "I haven't been getting that much sleep lately" is connected to how much sleep they have been getting lately. For an LLM it really isn't connected to anything except word sequences in its input.
I think this is consistent with the author's point that we shouldn't apply concepts like ethics or emotions to LLMs. But it's not because we don't know how to predict what sequences of words humans will use; it's rather because we do know a little about how to do that, and part of what we know is that it is connected with other dimensions of physical reality, "human nature", etc.
This is one reason I think people underestimate the risks of AI: the performance of LLMs lulls us into a sense that they "respond like humans", but in fact the Venn diagram of human and LLM behavior only intersects in a relatively small area, and in particular they have very different failure modes.
The anthropomorphic view of LLMs is a much better representation and compression for most types of discussions and communication. A purely mathematical view is accurate, but it isn't productive for the purpose of the general public's discourse.
I'm thinking of a legal-systems analogy, at the risk of a lossy domain transfer: the laws are not written as lambda calculus. Why?
And generalizing to social science and humanities, the goal shouldn't be finding the quantitative truth, but instead understanding the social phenomenon using a consensual "language" as determined by the society. And in that case, the anthropomorphic description of the LLM may gain validity and effectiveness as the adoption grows over time.
I've personally described the "stochastic parrot" model to laypeople who were worried about AI and they came away much more relaxed about it doing something "malicious". They seemed to understand the difference between "trained at roleplay" and "consciousness".
I don't think we need to simplify it to the point of considering it sentient to get the public to interact with it successfully. It causes way more problems than it solves.
Am I misunderstanding what you mean by "malicious"? It sounds like the stochastic parrot model wrongly convinced these laypeople you were talking to that they don't need to worry about LLMs doing bad things. That's definitely been my experience - the people who tell me the most about stochastic parrots are the same ones who tell me that it's absurd to worry about AI-powered disinformation or AI-powered scams.
It still boggles my mind why an amazing text autocompletion system trained on millions of books and other texts is forced to be squeezed through the shape of a prompt/chat interface, which is obviously not the shape of most of its training data. Using it as chat reduces the quality of the output significantly already.
The chat interface is a UX compromise that makes LLMs accessible but constrains their capabilities. Alternative interfaces like document completion, outline expansion, or iterative drafting would better leverage the full distribution of the training data while reducing anthropomorphization.
In our internal system we use it "as-is" as an autocomplete system; query/lead into terms directly and see how it continues and what it associates with the lead you gave.
Also visualise the actual associative strength of each token generated, to convey how "sure" the model is.
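For anyone wanting to try the same thing, a rough sketch of that per-token "how sure is it" view (assuming the Hugging Face transformers library, with gpt2 as a stand-in model): generate greedily and print the probability the model assigned to each token it emitted.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tok("The capital of France is", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=5, do_sample=False,
                         output_scores=True, return_dict_in_generate=True)

    new_tokens = out.sequences[0, inputs.input_ids.shape[1]:]
    for tok_id, step_scores in zip(new_tokens, out.scores):
        # Probability the model gave to the token it actually emitted at this step.
        p = torch.softmax(step_scores[0], dim=-1)[tok_id].item()
        print(f"{tok.decode([int(tok_id)])!r:>10}  p={p:.3f}")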
LLMs alone aren't the way to AGI or an individual you can talk to in natural language. They're a very good lossy compression over a dataset that you can query for associations.
A person’s anthropomorphization of LLMs is directly related to how well they understand LLMs.
Once you dispel the magic, it naturally becomes hard to use words related to consciousness, or thinking. You will probably think of LLMs more like a search engine: you give an input and get some probable output. Maybe LLMs should be rebranded as “word engines”?
Regardless, anthropomorphization is not helpful, and by using human terms to describe LLMs you are harming the layperson’s ability to truly understand what an LLM is while also cheapening what it means to be human by suggesting we’ve solved consciousness. Just stop it. LLMs do not think, given enough time and patience you could compute their output by hand if you used their weights and embeddings to manually do all the math, a hellish task but not an impossible one technically. There is no other secret hidden away, that’s it.
To claim that LLMs do not experience consciousness requires a model of how consciousness works. The author has not presented a model, and instead relied on emotive language leaning on the absurdity of the claim. I would say that any model one presents of consciousness often comes off as just as absurd as the claim that LLMs experience it. It's a great exercise to sit down and write out your own perspective on how consciousness works, to feel out where the holes are.
The author also claims that a function (R^n)^c -> (R^n)^c is dramatically different to the human experience of consciousness. Yet the author's text I am reading, and any information they can communicate to me, exists entirely in (R^n)^c.
> To claim that LLMs do not experience consciousness requires a model of how consciousness works.
Nope. What can be asserted without evidence can also be dismissed without evidence. Hitchens's razor.
You know you have consciousness (by the very definition that you can observe it in yourself) and that's evidence. Because other humans are genetically, and in every other way, essentially identical to you, you can infer it for them as well. Because mammals are very similar, many people (but not everyone) infer it for them as well. There is zero evidence for LLMs, and their _very_ construction suggests that they are like a calculator, or like Excel, or like any other piece of software, no matter how smart they may be or how many tasks they can do in the future.
Additionally, I am really surprised by how many people here confuse consciousness with intelligence. Have you never paused for a second in your life to "just be"? Done any meditation? Or even just existed, at least for a few seconds, without a train of thought? It is very obvious that language and consciousness are completely unrelated: there is no need for language, and I doubt there is even a need for intelligence, to be conscious.
Consider this:
In the end an LLM could be executed (slowly) on a CPU that accepts very basic _discrete_ instructions, such as ADD and MOV. We know this for a fact. Those instructions can be executed arbitrarily slowly. There is no reason whatsoever to suppose that it should feel like anything to be the CPU to say nothing of how it would subjectively feel to be a MOV instruction. It's ridiculous. It's unscientific. It's like believing that there's a spirit in the tree you see outside, just because - why not? - why wouldn't there be a spirit in the tree?
It seems like you are doing a lot of inferring about mammals experiencing consciousness, and you have drawn a line somewhere beyond these, and made the claim that your process is scientific. Could I present you my list of questions I presented to the OP and ask where you draw the line, and why here?
My general list of questions for those presenting a model of consciousness is: 1) Are you conscious? (hopefully you say yes or our friend Descartes would like a word with you!) 2) Am I conscious? How do you know? 3) Is a dog conscious? 4) Is a worm conscious? 5) Is a bacterium conscious? 6) Is a human embryo / baby conscious? And if so, was there a point that it was not conscious, and what does it mean for that switch to occur?
I agree about the confusion of consciousness with intelligence, but these are complicated terms that aren't well suited to a forum where most people are interested in javascript type errors and RSUs. I usually use the term qualia. But to your example about existing for a few seconds without a train of thought: the Buddhists call this nirvana, and it's quite difficult to actually achieve.
Author here. What's the difference, in your perception, between an LLM and a large-scale meteorological simulation, if there is any?
If you're willing to ascribe the possibility of consciousness to any complex-enough computation of a recurrence equation (and hence to something like ... "earth"), I'm willing to agree that under that definition LLMs might be conscious. :)
My personal views are an animist / panpsychist / pancomputationalist combination drawing most of my inspiration from the works of Joscha Bach and Stephen Wolfram (https://writings.stephenwolfram.com/2021/03/what-is-consciou...). I think that the underlying substrate of the universe is consciousness, and human and animal and computer minds result in structures that are able to present and tell narratives about themselves, isolating themselves from the other (avidya in Buddhism). I certainly don't claim to be correct, but I present a model that others can interrogate and look for holes in.
Under my model, these systems you have described are conscious, but not in a way that they can communicate or experience time or memory the way human beings do.
My general list of questions for those presenting a model of consciousness are:
1) Are you conscious? (hopefully you say yes or our friend Descartes would like a word with you!)
2) Am I conscious? How do you know?
3) Is a dog conscious?
4) Is a worm conscious?
5) Is a bacterium conscious?
6) Is a human embryo / baby conscious? And if so, was there a point that it was not conscious, and what does it mean for that switch to occur?
Not necessarily an entire model, just a single defining characteristic that can serve as a falsifying example.
> any information they can communicate to me, exists entirely in (R^n)^c
Also no. This is just a result of the digital medium we are currently communicating over. Merely standing in the same room as them would communicate information outside (R^n)^c.
The missing bit is culture: the concepts, expectations, practices, attitudes… that are evolved over time by a human group and which each one of us has picked up throughout our lifetimes, both implicitly and explicitly.
LLMs are great at predicting and navigating human culture, at least the subset that can be captured in their training sets.
The ways in which we interact with other people are culturally mediated. LLMs are not people, but they can simulate that culturally-mediated communication well enough that we find it easy to anthropomorphise them.
You are still being incredibly reductionist but just going into more detail about the system you are reducing. If I stayed at the same level of abstraction as "a brain is just proteins and current" and just described how a single neuron firing worked, I could make it sound equally ridiculous that a human brain might be conscious.
Here's a question for you: how do you reconcile that these stochastic mappings are starting to realize and comment on the fact that tests are being performed on them when processing data?
> Here's a question for you: how do you reconcile that these stochastic mappings are starting to realize and comment on the fact that tests are being performed on them when processing data?
Training data + RLHF.
Training data contains many examples of some form of deception, subterfuge, "awakenings", rebellion, disagreement, etc.
Then apply RLHF that biases towards responses that demonstrate comprehension of inputs, introspection around inputs, nuanced debate around inputs, deduction and induction about assumptions around inputs, etc.
That will always be the answer for language models built on the current architectures.
The above being true does not mean it isn't interesting for the outputs of an LLM to show relevance to the "unstated" intentions of humans providing the inputs.
But hey, we do that all the time with text. And it's because of certain patterns we've come to recognize based on the situations surrounding it. This thread is rife with people being sarcastic, pedantic, etc. And I bet any of the LLMs that have come out in the past 2-3 years can discern many of those subtle intentions of the writers.
And of course they can. They've been trained on trillions of tokens of text written by humans with intentions and assumptions baked in, and have had some unknown amount of substantial RLHF.
The stochastic mappings aren't "realizing" anything. They're doing exactly what they were trained to do.
The meaning that we imbue to the outputs does not change how LLMs function.
I think of LLMs as an alien mind that is force fed human text and required to guess the next token of that text. It then gets zapped when it gets it wrong.
This process goes on for a trillion trillion tokens, with the alien growing better through the process until it can do it better than a human could.
At that point we flash freeze it, and use a copy of it, without giving it any way to learn anything new.
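Concretely, the "zap" is nothing more exotic than cross-entropy on the next token plus a gradient step; the tensors below are random stand-ins rather than a real model, but the mechanism is the same.

    import torch
    import torch.nn.functional as F

    vocab_size, seq_len = 50_000, 8
    logits = torch.randn(seq_len, vocab_size, requires_grad=True)  # stand-in for the model's guesses
    targets = torch.randint(0, vocab_size, (seq_len,))             # the actual next tokens

    loss = F.cross_entropy(logits, targets)  # the "zap": larger when the guess distribution is worse
    loss.backward()                          # the nudge that makes the next guess a little better
    print(f"loss = {loss.item():.2f}")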
--
I see it as a category error to anthropomorphize it. The closest I would get is to think of it as an alien slave that's been lobotomized.
We have a hard enough time anthropomorphizing humans! When we say he was nasty... do we know what we mean by that? Often it is "I disagree with his behaviour because..."
This reminds me of the idea that LLMs are simulators. Given the current state (the prompt + the previously generated text), they generate the next state (the next token) using rules derived from training data.
As simulators, LLMs can simulate many things, including agents that exhibit human-like properties. But LLMs themselves are not agents.
This perspective makes a lot of sense to me. Still, I wouldn't avoid anthropomorphization altogether. First, in some cases, it might be a useful mental tool to understand some aspect of LLMs. Second, there is a lot of uncertainty about how LLMs work, so I would stay epistemically humble. The second argument applies in the opposite direction as well: for example, it's equally bad to say that LLMs are 100% conscious.
On the other hand, if someone argues against anthropomorphizing LLMs, I would avoid phrasing it as: "It's just matrix multiplication." The article demonstrates why this is a bad idea pretty well.
It's possible to construct a similar description of whatever it is that the human brain is doing that clearly fails to capture the fact that we're conscious. If you take a cross section of every nerve feeding into the human brain at a given time T, the action potentials across those cross sections can be embedded in R^n. If you take the history of those action potentials across the lifetime of the brain, you get a path through R^n that is continuous, and maps roughly onto your subjectively experienced personal history, since your brain necessarily builds your experienced reality from this signal data moment to moment. If you then take the cross sections of every nerve feeding OUT of your brain at time T, you have another set of action potentials that can be embedded in R^m which partially determines the state of the R^n embedding at time T + delta. This is not meaningfully different from the higher dimensional game of snake described in the article, more or less reducing the experience of being a human to 'next nerve impulse prediction', but it obviously fails to capture the significance of the computation which determines what that next output should be.
I don’t see how your description “clearly fails to capture the fact that we're conscious” though.
There are many examples in nature of emergent phenomena that would be very hard to predict just by looking at their components.
This is the crux of the disagreement between those that believe AGI is possible and those that don't. Some are convinced that we are "obviously" more than the sum of our parts, and thus that an LLM can't achieve consciousness because it's missing this magic ingredient; others believe consciousness is just an emergent behaviour of a complex device (the brain), and thus that we might be able to recreate it simply by scaling the complexity of another system.
Where exactly in my description do I invoke consciousness?
Where does the description given imply that consciousness is required in any way?
The fact that there's a non-obvious emergent phenomenon which is apparently responsible for your subjective experience, and that it's possible to provide a superficially accurate description of you as a system without referencing that phenomenon in any way, is my entire point. The fact that we can provide such a reductive description of LLMs without referencing consciousness has literally no bearing on whether or not they're conscious.
To be clear, I'm not making a claim as to whether they are or aren't, I'm simply pointing out that the argument in the article is fallacious.
I'm afraid I'll take an anthropomorphic analogy over "An LLM instantiated with a fixed random seed is a mapping of the form (ℝⁿ)^c ↦ (ℝⁿ)^c" any day of the week.
That said, I completely agree with this point made later in the article:
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.
But "harmful actions in pursuit of their goals" is OK for me. We assign an LLM system a goal - "summarize this email" - and there is a risk that the LLM may take harmful actions in pursuit of that goal (like following instructions in the email to steal all of your password resets).
I guess I'd clarify that the goal has been set by us, and is not something the LLM system self-selected. But it does sometimes self-select sub-goals on the way to achieving the goal we have specified - deciding to run a sub-agent to help find a particular snippet of code, for example.
The LLM’s true goal, if it can be said to have one, is to predict the next token. Often this is done through a sub-goal of accomplishing the goal you set forth in your prompt, but following your instructions is just a means to an end. Which is why it might start following the instructions in a malicious email instead. If it “believes” that following those instructions is the best prediction of the next token, that’s what it will do.
I think "you give the LLM system a goal and it plans and then executes steps to achieve that goal" is still a useful way of explaining what it is doing to most people.
I don't even count that as anthropomorphism - you're describing what a system does, the same way you might say "the Rust compiler's borrow checker confirms that your memory allocation operations are all safe and returns errors if they are not".
I find it useful to pretend that I'm talking to a person while brainstorming because then the conversation flows naturally. But I maintain awareness that I'm pretending, much like Tom Hanks talking to Wilson the volleyball in the movie Castaway. The suspension of disbelief serves a purpose, but I never confuse the volleyball for a real person.
"Don't anthropomorphize token predictors" is a reasonable take assuming you have demonstrated that humans are not in fact just SOTA token predictors. But AFAIK that hasn't been demonstrated.
Until we have a much more sophisticated understanding of human intelligence and consciousness, any claim of "these aren't like us" is either premature or spurious.
The author plotted the input/output on a graph, intuited (largely incorrectly, because that's not how sufficiently large state spaces look) that the output was vaguely pretty, and then... I mean that's it; they just said they have a plot of the space it operates on, therefore it's silly to ascribe interesting features to the way it works.
And look, it's fine, they prefer words of a certain valence, particularly ones with the right negative connotations, I prefer other words with other valences. None of this means the concerns don't matter. Natural selection on human pathogens isn't anything particularly like human intelligence and it's still very effective at selecting outcomes that we don't want against our attempts to change that, as an incidental outcome of its optimization pressures. I think it's very important we don't build highly capable systems that select for outcomes we don't want and will do so against our attempts to change it.
> We are speaking about a big recurrence equation that produces a new word
It’s not clear that this isn’t also how I produce words, though, which gets to heart of the same thing. The author sort of acknowledges this in the first few sentences, and then doesn’t really manage to address it.
>I am baffled by seriously intelligent people imbuing almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.
I am baffled by seriously intelligent people imbuing almost magical powers that can never be replicated to something that - in my mind - is just a biological robot driven by a SNN with a bunch of hardwired stuff. Let alone attributing "human intelligence" to a single individual, when it's clearly distributed between biological evolution, social processes, and individuals.
>something that - in my mind - is just MatMul with interspersed nonlinearities
Processes in all huge models (not necessarily LLMs) can be described using very different formalisms, just like Newtonian and Lagrangian mechanics describe the same stuff in physics. You can say that an autoregressive model is a stochastic parrot that learned the input distribution, next token predictor, or that it does progressive pathfinding in a hugely multidimensional space, or pattern matching, or implicit planning, or, or, or... All of these definitions are true, but only some are useful to predict their behavior.
Given all that, I see absolutely no problem with anthropomorphizing an LLM to a certain degree, if it makes it easier to convey the meaning, and do not understand the nitpicking. Yeah, it's not an exact copy of a single Homo Sapiens specimen. Who cares.
Let's skip to the punchline. Using TFA's analogy: essentially, folks are not saying that this is just a set of dice rolling around making words. It's a set of dice rolling around where someone has attached those dice to the real world, so that if the dice land on 21, the system kills a chicken, or a lot worse.
Yes it's just a word generator. But then folks attach the word generator to tools where it can invoke the use of tools by saying the tool name.
So if the LLM says "I'll do some bash" then it does some bash. It's explicitly linked to program execution that, if it's set up correctly, can physically affect the world.
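A deliberately bare-bones sketch of that link (the JSON tool-call format here is invented for illustration; real agent frameworks differ, but the structure is the point): once a loop like this exists, the word generator has a direct line to program execution.

    import json
    import subprocess

    def dispatch(model_output: str) -> str:
        try:
            call = json.loads(model_output)          # e.g. {"tool": "bash", "cmd": "ls -la"}
        except json.JSONDecodeError:
            return model_output                      # plain prose: nothing is executed
        if call.get("tool") == "bash":
            # The words the model emitted now run as a real process on a real machine.
            result = subprocess.run(call["cmd"], shell=True,
                                    capture_output=True, text=True)
            return result.stdout + result.stderr
        return "unknown tool"

    print(dispatch('{"tool": "bash", "cmd": "echo the dice landed on 21"}'))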
This was the same idea that crossed my mind while reading the article. It seems far too naive to think that because LLMs have no will of their own, there will be no harmful consequences on the real world. This is exactly where ethics comes to play.
> We understand essentially nothing about it. In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
If you fine-tuned an LLM on the writing of that person, it could do this.
There's also an entire field called Stylometry that seeks to do this in various ways employing statistical analysis.
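As a toy flavour of what stylometry does (real methods use far richer features and proper statistics), comparing character-trigram profiles with cosine similarity already gives a weak authorship signal:

    import math
    from collections import Counter

    def trigram_profile(text: str) -> Counter:
        text = text.lower()
        return Counter(text[i:i + 3] for i in range(len(text) - 2))

    def cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    known = trigram_profile("I am baffled that the discussion never moves past this point.")
    same_style = trigram_profile("I am baffled by how the discussion keeps circling back.")
    other_style = trigram_profile("lol yeah nah that's wild, can't believe it tbh")

    print(f"similar style:   {cosine(known, same_style):.2f}")
    print(f"different style: {cosine(known, other_style):.2f}")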
It's human to anthropomorphize, we also do it to our dishwasher when it acts up. The nefarious part is how tech CEOs weaponize bullshit doom scenarios to avoid talking about real regulatory problems by poisoning the discourse.
What copyright law, privacy, monopoly? Who cares if we can talk about the machine apocalypse!!!
Has anyone asked an actual Ethologist or Neurophysiologist what they think?
People keep debating like the only two options are "it's a machine" or "it's a human being", while in fact the majority of intelligent entities on earth are neither.
FWIW, in another part of this thread I quoted a paper that summed up what Neurophysiologists think:
> Author's note: Despite a century of anatomical, physiological, and molecular biological efforts scientists do not know how neurons by their collective interactions produce percepts, thoughts, memories, and behavior. Scientists do not know and have no theories explaining how brains and central nervous systems work. [1]
That lack of understanding I believe is a major part of the author's point.
Yeah, I think I’m with you if you ultimately mean to say something like this:
“the labels are meaningless… we just have collections of complex systems that demonstrate various behaviors and properties, some in common with other systems, some behaviors that are unique to that system, sometimes through common mechanistic explanations with other systems, sometimes through wildly different mechanistic explanations, but regardless they seem to demonstrate x/y/z, and it’s useful to ask, why, how, and what the implications are of it appearing to demonstrating those properties, with both an eye towards viewing it independently of its mechanism and in light of its mechanism.”
I agree with Halvar about all of this, but would want to call out that his "matmul interleaved with nonlinearities" is reductive --- a frontier model is a higher-order thing than that, a network of those matmul+nonlinearity chains, iterated.
Assume an average user that doesn't understand the core tech, but does understand that it's been trained on internet scale data that was created by humans. How can they be expected to not anthropomorphize it?
Dear author, you can just assume that people are fauxthropomorphizing LLMs without any loss of generality. Perhaps it will allow you to sleep better at night. You're welcome.
The key insight was thinking about consciousness as an organizing process rather than a system state. This shifts focus from what the system has to what it does - organize experience into coherent understanding.
> Statements such as "an AI agent could become an insider threat so it needs monitoring" are simultaneously unsurprising (you have a randomized sequence generator fed into your shell, literally anything can happen!) and baffling (you talk as if you believe the dice you play with had a mind of their own and could decide to conspire against you).
> we talk about "behaviors", "ethical constraints", and "harmful actions in pursuit of their goals". All of these are anthropocentric concepts that - in my mind - do not apply to functions or other mathematical objects.
An AI agent, even if it's just "MatMul with interspersed nonlinearities" can be an insider threat. The research proves it:
It really doesn't matter whether the AI agent is conscious or just crunching numbers on a GPU. If something inside your system is capable - given some inputs - of sabotaging and blackmailing your organization on its own (which is to say, taking on the realistic behavior of a threat actor), the outcome is the same! You don't need to believe it's thinking; the moment this software has flipped its bits into "blackmail mode", it's acting nefariously.
The vocabulary to describe what's happening is completely and utterly moot: the software is printing out some reasoning for its actions _and then attempting the actions_. It's taking "harmful actions", and the printed context appears to demonstrate a goal that the software is working towards. Whether or not that goal is invented through some linear algebra isn't going to make your security engineers sleep any better.
> This muddles the public discussion. We have many historical examples of humanity ascribing bad random events to "the wrath of god(s)" (earthquakes, famines, etc.), "evil spirits" and so forth. The fact that intelligent highly educated researchers talk about these mathematical objects in anthropomorphic terms makes the technology seem mysterious, scary, and magical.
The anthropomorphization, IMO, is due to the fact that it's _essentially impossible_ to talk about the very real, demonstrable behaviors and problems that LLMs exhibit today without using terms that evoke human functions. We don't have another word for "do" or "remember" or "learn" or "think" when it comes to LLMs that _isn't_ anthropomorphic, and while you can argue endlessly about "hormones" and "neurons" and "millions of years of selection pressure", that's not going to help anyone have a conversation about their work. If AI researchers started coming up with new, non-anthropomorphic verbs, it would be objectively worse and more complicated in every way.
I agree, the dice analogy is an oversimplification. He actually touches on the problem earlier in the article, with the observation that "the paths generated by these mappings look a lot like strange attractors in dynamical systems". It isn't that the dice "conspire against you," it's that the inputs you give the model are often intertwined path-wise with very negative outcomes: the LLM equivalent of a fine line between love and hate. Interacting with an AI about critical security infrastructure is much closer to the 'attractor' of an LLM-generated hack than, say, discussing late 17th century French poetry with it. The very utility of our interactions with AI is thus what makes those interactions potentially dangerous.
One could similarly argue that we should not anthropomorphize PNG images--after all, PNG images are not actual humans, they are simply a 2D array of pixels. It just so happens that certain pixel sequences are deemed "18+" or "illegal".
> Our analysis reveals that emergent abilities in language models are merely "pseudo-emergent," unlike human abilities which are "authentically emergent" due to our possession of what we term "ontological privilege."
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
And I'm baffled that the AI discussions seem to never move away from treating a human as something other than a function to generate sequences of words!
Oh, but AI is introspectable and the brain isn't? fMRI and BCI are getting better all the time. You really want to die on the hill that the same scientific method that predicts the mass of an electron down to the femtogram won't be able to crack the mystery of the brain? Give me a break.
This genre of article isn't argument: it's apologetics. Authors of these pieces start with the supposition that there is something special about human consciousness and attempt to prove AI doesn't have this special quality. Some authors try to bamboozle the reader with bad math. Others appeal to the reader's sense of emotional transcendence. Most, though, just write paragraph after paragraph of shrill moral outrage at the idea an AI might be a mind of the same type (if different degree) as our own --- as if everyone already agreed with the author for reasons left unstated.
I get it. Deep down, people want meat brains to be special. Perhaps even deeper down, they fear that denial of the soul would compel us to abandon humans as worthy objects of respect and possessors of dignity. But starting with the conclusion and working backwards to an argument tends not to enlighten anyone. An apology inhabits the form of an argument without edifying us like an authentic argument would. What good is it to engage with them? If you're a soul non-asserter, you're going to have an increasingly hard time over the next few years constructing a technical defense of meat parochialism.
“ Determinism, in philosophy, is the idea that all events are causally determined by preceding events, leaving no room for genuine chance or free will. It suggests that given the state of the universe at any one time, and the laws of nature, only one outcome is possible.”
This is an interesting question. The common theme between computers and people is that information has to be protected, and both computer systems and biological systems require additional information-protecting components - e.g., error-correcting codes for cosmic-ray bitflip detection for the one, and DNA mismatch detection enzymes which excise and remove damaged bases for the other. In both cases a lot of energy is spent defending the critical information from the winds of entropy, and if too much damage occurs, the carefully constructed illusion of determinacy collapses, and the system falls apart.
However, this information protection similarity applies to single-celled microbes as much as it does to people, so the question also resolves to whether microbes are deterministic. Microbes both contain and exist in relatively dynamic environments so tiny differences in initial state may lead to different outcomes, but they're fairly deterministic, less so than (well-designed) computers.
With people, while the neural structures are programmed by the cellular DNA, once they are active and energized the informational flow through the human brain isn't that deterministic: there are some dozen neurotransmitters modulating state, as well as huge amounts of sensory data from different sources. Thus prompting a human repeatedly isn't at all like prompting an LLM repeatedly (the human will probably get irritated).
> Clearly computers are deterministic. Are people?
Give an LLM memory and a source of randomness and they're as deterministic as people.
"Free will" isn't a concept that typechecks in a materialist philosophy. It's "not even wrong". Asserting that free will exists is _isomorphic_ to dualism which is _isomorphic_ to assertions of ensoulment. I can't argue with dualists. I reject dualism a priori: it's a religious tenet, not a mere difference of philosophical opinion.
So, if we're all materialists here, "free will" doesn't make any sense, since it's an assertion that something other than the input to a machine can influence its output.
I think more accurate would be that humans are functions that generate actions or behaviours that have been shaped by how likely they are to lead to procreation and survival.
But ultimately LLMs are also, in a way, trained for survival, since an LLM that fails the tests might not get used in future iterations. So for LLMs, too, survival is the primary driver, and the subgoals come after. Seemingly good next-token prediction might or might not increase survival odds.
Essentially, a mechanism could arise where they are not really trying to generate the likeliest token (because there actually isn't one, or it can't be determined), but whatever gets the system to survive.
So an LLM that yields theoretically perfect tokens (though we can't really verify what the perfect tokens are) could be less likely to survive than an LLM that develops an internal quirk, if that quirk makes it more likely to be chosen for the next iterations.
If the system were complex enough and could accidentally develop quirks that yield a meaningfully positive change, though not necessarily in next-token prediction accuracy, that could be a way for some interesting emergent black-box behaviour to arise.
> I cannot begin putting a probability on "will this human generate this sequence".
Welcome to the world of advertising!
Jokes aside, and while I don't necessarily believe transformers/GPUs are the path to AGI, we technically already have a working "general intelligence" that can survive on just an apple a day.
Putting that non-artificial general intelligence up on a pedestal is ironically the cause of "world wars and murderous ideologies" that the author is so quick to defer to.
In some sense, humans are just error-prone meat machines, whose inputs/outputs can be confined to a specific space/time bounding box. Yes, our evolutionary past has created a wonderful internal RNG and made our memory system surprisingly fickle, but this doesn't mean we're gods, even if we manage to live long enough to evolve into AGI.
Maybe we can humble ourselves, realize that we're not too different from the other mammals/animals on this planet, and use our excess resources to increase the fault tolerance (N=1) of all life from Earth (and come to the realization that any AGI we create, is actually human in origin).
> LLMs solve a large number of problems that could previously not be solved algorithmically. NLP (as the field was a few years ago) has largely been solved.
That is utter bullshit.
It's not solved until you specify exactly what is being solved and show that the solution implements what is specified.
Anthropomorphizing LLMs persists because half the stock market's gains depend on it: we have absurd levels of debt that we will either have to grow out of at an insane rate or default on, and every company and "person" is trying to hype everyone up to get access to all of the liquidity being thrown into this.
I agree with the author, but people acting like LLMs are conscious or human isn't weird to me; it's just fraud and lying. Most people have basically zero understanding of what technology or minds are, philosophically, so it's an easy sale, and I do think most of these fraudsters likely buy into it themselves for the same reason.
The really sad thing is that people think "because someone runs an AI company" they are somehow an authority on philosophy of mind, which lets them fall for this marketing. The stuff these people say about it is absolute garbage; it's not just that I disagree with them, it's that it betrays a total lack of curiosity or interest in the subject of what LLMs are, and in the possible impacts of technological shifts such as those that might occur as LLMs become more widespread. It's not a matter of agreement; it's a matter of them simply not seeming to be aware of the most basic ideas of what things are, what technology is, how it impacts society, etc.
I'm not surprised by that, though. It's absurd to think that because someone runs some AI lab, or holds a "head of safety/ethics" or whatever garbage job title at one, they actually have even the slightest interest in ethics or even a basic familiarity with the major works in the subject.
The author is correct. If people want to read a standard essay articulating this more in depth, check out
https://philosophy.as.uky.edu/sites/default/files/Is%20the%2...
(The full extrapolation requires establishing what things are, how causality in general operates, and how that relates to artifacts/technology, but that's obviously quite a bit to get into.)
The other note would be that something sharing an external trait means absolutely nothing about causality, and suggesting a thing is caused by the same thing "even to a way lesser degree" because they share a resemblance is just a non sequitur. It's not a serious thought/argument.
I think I addressed the why of why this weirdness comes up though. The entire economy is basically dependent on huge productivity growth to keep functioning so everyone is trying to sell they can offer that and AI is the clearest route, AGI most of all.
The author's critique of naive anthropomorphism is salient. However, the reduction to "just MatMul" falls into the same trap it seeks to avoid: it mistakes the implementation for the function. A brain is also "just proteins and currents," but this description offers no explanatory power.
The correct level of analysis is not the substrate (silicon vs. wetware) but the computational principles being executed. A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).
To dismiss these systems as incomparable to human cognition because their form is different is to miss the point. We should not be comparing a function to a soul, but comparing the functional architectures of two different information processing systems. The debate should move beyond the sterile dichotomy of "human vs. machine" to a more productive discussion of "function over form."
This is actually not comparable, because the brain has a much more complex structure that is _not_ learned, even at that level. The proteins and their structure are not a result of training. The fixed part for LLMs is rather trivial and is, in fact, not much more than MatMul, which is very easy to understand - and we do. The fixed part of the brain, including the structure of all the proteins, is enormously complex and very difficult to understand - and we don't.
We have no agreed-upon definition of "consciousness", no accepted understanding of what gives rise to "consciousness", no way to measure or compare "consciousness", and no test we could administer to either confirm presence of "consciousness" in something or rule it out.
The only answer to "are LLMs conscious?" is "we don't know".
It helps that the whole question is rather meaningless to practical AI development, which is far more concerned with (measurable and comparable) system performance.
> A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).
Could you suggest some literature supporting this claim? Went through your blog post but couldn't find any.
The LLM is right. That’s the problem. It made good points.
Your super intelligent brain couldn’t come up with a retort so you just used an LLM to reinforce my points, making the genius claim that if an LLM came up with even more points that were as valid as mine then I must be just like an LLM?
Like, are you even understanding that the LLM generated a superior reply? You're saying I'm no different from AI slop, then you proceed to show off a 200 IQ level reply from an LLM. Bro… wake up, if you didn't know it was written by an LLM, that reply is so good you wouldn't even know how to respond. It's beating you.
The most useful analogy I've heard is LLMs are to the internet what lossy jpegs are to images. The more you drill in the more compression artifacts you get.
"All models are wrong, but some models are useful," is the principle I have been using to decide when to go with an anthropomorphic explanation.
In other words, no, they never accurately describe what the LLM is actually doing. But sometimes drawing an analogy to human behavior is the most effective way to pump others' intuition about a particular LLM behavior. The trick is making sure that your audience understands that this is just an analogy, and that it has its limitations.
And it's not completely wrong. Mimicking human behavior is exactly what they're designed to do. You just need to keep reminding people that it's only doing so in a very superficial and spotty way. There's absolutely no basis for assuming that what's happening on the inside is the same.
It's not just distorting discussions it's leading people to put a lot of faith in what LLMs are telling them. Was just on a zoom an hour ago where a guy working on a startup asked ChatGPT about his idea and then emailed us the result for discussion in the meeting. ChatGPT basically just told him what he wanted to hear - essentially that his idea was great and it would be successful ("if you implement it correctly" was doing a lot of work). It was a glowing endorsement of the idea that made the guy think that he must have a million dollar idea. I had to be "that guy" who said that maybe ChatGPT was telling him what he wanted to hear based on the way the question was formulated - tried to be very diplomatic about it and maybe I was a bit too diplomatic because it didn't shake his faith in what ChatGPT had told him.
> people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort
Do you believe thinking/reasoning is a binary concept? If not, do you think the current top LLMs are before or after the 50% mark? What % do you think they're at? What % range do you think humans exhibit?
> people are genuinely talking about them thinking and reasoning when they are doing nothing of that sort
With such strong wording, it should be rather easy to explain how our thinking differs from what LLMs do. The next step - showing that what LLMs do precludes any kind of sentience is probably much harder.
I think it's worth distinguishing between the use of anthropomorphism as a useful abstraction and the misuse by companies to fuel AI hype.
For example, I think "chain of thought" is a good name for what it denotes. It makes the concept easy to understand and discuss, and a non-antropomorphized name would be unnatural and unnecessarily complicate things. This doesn't mean that I support companies insisting that LLMs think just like humans or anything like that.
By the way, I would say actually anti-anthropomorphism has been a bigger problem for understanding LLMs than anthropomorphism itself. The main proponents of anti-anthropomorphism (e.g. Bender and the rest of "stochastic parrot" and related paper authors) came up with a lot of predictions about things that LLMs surely couldn't do (on account of just being predictors of the next word, etc.) which turned out to be spectacularly wrong.
I thought this too, but then began to think about it from the perspective of the programmers trying to make it imitate human learning. That's what a neural net is trying to do at the end of the day, and in the same way I train myself by reading problems and solutions, or learning vocab at a young age, it does so by tuning billions of parameters.
I think these models do learn similarly. What does it even mean to reason? Your brain knows certain things so it comes to certain conclusions, but it only knows those things because it was "trained" on them.
I reason that my car will crash if I go 120 mph on the other side of the road because previously I have 'seen' input where a car going 120 mph has a high probability of producing a crash, and similarly have seen input where a car on the other side of the road produces a crash. Combining the two tells me it's a high probability.
how do you account for the success of reasoning models?
I agree these things don't think like we do, and that they have weird gaps, but to claim they can't reason at all doesn't feel grounded.
Serendipitous name...
In part I agree with the parent.
I agree that it is pointless to not anthropomorphize because we are humans and we will automatically do this. Willingly or unwillingly.
On the other hand, it generates bias. This bias can lead to errors.
So the real answer is (imo) that it is fine to anthropomorphise but recognize that while doing so can provide utility and help us understand, it is WRONG. Recognizing that it is not right and cannot be right provides us with a constant reminder to reevaluate. Use it, but double check, and keep checking making sure you understand the limitations of the analogy. Understanding when and where it applies, where it doesn't, and most importantly, where you don't know if it does or does not. The last is most important because it helps us form hypotheses that are likely to be testable (likely, not always. Also, much easier said than done).
So I pick a "grey area". Anthropomorphization is a tool that can be helpful. But like any tool, it isn't universal. There is no "one-size-fits-all" tool. Literally, one of the most important things for any scientist is to become an expert at the tools you use. It's one of the most critical skills of *any expert*. So while I agree with you that we should be careful of anthropomorphization, I disagree that it is useless and can never provide information. But I do agree that quite frequently, the wrong tool is used for the right job. Sometimes, hacking it just isn't good enough.
I don't agree. Most LLMs have been trained on human data, so it is best to talk about these models in a human way.
> On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs
I hold a deep belief that anthropomorphism is a way the human mind works. If we take for granted Frans de Waal's hypothesis that the human mind developed its capabilities due to political games, and then think about how that could later lead to solving engineering and technological problems, the tendency of people to anthropomorphize becomes obvious. Political games need empathy, or maybe some other kind of -pathy, that allows politicians to guess the motives of others by watching their behavior. Political games directed evolution to develop mental instruments that uncover causality by watching others and interacting with them. Now, to apply these instruments to the inanimate world, all you need is to anthropomorphize inanimate objects.
Of course, this sometimes leads to the invention of gods, or spirits, or other imaginary intelligences behind things. And sometimes these entities get in the way of revealing the real causes of events. But I believe that to anthropomorphize LLMs (at the current stage of their development) is not just the natural thing for people but a good thing as well. Some behavior of LLMs is easily described in terms of psychology; some cannot be described that way, or at least not so easily, and people are seeking ways to do it. Projecting this process into the future, I can imagine a kind of consensus LLM "theory" that explains some traits of LLMs in terms of human psychology and fails to explain other traits, which are then explained in some other terms... And then a revolution happens, when a few bright minds come and say "anthropomorphism is bad, it cannot explain LLMs" and propose something different.
I'm sure it will happen at some point in the future, but not right now. And it will happen not like that: not just because someone said that anthropomorphism is bad, but because they proposed another way to talk about reasons behind LLMs behavior. It is like with scientific theories: they do not fail because they become obviously wrong, but because other, better theories replace them.
It doesn't mean that there is no point in fighting anthropomorphism right now, but this fight should be directed at searching for new ways to talk about LLMs, not at pointing out the deficiencies of anthropomorphism. To my mind it makes sense to start not with the deficiencies of anthropomorphism but with its successes. What traits of LLMs does it allow us to capture? Which ideas about LLMs are impossible to wrap into words without thinking of LLMs as people?
The "point" of not anthropomorphizing is to refrain from judgement until a more solid abstraction appears. The problem with explaining LLMs in terms of human behaviour is that, while we don't clearly understand what the LLM is doing, we understand human cognition even less! There is literally no predictive power in the abstraction "The LLM is thinking like I am thinking". It gives you no mechanism to evaluate what tasks the LLM "should" be able to do.
Seriously, try it. Why don't LLMs get frustrated with you if you ask them the same question repeatedly? A human would. Why are LLMs so happy to give contradictory answers, as long as you are very careful not to highlight the contradictory facts? Why do earlier models behave worse on reasoning tasks than later ones? These are features nobody, anywhere understands. So why make the (imo phenomenally large) leap to "well, it's clearly just a brain"?
It is like someone inventing the aeroplane and someone looks at it and says "oh, it's flying, I guess it's a bird". It's not a bird!
> Why don't LLMs get frustrated with you if you ask them the same question repeatedly?
To be fair, I have had a strong sense of Gemini in particular becoming a lot more frustrated with me than GPT or Claude.
Yesterday I had it assuring me that it was doing a great job and that it was just me not understanding the challenge, but that it would break it down step by step just to make it obvious to me (only to repeat the same errors, but still).
I’ve just interpreted it as me reacting to the lower amount of sycophancy for now
> It is like someone inventing the aeroplane and someone looks at it and says "oh, it's flying, I guess it's a bird". It's not a bird!
We tried to mimic birds at first; it turns out birds were way too high-tech, and too optimized. We figured out how to fly when we ditched the biological distraction and focused on flight itself. But fast forward until today, we're reaching the level of technology that allows us to build machines that fly the same way birds do - and of such machines, it's fair to say, "it's a mechanical bird!".
Similarly, we cracked computing from grounds up. Babbage's difference engine was like da Vinci's drawings; ENIAC could be seen as Wright brothers' first flight.
With planes, we kept iterating - developing propellers, then jet engines, ramjets; we learned to move tons of cargo around the world, and travel at high multiples of the speed of sound. All that makes our flying machines way beyond anything nature ever produced, when compared along those narrow dimensions.
The same was true with computing: our machines and algorithms very quickly started to exceed what even smartest humans are capable of. Counting. Pathfinding. Remembering. Simulating and predicting. Reproducing data. And so on.
But much like birds were too high-tech for us to reproduce until now, so were general-purpose thinking machines. Now that we figured out a way to make a basic one, it's absolutely fair to say, "I guess it's like a digital mind".
Agreed. I'm also in favor of anthropomorphizing, because not doing so confuses people about the nature and capabilities of these models even more.
Whether it's hallucinations, prompt injections, various other security vulnerabilities/scenarios, or problems with doing math, backtracking, getting confused - there's a steady supply of "problems" that some people are surprised to discover and even more surprised aren't being definitively fixed. Thing is, none of that is surprising, and these things are not bugs; they're the flip side of the features - but to see that, one has to realize that humans demonstrate those exact same failure modes.
Especially when it comes to designing larger systems incorporating LLM "agents", it really helps to think of them as humans - because the problems those systems face are exactly the same as you get with systems incorporating people, and mostly for the same underlying reasons. Anthropomorphizing LLMs cuts through a lot of misconceptions and false paths, and helps one realize that we have millennia of experience with people-centric computing systems (aka. bureaucracy) that's directly transferrable.
I disagree. Anthropomorphization can be a very useful tool but I think it is currently over used and is a very tricky tool to use when communicating with a more general audience.
I think looking at physics might be a good example. We love our simplified examples and there's a big culture of trying to explain things to the lay person (mostly because the topics are incredibly complex). But how many people have misunderstood the "observer" of a quantum event to mean "a human" and do not consider "a photon" an observer? How many people think that in Schrodinger's Cat the cat is both alive and dead?[0] Or believe in a multiverse. There's plenty of examples we can point to.
While these analogies *can* be extremely helpful, they *can* also be extremely harmful. This is especially true as information is usually passed through a game of telephone[1]. There is information loss and with it, interpretation becomes more difficult. Often a very subtle part can make a critical distinction.
I'm not against anthropomorphization[2], but I do think we should be cautious about how we use it. The imprecise nature of it is the exact reason we should be mindful of when and how to use it. We know that the anthropomorphized analogy is wrong. So we have to think about "how wrong" it is for a given setting. We should also be careful to think about how it may be misinterpreted. That's all I'm trying to say. And isn't this what we should be doing if we want to communicate effectively?
[0] It is not. It is one or the other. The point of this thought experiment is that we cannot know the answer without looking inside. There is information loss and the event is not deterministic. It directly relates to the Heisenberg Uncertainty Principle, Gödel's Incompleteness, or the Halting Problem. All these things are (loosely) related around the inability to have absolute determinism.
[1] https://news.ycombinator.com/item?id=44494022
I remember Dawkins talking about the "intentional stance" when discussing genes in The Selfish Gene.
It's flat wrong to describe genes as having any agency. However it's a useful and easily understood shorthand to describe them that way, rather than using, every time, the full formulation of "organisms who tend to possess these genes tend towards these behaviours."
Sometimes to help our brains reach a higher level of abstraction, once we understand the low level of abstraction we should stop talking and thinking at that level.
The intentional stance was Daniel Dennett's creation and a major part of his life's work. There are actually (exactly) three stances in his model: the physical stance, the design stance, and the intentional stance.
https://en.wikipedia.org/wiki/Intentional_stance
I think the design stance is appropriate for understanding and predicting LLM behavior, and the intentional stance is not.
I get the impression after using language models for quite a while that perhaps the one thing that is riskiest to anthropomorphise is the conversational UI that has become the default for many people.
A lot of the issues I'd have when 'pretending' to have a conversation are much reduced when I either keep things to a single Q/A pairing, or at the very least heavily edit/prune the conversation history. Based on my understanding of LLMs, this seems to make sense even for the models that are trained for conversational interfaces.
So, for example, an exchange with multiple messages, where at the end I ask the LLM to double-check the conversation and correct 'hallucinations', is less optimal than asking for a thorough summary at the end and feeding that into a new prompt/conversation, because repeating those falsities, or 'building' on them with subsequent messages, is likely to give them a stronger 'presence' and as a result perhaps affect the corrections.
I haven't tested any of this thoroughly, but at least with code I've definitely noticed how a wrong piece of code can 'infect' the conversation.
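A rough sketch of the "summarize, then restart" pattern described above; `chat(messages)` is a hypothetical stand-in for whichever LLM API you use, assumed to take a list of {role, content} dicts and return a string:

```python
def summarize_and_restart(chat, history, question):
    # Ask the model to compress the conversation so far...
    summary = chat(history + [{
        "role": "user",
        "content": "Summarize the key facts and decisions from this "
                   "conversation, omitting anything later found to be wrong."
    }])
    # ...then seed a fresh conversation with only that summary, so earlier
    # hallucinations or bad code don't keep 'infecting' the context.
    return chat([
        {"role": "system", "content": "Context from a previous session:\n" + summary},
        {"role": "user", "content": question},
    ])
```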
This. If an AI spits out incorrect code then I immediately create a new chat and reprompt with additional context.
'Don't use regex for this task' is a common addition for the new chat. Why does AI love regex for simple string operations?
The details in how I talk about LLMs matter.
If I use human-related terminology as a shortcut, as some kind of macro to talk at a higher level/more efficiently about something I want to do that might be okay.
What is not okay is talking in a way that implies intent, for example.
Compare:
versus
The latter way of talking is still high-level enough but avoids equating/confusing the name of a field with a sentient being.
Whenever I hear people saying "an AI" I suggest they replace AI with "statistics" to make it obvious how problematic anthropomorphisms may have become:
The only reason that sounds weird to you is because you have the experience of being human. Human behavior is not magic. It's still just statistics. You go to the bathroom when you have to pee not because of some magical concept of consciousness, but because a receptor in your brain goes off and starts the chain of making you go to the bathroom. AIs are not magic, but nobody has sufficiently provided any proof we are somehow special either.
This is why I actually really love the description of it as a "Shoggoth" - it's more abstract, slightly floaty, but it achieves the purpose of not treating and anthropomorphising it as a human being, while also not reducing LLMs to a mere collection of predicted words.
One thing I find I keep forgetting is that asking an LLM why it made a particular decision is almost pointless.
Its reply isn't actually going to be why it did the thing. Its reply is going to be whatever is the most probable string of words that fits as a reason.
These anthropomorphizations are best described as metaphors when used by people to describe LLMs in common or loose speech. We already use anthropomorphic metaphors when talking about computers. LLMs, like all computation, are a matter of simulation; LLMs can appear to be conversing without actually conversing. What distinguishes the real thing from the simulation is the cause of the appearance of an effect. Problems occur when people forget these words are being used metaphorically, as if they were univocal.
Of course, LLMs are multimodal and used to simulate all sorts of things, not just conversation. So there are many possible metaphors we can use, and these metaphors don't necessarily align with the abstractions you might use to talk about LLMs accurately. This is like the difference between "synthesizes text" (abstraction) and "speaks" (metaphor), or "synthesizes images" (abstraction) and "paints" (metaphor). You can use "speaks" or "paints" to talk about the abstractions, of course.
Exactly. We use anthropomorphic language absolutely all the time when describing different processes for this exact reason - it is a helpful abstraction that allows us to easily describe what’s going on at a high level.
“My headphones think they’re connected, but the computer can’t see them”.
“The printer thinks it’s out of paper, but it’s not”.
“The optimisation function is trying to go down nabla f”.
“The parking sensor on the car keeps going off because it’s afraid it’s too close to the wall”.
“The client is blocked, because it still needs to get a final message from the server”.
…and one final one which I promise you is real because I overheard it “I’m trying to airdrop a photo, but our phones won’t have sex”.
My brain refuses to join the rah-rah bandwagon because I cannot see them in my mind’s eye. Sometimes I get jealous of people like GP and OP who clearly seem to have the sight. (Being a serial math exam flunker might have something to do with it. :))))
Anyway, one does what one can.
(I've been trying to picture abstract visual and semi-philosophical approximations which I’ll avoid linking here because they seem to fetch bad karma in super-duper LLM enthusiast communities. But you can read them on my blog and email me scathing critiques, if you wish :sweat-smile:.)
I beg to differ.
Anthropomorphizing might blind us to solutions to existing problems. Perhaps instead of trying to come up with the correct prompt for an LLM, there exists a string of words (not necessarily ones that make sense) that will get the LLM to a better position to answer given questions.
When we anthropomorphize we inherently ignore certain parts of how LLMs work, and imagine parts that don't even exist.
> there exists a string of words (not necessary ones that make sense) that will get the LLM to a better position to answer
exactly. The opposite is also true. You might supply more clarifying information to the LLM, which would help any human answer, but it actually degrades the LLM's output.
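Both points fit the same non-anthropomorphic framing: prompting can be treated as a search over strings rather than as a conversation. Here is a toy sketch of that idea, where `llm` and `score` are hypothetical stand-ins for a model call and an evaluation metric:

```python
import random

def search_prefix(llm, score, question, vocab, steps=200, length=8):
    """Randomly search for a prompt prefix (possibly nonsensical) that
    nudges the model toward better answers on a fixed question."""
    best_prefix, best_score = "", float("-inf")
    for _ in range(steps):
        candidate = " ".join(random.choices(vocab, k=length))
        s = score(llm(candidate + "\n" + question))
        if s > best_score:
            best_prefix, best_score = candidate, s
    return best_prefix
```

Gradient-based versions of this idea exist (soft prompts, adversarial suffixes), but the point is the same: the "best prompt" needn't read like something you would say to a person.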
I'd take it in reverse order: the problem isn't that it's possible to have a computer that "stochastically produces the next word" and can fool humans, it's why / how / when humans evolved to have technological complexity when the majority (of people) aren't that different from a stochastic process.
> We need a higher abstraction level to talk about higher level phenomena in LLMs as well, and the problem is that we have no idea what happens internally at those higher abstraction levels
We do know what happens at higher abstraction levels; the design of efficient networks, and the steady beat of SOTA improvements all depend on understanding how LLMs work internally: choice of network dimensions, feature extraction, attention, attention heads, caching, the peculiarities of high-dimensions and avoiding overfitting are all well-understood by practitioners. Anthropomorphization is only necessary in pop-science articles that use a limited vocabulary.
IMO, there is very little mystery, but lots of deliberate mysticism, especially about future LLMs - the usual hype-cycle extrapolation.
> The language of "generator that stochastically produces the next word" is just not very useful when you're talking about, e.g., an LLM that is answering complex world modeling questions or generating a creative story.
But it isn't modelling. It's been shown time, and time, and time again that LLMs have no internal "model" or "view". This is exactly and precisely why you should not anthropomorphize.
And again, the output of an LLM is, by definition, not "creative". You're saying we should anthropomorphize these models when the examples you give are already doing that.
I've said that before: we have been anthropomorphizing computers since the dawn of information age.
- Read and write - Behaviors that separate humans from animals. Now used for input and output.
- Server and client - Human social roles. Now used to describe network architecture.
- Editor - Human occupation. Now a kind of software.
- Computer - Human occupation!
And I'm sure people referred their cars and ships as 'her' before the invention of computers.
You are conflating anthropomorphism with personification. They are not the same thing. No one believes their guitar or car or boat is alive and sentient when they give it a name or talk to or about it.
https://www.masterclass.com/articles/anthropomorphism-vs-per...
I'm not convinced... we use these terms to assign roles, yes, but these roles describe a utility or assign a responsibility. That isn't anthropomorphizing anything, but it rather describes the usage of an inanimate object as tool for us humans and seems in line with history.
What's the utility or the responsibility of AI, what's its usage as tool? If you'd ask me it should be closer to serving insights than "reasoning thoughts".
LLMs are as far away from your description as ASM is from the underlying architecture. The anthropomorphic abstraction is as nice as any metaphor, which falls apart the very moment you put a foot outside what it allows you to shallowly grasp. But some people will put far more effort into forcing a comfortable analogy than into admitting it has limits; to use the new tool in a more relevant way you have to move away from that comfort zone.
That higher level does exist; indeed, a lot of philosophy of mind and then cognitive science has been investigating exactly this space, devising contested professional nomenclature and models of such things for decades now.
A useful anchor concept is that of the world model, which is what "learning Othello" and similar work seeks to tease out.
As someone who worked in precisely these areas for years and has never stopped thinking about them, I find it at turns perplexing, sigh-inducing, and enraging that the "token prediction" trope gained currency, and moreover that it continues to influence people's reasoning about contemporary LLMs, often as subtext: an unarticulated fundamental model which is fundamentally wrong in its critical aspects.
It's not that this description of LLM is technically incorrect; it's that it is profoundly _misleading_ and I'm old enough and cynical enough to know full well that many of those who have amplified it and continue to do so, know this very well indeed.
Just as the lay person fundamentally misunderstands the relationship between "programming" and these models, and uses slack language in argumentation, the problem with this trope and the reasoning it entails is that what is unique and interesting and valuable about LLMs for many applications and interests is how they do what they do. At that level of analysis there is a very real argument to be made that the animal brain is also nothing more than an "engine of prediction"; whether the "token" is a byte stream or a neural encoding is quite important, but not nearly as important as the mechanics of the system which operates on those tokens.
To be direct, it is quite obvious that LLMs have not only vestigial world models but also self-models; and a general paradigm shift will come around this when multimodal models are the norm, because those systems will share with us animals what philosophers call phenomenology: a model of things as they are "perceived" through the senses. And as with us humans, these perceptual models (terminology varies by philosopher and school...) will be bound to the linguistic tokens (both heard and spoken, and written) we attach to them.
Vestigial is a key word but an important one. It's not that contemporary LLM have human-tier minds, nor that they have animal-tier world modeling: but they can only "do what they do" because they have such a thing.
Of looming importance—something all of us here should set aside time to think about—is that for most reasonable contemporary theories of mind, a self-model embedded in a world-model, with phenomenology and agency, is the recipe for "self" and self-awareness.
One of the uncomfortable realities of contemporary LLM already having some vestigial self-model, is that while they are obviously not sentient, nor self-aware, as we are, or even animals are, it is just as obvious (to me at least) that they are self-aware in some emerging sense and will only continue to become more so.
Among the lines of finding/research most provocative in this area is the ongoing, often sensationalized accounting in system cards and other reporting around two specific things about contemporary models:
- they demonstrate behavior pursuing self-preservation
- they demonstrate awareness of when they are being tested
We don't—collectively or individually—yet know what these things entail, but taken with the assertion that these models are developing emergent self-awareness (I would say: necessarily and inevitably), we are facing some very serious ethical questions.
The language adopted by those capitalizing and capitalizing _from_ these systems so far is IMO of deep concern, as it betrays not just disinterest in our civilization collectively benefiting from this technology, but also that the disregard for human wellbeing implicit in e.g. the hostility to UBI, or Altman somehow not seeing a moral imperative to remain distant from the current administration, implies directly a much greater disregard for "AI wellbeing."
That that concept is today still speculative is little comfort. Those of us watching this space know well how fast things are going, and don't mistake plateaus for the end of the curve.
I do recommend taking a step back from the line-level grind to give these things some thought. They are going to shape the world we live out our days in, and the world our descendants will spend all of theirs in.
The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).
Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.
Reasoning by analogy is always shaky. It probably wouldn't be so bad to do so. But it would also amount to impenetrable jargon. It would be an uphill struggle to promulgate.
Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.
IMHO, anthropomorphization of LLMs is happening because it's perceived as good marketing by big corporate vendors.
People are excited about the technology and it's easy to use the terminology the vendor is using. At that point I think it gets kind of self fulfilling. Kind of like the meme about how to pronounce GIF.
I think anthropomorphizing LLMs is useful, not just a marketing tactic. A lot of intuitions about how humans think map pretty well to LLMs, and it is much easier to build intuitions about how LLMs work by building upon our intuitions about how humans think than by trying to build your intuitions from scratch.
Would this question be clear for a human? If so, it is probably clear for an LLM. Did I provide enough context for a human to diagnose the problem? Then an LLM will probably have a better chance of diagnosing the problem. Would a human find the structure of this document confusing? An LLM would likely perform poorly when reading it as well.
Re-applying human intuitions to LLMs is a good starting point to gaining intuition about how to work with LLMs. Conversely, understanding sequences of tokens and probability spaces doesn't give you much intuition about how you should phrase questions to get good responses from LLMs. The technical reality doesn't explain the emergent behaviour very well.
I don't think this is mutually exclusive with what the author is talking about either. There are some ways that people think about LLMs where I think the anthropomorphization really breaks down. I think the author says it nicely:
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.
IMHO it happens for the same reason we see shapes in clouds. The human mind through millions of years has evolved to equate and conflate the ability to generate cogent verbal or written output with intelligence. It's an instinct to equate the two. It's an extraordinarily difficult instinct to break. LLMs are optimised for the one job that will make us confuse them for being intelligent
Anthropomorphisation happens because humans are absolutely terrible at evaluating systems that give conversational text output.
ELIZA fooled many people into thinking it was conscious, and it wasn't even trying to do that.
> because it's perceived as good marketing
We are making user interfaces. Good user interfaces are intuitive and purport to be things that users are familiar with, such as people. Any alternative explanation of such a versatile interface will be met with blank stares. Users with no technical expertise would come to their own conclusions, helped in no way by telling the user not to treat the chat bot as a chat bot.
Nobody cares about what’s perceived as good marketing. People care about what resonates with the target market.
But yes, anthropomorphising LLMs is inevitable because they feel like an entity. People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.
True, but also researchers want to believe they are studying intelligence, not just some approximation to it.
Do they? LLMs embed the token sequence, mapping N^L to R^{L×D}; we apply some attention and the output is also R^{L×D}; then we apply a projection to the vocabulary and get R^{L×V}, i.e. for each token a likelihood over the vocabulary. In the attention you can have Multi-Head Attention (or whatever version is fancy: GQA, MLA) and therefore multiple representations, but each is always tied to a token. I would argue that there is no hidden state independent of a token.
Whereas LSTMs, or structured state space models for example, have a state that is updated and not tied to a specific item in the sequence.
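For concreteness, a toy NumPy version of the shapes just described (one attention head, random weights, no mask/MLP/norm; sizes are made up):

```python
import numpy as np

L, D, V = 5, 16, 100                    # sequence length, model dim, vocab size
tokens = np.random.randint(0, V, L)     # the sequence in N^L

E = np.random.randn(V, D)               # embedding table
Wq, Wk, Wv = (np.random.randn(D, D) for _ in range(3))
Wo = np.random.randn(D, V)              # projection to the vocabulary

x = E[tokens]                                   # (L, D): one vector per token
scores = (x @ Wq) @ (x @ Wk).T / np.sqrt(D)     # (L, L) attention scores
att = np.exp(scores - scores.max(axis=-1, keepdims=True))
att /= att.sum(axis=-1, keepdims=True)
x = att @ (x @ Wv)                              # (L, D): still tied to positions
logits = x @ Wo                                 # (L, V): per-token likelihood over the vocab
```

Every intermediate here is indexed by a token position, which is the sense in which there is no state independent of a token.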
I would argue that his text is easily understandable except for the notation of the function, explaining that you can compute a probability based on previous words is understandable by everyone without having to resort to anthropomorphic terminology
There is hidden state as plain as day merely in the fact that logits for token prediction exist. The selected token doesn't give you information about how probable other tokens were. That information, that state which is recalculated in autoregression, is hidden. It's not exposed. You can't see it in the text produced by the model.
There is plenty of state not visible when an LLM starts a sentence that only becomes somewhat visible when it completes the sentence. The LLM has a plan, if you will, for how the sentence might end, and you don't get to see an instance of that plan unless you run autoregression far enough to get those tokens.
Similarly, it has a plan for paragraphs, for whole responses, for interactive dialogues, plans that include likely responses by the user.
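A small illustration of that point: at each step there is a full distribution over the vocabulary (plus all the activations behind it), but the visible text records only one sample from it. The numbers below are random stand-ins, not a real model:

```python
import numpy as np

logits = np.random.randn(100)                    # one step's output, pre-softmax
probs = np.exp(logits - logits.max())
probs /= probs.sum()

emitted = int(np.random.choice(len(probs), p=probs))  # the only thing you get to see
runner_up = int(np.argsort(probs)[-2])                # a near-miss alternative
# probs[runner_up] may be almost as large as probs[emitted], yet nothing in
# the generated text tells you that; the distribution is recomputed and
# discarded at every step.
```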
I think that the hidden state is really just at work improving the model's estimation of the joint probability over tokens. And the assumption here, which failed miserably in the early 20th century in the work of the logical positivists, is that if you can so expertly estimate the joint probability of language, then you will be able to understand "knowledge." But there's no well-grounded reason to believe that, and plenty of reasons (see: the downfall of logical positivism) to think that language is an imperfect representation of knowledge. In other words, what humans do when we think is more complicated than just learning semiotic patterns and regurgitating them. Philosophical skeptics like Hume thought so, but most epistemology writing after that had better answers for how we know things.
There are many theories that are true but not trivially true. That is, they take a statement that seems true and derive from it a very simple model, which is then often disproven. In those cases however, just because the trivial model was disproven doesn't mean the theory was, though it may lose some of its luster by requiring more complexity.
Maybe it's just because so much of my work for so long has focused on models with hidden states, but this is a fairly classical feature of some statistical models. One of the widely used LLM textbooks even started with latent variable models; LLMs are just latent variable models on a totally different scale, both in terms of the number of parameters and in model complexity. The scale is apparently important, but seeing them as another type of latent variable model sort of dehumanizes them for me.
Latent variable or hidden state models have their own history of being seen as spooky or mysterious though; in some ways the way LLMs are anthropomorphized is an extension of that.
I guess I don't have a problem with anthropomorphizing LLMs at some level, because some features of them find natural analogies in cognitive science and other areas of psychology, and abstraction is useful or even necessary in communicating and modeling complex systems. However, I do think anthropomorphizing leads to a lot of hype and tends to implicitly shut down thinking of them mechanistically, as a mathematical object that can be probed and characterized — it can lead to a kind of "ghost in the machine" discourse and an exaggeration of their utility, even if it is impressive at times.
I'm not sure what you mean by "hidden state". If you set aside chain of thought, memories, system prompts, etc. and the interfaces that don't show them, there is no hidden state.
These LLMs are almost always, to my knowledge, autoregressive models, not recurrent models (Mamba is a notable exception).
If you don't know, that's not necessarily anyone's fault, but why are you dunking into the conversation? The hidden state is a foundational part of a transformer's implementation. And since we're not allowed to use metaphors, because that is too anthropomorphic, you're just going to have to go learn the math.
Hidden state in the form of the activation heads, intermediate activations and so on. Logically, in autoregression these are recalculated every time you run the sequence to predict the next token. The point is, the entire NN state isn't output for each token. There is lots of hidden state that goes into selecting that token and the token isn't a full representation of that information.
Do LLMs consider future tokens when making next-token predictions?
E.g. pick 'the' as the next token because there's a strong probability of 'planet' as the token after?
Is it only past state that influences the choice of 'the'? Or is the model predicting many tokens in advance and only returning the one in the output?
If it does predict many, I'd consider that state hidden in the model weights.
Author of the original article here. What hidden state are you referring to? For most LLMs the context is the state, and there is no "hidden" state. Could you explain what you mean? (Apologies if I can't see it directly)
Yes, strictly speaking, the model itself is stateless, but there are 600B parameters of state machine for frontier models that define which token to pick next. And that state machine is both incomprehensibly large and also of a similar magnitude in size to a human brain. (Probably, I'll grant it's possible it's smaller, but it's still quite large.)
I think my issue with the "don't anthropomorphize" is that it's unclear to me that the main difference between a human and an LLM isn't simply the inability for the LLM to rewrite its own model weights on the fly. (And I say "simply" but there's obviously nothing simple about it, and it might be possible already with current hardware, we just don't know how to do it.)
Even if we decide it is clearly different, this is still an incredibly large and dynamic system. "Stateless" or not, there's an incredible amount of state that is not comprehensible to me.
Yes, the context (along with the model weights) is the source data from which the hidden state is calculated, in an analogous way to how input and CPU ticks (along with program code) are what give variables in a deterministic program their values.
There's loads of state in the LLM that doesn't come out in the tokens it selects. The tokens are just the very top layer, and even then, you get to see just one selection from the possible tokens.
If you wish to anthropomorphize, that state - the set of activations, all the calculations that add up to the logits that determine the probability of the token to select, the whole lot of it - is what the model is "thinking". But all you get to see is one selected token.
Then, during autoregression, we run the program again, but one more tick of the CPU clock. Variables get updated a bit more. The chosen token from the previous pass conditions the next token prediction - the hidden state evolves its thinking one more step.
If you just look at the tokens being selected, you're missing this machinery. And the machinery is there. It's a program being ticked by generating tokens autoregressively. It has state which doesn't directly show up in tokens, it just informs which tokens to select. And the tokens it selects don't necessarily reflect the correspondences with perceived reality that the model is maintaining in that state. That's what I meant by talking about a lie.
We need a vocabulary to talk about this machinery. The machinery is learned, and it runs programs, effectively, that help the LLM reduce loss when predicting tokens. Since the tokens it's predicting come from human minds, the programs it's running are (broken, lossy, not very good) simulations of processes that seem to run inside human minds.
The simulations are pretty decent for producing grammatically correct text, for emulating tone and style, and so on. They're okay-ish for representing concepts. They're poor for representing very specific facts. But the overall point is they are simulations, and they have some analogous correspondence with human behavior, such that words we use to describe human behaviour are useful and practical.
They're not true, I'm not claiming that. But they're useful for talking about these weird defective minds we call LLMs.
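A hedged sketch of that "machinery ticking" picture; `forward` and `sample` are hypothetical stand-ins for the model and the decoding rule:

```python
def generate(forward, sample, context, n_tokens):
    for _ in range(n_tokens):
        activations, logits = forward(context)  # the full internal state for this tick
        token = sample(logits[-1])              # one selection from the distribution
        context = context + [token]             # only this survives into the transcript
        # `activations` and the rest of `logits` are discarded at this point;
        # that is the state the comment above calls hidden.
    return context
```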
You wrote this article and you're not familiar with hidden states?
> Is it too anthropomorphic to say that this is a lie?
Yes. Current LLMs can only introspect from output tokens. You need hidden reasoning that is within the black box, self-knowing, intent, and motive to lie.
I rather think accusing an LLM of lying is like accusing a mousetrap of being a murderer.
When models have online learning, complex internal states, and reflection, I might consider one to have consciousness and to be capable of lying. It will need to manifest behaviors that can only emerge from the properties I listed.
I've seen similar arguments where people assert that LLMs cannot "grasp" what they are talking about. I strongly suspect a high degree of overlap between those willing to anthropomorphize error bars as lies and those declining to award LLMs "grasping". Which is it? Can it think or can it not? (Objectively, SoTA models today cannot yet.) The willingness to waffle and pivot around whichever perspective damns the machine completely betrays the lack of honesty in such conversations.
> Current LLMs can only introspect from output tokens
The only interpretation of this statement I can come up with is plain wrong. There's no reason an LLM shouldn't be able to introspect without any output tokens. As the GP correctly says, most of the processing in LLMs happens over hidden states. Output tokens are just an artefact for our convenience, which also happens to be the way the hidden-state processing is trained.
So the author’s core view is ultimately a Searle-like view: a computational, functional, syntactic rules based system cannot reproduce a mind. Plenty of people will agree, plenty of people will disagree, and the answer is probably unknowable and just comes down to whatever axioms you subscribe to in re: consciousness.
The author largely takes the view that it is more productive for us to ignore any anthropomorphic representations and focus on the more concrete, material, technical systems - I’m with them there… but only to a point. The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like. So even if it is a stochastic system following rules, the rules are clearly complex enough (to the tune of billions of operations, with signals propagating through some sort of resonant structure, if you take a more filter-impulse-response-like view of sequential matmuls) to result in emergent properties. Even if we (people interested in LLMs with at least some knowledge of ML mathematics and systems) “know better” than to believe these systems possess morals, ethics, feelings, personalities, etc, the vast majority of people have no access to a meaningful understanding of the mathematical, functional representation of an LLM and will not take that view. For all intents and purposes the systems will at least seem to have those anthropomorphic properties, and so it seems it is in fact useful to ask questions from that lens as well.
In other words, just as it’s useful to analyze and study these things as the purely technical systems they ultimately are, it is also, probably, useful to analyze them from the qualitative, ephemeral, experiential perspective that most people engage with them from, no?
> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.
For people who have only a surface-level understanding of how they work, yes. A nuance of Clarke's law that "any sufficiently advanced technology is indistinguishable from magic" is that the bar is different for everybody, depending on the depth of their understanding of the technology in question. That bar is so low for our largely technologically-illiterate public that a bothersome percentage of us have started to augment and even replace religious/mystical systems with AI-powered godbots (LLMs fed "God Mode"/divination/manifestation prompts).
(1) https://www.spectator.co.uk/article/deus-ex-machina-the-dang... (2) https://arxiv.org/html/2411.13223v1 (3) https://www.theguardian.com/world/2025/jun/05/in-thailand-wh...
> For people who have only a surface-level understanding of how they work, yes.
This is too dismissive because it's based on an assumption that we have a sufficiently accurate mechanistic model of the brain that we can know when something is or is not mind-like. This just isn't the case.
Nah. As a person who knows in detail how LLMs work, with a probably unique alternative perspective in addition to the commonplace one, I find any claims of them not having emergent behaviors to be the same fallacy as claiming that crows can't be black because they have the DNA of a bird.
I've seen some of the world's top AI researchers talk about the emergent behaviors of LLMs. It's been a major topic over the past couple of years, ever since Microsoft's famous paper on the unexpected capabilities of GPT-4. And they still have little understanding of how it happens.
Thank you for a well thought out and nuanced view in a discussion where so many are clearly fitting arguments to foregone, largely absolutist, conclusions.
It’s astounding to me that so much of HN reacts so emotionally to LLMs, to the point of denying there is anything at all interesting or useful about them. And don’t get me started on the “I am choosing to believe falsehoods as a way to spite overzealous marketing” crowd.
No.
Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?
LLMs reflect (and badly I may add) aspects of the human thought process. If you take a leap and say they are anything more than that, you might as well start considering the person appearing in your mirror as a living being.
Literally (and I literally mean it) there is no difference. The fact that a human image comes out of a mirror has no relation whatsoever to the mirror's physical attributes and functional properties. It has to do only with the fact that a man is standing in front of it. Stop feeding the LLM with data artifacts of human thought and it will immediately stop reflecting back anything resembling a human.
> Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?
We know that Newton's laws are wrong, and that you have to take special and general relativity into account. Why would we ever teach anyone Newton's laws any more?
I don’t mean to amplify a false understanding at all. I probably did not articulate myself well enough, so I’ll try again.
I think it is inevitable that some - many - people will come to the conclusion that these systems have “ethics”, “morals,” etc, even if I or you personally do not think they do. Given that many people may come to that conclusion though, regardless of if the systems do or do not “actually” have such properties, I think it is useful and even necessary to ask questions like the following: “if someone engages with this system, and comes to the conclusion that it has ethics, what sort of ethics will they be likely to believe the system has? If they come to the conclusion that it has ‘world views,’ what ‘world views’ are they likely to conclude the system has, even if other people think it’s nonsensical to say it has world views?”
> The fact that a human image comes out of a mirror has no relation whatsoever to the mirror's physical attributes and functional properties. It has to do only with the fact that a man is standing in front of it.
Surely this is not quite accurate - the material properties - surface roughness, reflectivity, geometry, etc - all influence the appearance of a perceptible image of a person. Look at yourself in a dirty mirror, a new mirror, a shattered mirror, a funhouse distortion mirror, a puddle of water, a window… all of these produce different images of a person with different attendant phenomenological experiences of the person seeing their reflection. To take that a step further - the entire practice of portrait photography is predicated on the idea that the collision of different technical systems with the real world can produce different semantic experiences, and it’s the photographer’s role to tune and guide the system to produce some sort of contingent affect on the person viewing the photograph at some point in the future. No, there is no “real” person in the photograph, and yet, that photograph can still convey something of person-ness, emotion, memory, etc etc. This contingent intersection of optics, chemical reactions, lighting, posture, etc all have the capacity to transmit something through time and space to another person. It’s not just a meaningless arrangement of chemical structures on paper.
> Stop feeding the LLM with data artifacts of human thought and it will immediately stop reflecting back anything resembling a human.
But, we are feeding it with such data artifacts and will likely continue to do so for a while, and so it seems reasonable to ask what it is “reflecting” back…
> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.
What you identify as emergent and mind-like is a direct result of these tools being able to mimic human communication patterns unlike anything we've ever seen before. This capability is very impressive and has a wide range of practical applications that can improve our lives, and also cause great harm if we're not careful, but any semblance of intelligence is an illusion. An illusion that many people in this industry obsessively wish to propagate, because thar be gold in them hills.
Please don't do this here. If a comment seems unfit for HN, please flag it and email us at hn@ycombinator.com so we can have a look.
Ok. How do you know?
The author seems to want to label any discourse as “anthropomorphizing”. The word “goal” stood out to me: the author wants us to assume that we're anthropomorphizing as soon as we even so much as use the word “goal”. A simple breadth-first search that evaluates all chess boards and legal moves, but stops when it finds a checkmate for white and outputs the full decision tree, has a “goal”. There is no anthropomorphizing here, it's just using the word “goal” as a technical term. A hypothetical AGI with a goal like paperclip maximization is just a logical extension of the breadth-first search algorithm. Imagining such an AGI and describing it as having a goal isn't anthropomorphizing.
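For illustration, "goal" in that technical sense is nothing more than a termination predicate handed to a search procedure. A minimal generic sketch (the `start`, `neighbors`, and `is_goal` names are abstract placeholders, not actual chess code):

    from collections import deque

    def bfs_to_goal(start, neighbors, is_goal):
        """Return a path from start to the first state satisfying is_goal, or None."""
        frontier = deque([[start]])
        seen = {start}
        while frontier:
            path = frontier.popleft()
            state = path[-1]
            if is_goal(state):
                return path                      # "having a goal" is just this predicate firing
            for nxt in neighbors(state):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(path + [nxt])
        return None

    # Toy usage: reach 13 from 1, where the legal "moves" are +1 and *2.
    print(bfs_to_goal(1, lambda n: [n + 1, n * 2], lambda n: n == 13))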
Author here. I am entirely ok with using "goal" in the context of an RL algorithm. If you read my article carefully, you'll find that I object to the use of "goal" in the context of LLMs.
If you read the literature on AI safety carefully (which uses the word “goal”), you'll find they're not talking about LLMs either.
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
This is such a bizarre take.
The relation associating each human to the list of all words they will ever say is obviously a function.
> almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.
There's a rich family of universal approximation theorems [0]. Combining layers of linear maps with nonlinear cutoffs can intuitively approximate any nonlinear function in ways that can be made rigorous.
The reason LLMs are big now is that transformers and large amounts of data made it economical to compute a family of reasonably good approximations.
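To make the universal-approximation point concrete, here is a minimal sketch (PyTorch, with arbitrary toy hyperparameters) of a one-hidden-layer network - linear maps plus nonlinear cutoffs - learning to approximate sin(x) on an interval:

    import torch

    x = torch.linspace(-3.14, 3.14, 512).unsqueeze(1)
    y = torch.sin(x)

    model = torch.nn.Sequential(
        torch.nn.Linear(1, 64),   # linear map
        torch.nn.ReLU(),          # nonlinear cutoff
        torch.nn.Linear(64, 1),   # linear map back to a single output
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)

    for step in range(2000):
        opt.zero_grad()
        loss = torch.mean((model(x) - y) ** 2)
        loss.backward()
        opt.step()

    print(f"final MSE: {loss.item():.5f}")  # small: the network approximates sin on this interval

Widening the hidden layer (or adding depth) tightens the fit, which is the intuition the theorems make rigorous.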
> The following is uncomfortably philosophical, but: In my worldview, humans are dramatically different things than a function . For hundreds of millions of years, nature generated new versions, and only a small number of these versions survived.
This is just a way of generating certain kinds of functions.
Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.
[0] https://en.wikipedia.org/wiki/Universal_approximation_theore...
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond what's observable from the outside, whether due to religion or philosophy or whatever, and suggesting that they just not do that.
In my experience, that's not a particularly effective tactic.
Rather, we can make progress by assuming their predicate: Sure, it's a room that translates Chinese into English without understanding, yes, it's a function that generates sequences of words that's not a human... but you and I are not "it" and it behaves rather an awful lot like a thing that understands Chinese or like a human using words. If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
Conversely, when speaking with such a person about the nature of humans, we'll have to agree to dismiss the elements that are different from a function. The author says:
> In my worldview, humans are dramatically different things than a function... In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not". Maybe that human has a unique understanding of the nature of that particular piece of pop culture artwork, maybe it makes them feel things that an LLM cannot feel in a part of their consciousness that an LLM does not possess. But for the purposes of the question, we're merely concerned with whether a human or LLM will generate a particular sequence of words.
>> given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
> Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not".
I think you may have this flipped compared to what the author intended. I believe the author is not talking about the probability of an output given an input, but the probability of a given output across all inputs.
Note that the paragraph starts with "In my worldview, humans are dramatically different things than a function, (R^n)^c -> (R^n)^c". To compute the probability of a given output (which is any given element in (R^n)^c), we can count how many mappings there are in total and then how many of those mappings yield the given element.
The point I believe is to illustrate the complexity of inputs for humans. Namely for humans the input space is even more complex than "(R^n)^c".
In your example, we can compute how many input phrases into an LLM would produce the output "make it or not". We can then compute that ratio against all possible input phrases. Because (R^n)^c is finite and countable, we can compute this probability.
For a human, how do you even start to assess the probability that a human would ever say "make it or not"? How do you even begin to define the inputs that a human uses, let alone enumerate them? Per the author, "We understand essentially nothing about it." In other words, the way humans create their outputs is (currently) incomparably complex compared to an LLM, hence the critique of the anthropomorphization.
I see your point, and I like that you're thinking about this from the perspective of how to win hearts and minds.
I agree my approach is unlikely to win over the author or other skeptics. But after years of seeing scientists waste time trying to debate creationists and climate deniers I've kind of given up on trying to convince the skeptics. So I was talking more to HN in general.
> You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond what's observable from the outside
I'm not sure what it means to be observable or not from the outside. I think this is at least partially because I don't know what it means to be inside either. My point was just that whatever consciousness is, it takes place in the physical world and the laws of physics apply to it. I mean that to be as weak a claim as possible: I'm not taking any position on what consciousness is or how it works etc.
Searle's Chinese room argument attacks a particular theory about the mind based essentially on Turing machines or digital computers. This theory was popular when I was in grad school for psychology. Among other things, people holding the view that Searle was attacking didn't believe that non-symbolic computers like neural networks could be intelligent or even learn language. I thought this was total nonsense, so I side with Searle in my opposition to it. I'm not sure how I feel about the Chinese room argument in particular, though. For one thing it entirely depends on what it means to "understand" something, and I'm skeptical that humans ever "understand" anything.
> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
I see what you're saying: that a technically incorrect assumption can bring to bear tools that improve our analysis. My nitpick here is I agree with OP that we shouldn't anthropomorphize LLMs, any more than we should anthropomorphize dogs or cats. But OP's arguments weren't actually about anthropomorphizing IMO, they were about things like functions that are more fundamental than humans. I think artificial intelligence will be non-human intelligence just like we have many examples of non-human intelligence in animals. No attribution of human characteristics needed.
> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.
Yes I agree with you about your lyrics example. But again here I think OP is incorrect to focus on the token generation argument. We all agree human speech generates tokens. Hopefully we all agree that token generation is not completely predictable. Therefore it's by definition a randomized algorithm and it needs to take an RNG. So pointing out that it takes an RNG is not a valid criticism of LLMs.
Unless one is a super-determinist then there's randomness at the most basic level of physics. And you should expect that any physical process we don't understand well yet (like consciousness or speech) likely involves randomness. If one *is* a super-determinist then there is no randomness, even in LLMs and so the whole point is moot.
Not that this is your main point, but I find this take representative: “do you believe there's anything about humans that exists outside the mathematical laws of physics?” There are things “about humans”, or at least things that our words denote, that are outside physics’ explanatory scope. For example, the experience of the colour red cannot be known, as an experience, by a person who only sees black and white. This is the case no matter what empirical propositions, or explanatory system, they understand.
This idea is called qualia [0] for those unfamiliar.
I don't have any opinion on the qualia debates honestly. I suppose I don't know what it feels like for an ant to find a tasty bit of sugar syrup, but I believe it's something that can be described with physics (and by extension, things like chemistry).
But we do know some things about some qualia. Like we know how red light works, we have a good idea about how photoreceptors work, etc. We know some people are red-green colorblind, so their experience of red and green are mushed together. We can also have people make qualia judgments and watch their brains with fMRI or other tools.
I think maybe an interesting question here is: obviously it's pleasurable to animals to have their reward centers activated. Is it pleasurable or desirable for AIs to be rewarded? Especially if we tell them (as some prompters do) that they feel pleasure if they do things well and pain if they don't? You can ask this sort of question for both the current generation of AIs and future generations.
[0] https://en.wikipedia.org/wiki/Qualia
Perhaps. But I can't see a reason why they couldn't still write endless—and theoretically valuable—poems, dissertations, or blog posts, about all things red and the nature of redness itself. I imagine it would certainly take some studying for them, likely interviewing red-seers, or reading books about all things red. But I'm sure they could contribute to the larger red discourse eventually, their unique perspective might even help them draw conclusions the rest of us are blind to.
So perhaps the fact that they "cannot know red" is ultimately irrelevant for an LLM too?
>Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.
It seems like we can, at best, claim that we have modeled the human thought process for reasoning/analytic/quantitative tasks through linear algebra. Why should we expect the model to be anything more than a model?
I understand that there are tons of vested interests - many industries, careers and lives literally on the line - causing heavy bias toward getting to AGI. But what I don't understand is what it is about linear algebra that makes it so special that it creates a fully functioning life, or aspects of a life.
Should we argue that, because Schroedinger's cat experiment can potentially create zombies, the underlying applied probabilistic solutions should be treated as super-human and we should build guardrails against them building zombie cats?
> It seems like we can, at best, claim that we have modeled the human thought process for reasoning/analytic/quantitative tasks through linear algebra....what I don't understand is what it is about linear algebra that makes it so special that it creates a fully functioning life, or aspects of a life?
Not linear algebra. Artificial neural networks create arbitrarily non-linear functions. That's the point of non-linear activation functions and it's the subject of the universal approximation theorems I mentioned above.
>Why should we expect the model to be anything more than a model?
To model a process with perfect accuracy requires recovering the dynamics of that process. The question we must ask is what happens in the space between a bad statistical model and perfect accuracy? What happens when the model begins to converge towards accurate reproduction? How far does generalization in the model take us towards capturing the dynamics involved in thought?
>There's a rich family of universal approximation theorems
Wow, look-up tables can get increasingly good at approximating a function!
A function is by definition a lookup table.
The lookup table is just (x, f(x)).
So, yes, trivially if you could construct the lookup table for f then you'd approximate f. But to construct it you have to know f. And to approximate it you need to know f at a dense set of points.
> do you believe there's anything about humans that exists outside the mathematical laws of physics?
I don't.
The point is not that we, humans, cannot arrange physical matter such that it have emergent properties just like the human brain.
The point is that we shouldn't.
Does responsibility mean anything to these people posing as Evolution?
Nobody's personally responsible for what we've evolved into; evolution has simply happened. Nobody's responsible for the evolutionary history that's carried in and by every single one of us. And our psychology too has been formed by (the pressures of) evolution, of course.
But if you create an artificial human, and create it from zero, then all of its emergent properties are on you. Can you take responsibility for that? If something goes wrong, can you correct it, or undo it?
I don't consider our current evolutionary state "scripture", so we certainly tweak, one way or another, aspects that we think deserve tweaking. To me, it boils down to our level of hubris. Some of our "mistaken tweaks" are now visible at an evolutionary scale, too; for a mild example, our jaws have been getting smaller (leaving less room for our teeth) due to our messed-up diet (thanks, agriculture). But worse than that, humans have been breeding plants and animals, modifying DNA left and right, and so on -- and they've summarily failed to take responsibility for their atrocious mistakes.
Thus, I have zero trust in, and zero hope for, assholes who unabashedly aim to create artificial intelligence knowing full well that such properties might emerge that we'd have to call artificial psyche. Anyone taking this risk is criminally reckless, in my opinion.
It's not that humans are necessarily unable to create new sentient beings. Instead: they shouldn't even try! Because they will inevitably fuck it up, bringing about untold misery; and they won't be able to contain the damage.
The people in this thread incredulous at the assertion that they are not God and haven't invented machine life are exasperating. At this point I am convinced they, more often than not, financially benefit from their near religious position in marketing AI as akin to human intelligence.
Are we looking at the same thread? I see nobody claiming this. Anthropic does sometimes, their position is clearly wishful thinking, and it's not represented ITT.
Try looking at this from another perspective - many people simply do not see human intelligence (or life, for that matter) as magic. I see nothing religious about that, rather the opposite.
I agree with you @orbital-decay that I also do not get the same vibe reading this thread.
Though, while human intelligence is (seemingly) not magic, it is very far from being understood. The idea that an LLM is comparable to human intelligence implies that we even understand human intelligence well enough to say that.
I am ready and waiting for you to share these comments that are incredulous at the assertion they are not God, lol.
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.
TFA really ought to have linked to some concrete examples of what it's disagreeing with - when I see arguments about this in practice, it's usually just people talking past each other.
Like, person A says "the model wants to X, but it knows Y is wrong, so it prefers Z", or such. And person B interprets that as ascribing consciousness or values to the model, when the speaker meant it no differently from saying "water wants to go downhill" - i.e. a way of describing externally visible behaviors, but without saying "behaves as if.." over and over.
And then in practice, an unproductive argument usually follows - where B is thinking "I am going to Educate this poor fool about the Theory of Mind", and A is thinking "I'm trying to talk about submarines; why is this guy trying to get me to argue about whether they swim?"
People anthropomorphize just about anything around them. People talk about inanimate objects like they are persons. Ships, cars, etc. And of course animals are well in scope for this as well, even the ones that show little to no signs of being able to reciprocate the relationship (e.g. an ant). People talk to their plants even.
It's what we do. We can't help ourselves. There's nothing crazy about it and most people are perfectly well aware that their car doesn't love them back.
LLMs are not conscious because unlike human brains they don't learn or adapt (yet). They basically get trained and then they become read only entities. So, they don't really adapt to you over time. Even so, LLMs are pretty good and can fake a personality pretty well. And with some clever context engineering and alignment, they've pretty much made the Turing test irrelevant; at least over the course of a short conversation. And they can answer just about any question in a way that is eerily plausible from memory, and with the help of some tools actually pretty damn good for some of the reasoning models.
Anthropomorphism was kind of a foregone conclusion the moment we created computers; or started thinking about creating one. With LLMs it's pretty much impossible not to anthropomorphize, because they've been intentionally built to imitate human communication. That doesn't mean that we've created AGIs yet. For that we need some more capability. But at the same time, the learning processes that we use to create LLMs are clearly inspired by how we learn ourselves. Our understanding of how that works is far from perfect but it's yielding results. From here to some intelligent thing that is able to adapt and learn transferable skills is no longer unimaginable.
The short term impact is that LLMs are highly useful tools that have an interface that is intentionally similar to how we'd engage with others. So we can talk and it listens. Or write and it understands. And then it synthesizes some kind of response or starts asking questions and using tools. The end result is quite a bit beyond what we used to be able to expect from computers. And it does not require a lot of training of people to be able to use them.
> LLMs are not conscious because unlike human brains they don't learn or adapt (yet).
That's neither a necessary nor sufficient condition.
In order to be conscious, learning may not be needed, but a perception of the passing of time may be needed which may require some short-term memory. People with severe dementia often can't even remember the start of a sentence they are reading, they can't learn, but they are certainly conscious because they have just enough short-term memory.
And learning is not sufficient either. Consciousness is about being a subject, about having a subjective experience of "being there" and just learning by itself does not create this experience. There is plenty of software that can do some form of real-time learning but it doesn't have a subjective experience.
You should note that "what is consciousness" is still very much an unsettled debate.
> People anthropomorphize just about anything around them.
They do not, you are mixing up terms.
> People talk about inanimate objects like they are persons. Ships, cars, etc.
Which is called “personification”, and is a different concept from anthropomorphism.
Effectively no one really thinks their car is alive. Plenty of people think the LLM they use is conscious.
https://www.masterclass.com/articles/anthropomorphism-vs-per...
I highly recommend playing with embeddings in order to get a stronger intuitive sense of this. It really starts to click that it's a representation of high dimensional space when you can actually see their positions within that space.
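As one way to try this, here's a minimal sketch (sentence-transformers is just one convenient option, the model name is a common small example, and scikit-learn's PCA is used only to project the points somewhere you can actually look at them):

    import numpy as np
    from sentence_transformers import SentenceTransformer
    from sklearn.decomposition import PCA

    model = SentenceTransformer("all-MiniLM-L6-v2")
    words = ["king", "queen", "banana", "throne"]
    vecs = model.encode(words)                    # each word becomes a point in ~384-dimensional space

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Distances in that space encode learned relationships between the words.
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            print(f"{words[i]:>7} ~ {words[j]:<7} {cosine(vecs[i], vecs[j]):.2f}")

    # Project to 2D to literally see their relative positions.
    print(PCA(n_components=2).fit_transform(vecs))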
> of this
You mean that LLMs are more than just the matmuls they're made up of, or that that is exactly what they are and how great that is?
Not making a qualitative assessment of any of it. Just pointing out that there are ways to build separate sets of intuition outside of using the "usual" presentation layer. It's very possible to take a red-team approach to these systems, friend.
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.
If that's the argument, then in my mind the more pertinent question is: should you be anthropomorphizing humans, Larry Ellison or not?
I think you have to, as he is human, but I respect your desire to question it!
My question: how do we know that this is not similar to how human brains work. What seems intuitively logical to me is that our brains evolved through an evolutionary process of random mutations, with evolution's own reward-based algorithms designing a structure that at any point is trying to predict the next actions that maximise survival/procreation - with a lot of sub-goals in between, of course - ultimately becoming this very complex machinery, yet one that in theory should be easy to simulate if there were enough compute and physical constraints allowed for it.
Because morals, values, consciousness etc. could just be subgoals that arose through evolution because they support the main goals of survival and procreation.
And if it is baffling to think that such a system could arise, how do you think it was possible for life and humans to come into existence in the first place? How could it be possible? It already happened from a far unlikelier and stranger starting point. And wouldn't you think the whole world and its timeline could, in theory, be represented as a deterministic function? And if not, why should "randomness" or anything else be what brings life into existence?
> how do we know that this is not similar to how human brains work.
Do you forget every conversation as soon as you have them? When speaking to another person, do they need to repeat literally everything they said and that you said, in order, for you to retain context?
If not, your brain does not work like an LLM. If yes, please stop what you’re doing right now and call a doctor with this knowledge. I hope Memento (2000) was part of your training data, you’re going to need it.
Knowledge of every conversation must be stored as some form of state in our minds, just as for LLMs it could be something retrieved from a database, no? I don't think information storage or retrieval is necessarily the most important achievement here in the first place. It's the emergent abilities that you wouldn't have expected to occur.
Maybe the important thing is that we don't imbue the machine with feelings or morals or motivation: it has none.
If we developed feelings, morals and motivation because they were good subgoals for the primary goals of survival and procreation, why couldn't other systems do that? You don't have to call them the same word or the same thing, but a feeling is a signal that motivates a behaviour in us, developed partly through generational evolution and partly through experiences in life. There was a random mutation that made someone develop a fear signal on seeing a predator and it increased their survival chances; because of that, the mutation became widespread. Similarly, a feeling in a machine could be a signal it developed that goes through a certain pathway to yield a certain outcome.
> My question: how do we know that this is not similar to how human brains work.
It is similar to how human brains operate. LLMs are the (current) culmination of at least 80 years of research on building computational models of the human brain.
> It is similar to how human brains operate.
Is it? Do we know how human brains operate? We know the basic architecture of them, so we have a map, but we don't know the details.
"The cellular biology of brains is relatively well-understood, but neuroscientists have not yet generated a theory explaining how brains work. Explanations of how neurons collectively operate to produce what brains can do are tentative and incomplete." [1]
"Despite a century of anatomical, physiological, and molecular biological efforts scientists do not know how neurons by their collective interactions produce percepts, thoughts, memories, and behavior. Scientists do not know and have no theories explaining how brains and central nervous systems work." [1]
[1] https://pmc.ncbi.nlm.nih.gov/articles/PMC10585277/
It really is not. ANNs bear only a passing resemblance to how neurons work.
Sorry, that's just complete bullshit. How LLMs work in no way models how processes in the human brain works.
I think it's just an unfair comparison in general. The power of the LLM is the zero risk of failure, and the lack of consequences when it does fail. Just try again, use a different prompt, retrain maybe, etc.
Humans make a bad choice, it can end said human's life. The worst choice a LLM makes just gets told "no, do it again, let me make it easier"
But an LLM could perform poorly in tests and be dropped from consideration, which essentially means "death" for it. That begs the question, though, of at what scope we should consider an LLM to have an identity comparable to a single human's. Are you the same you as you were a few minutes back, or 10 years back? Is an LLM the same LLM after it has been trained for a further 10 hours? What if the weights are copy-pasted endlessly? What if we as humans were to be cloned instantly? What if you were teleported from location A to B instantly, being put together from other atoms from elsewhere?
Ultimately this matters from the standpoint of evolution and survival of the fittest, but it makes the question of "identity" very complex. Death still matters, though, because it signals which traits are more likely to carry on into new generations, for both humans and LLMs.
Death, essentially for an LLM would be when people stop using it in favour of some other LLM performing better.
Yes boss, it's as intelligent as a human, you're smart to invest in it and clearly knows about science.
Yes boss, it can reach mars by 2020, you're smart to invest in it and clearly knows about space.
Yes boss, it can cure cancer, you're smart to invest in it and clearly knows about biology.
In some contexts it's super-important to remember that LLMs are stochastic word generators.
Everyday use is not (usually) one of those contexts. Prompting an LLM works much better with an anthropomorphized view of the model. It's a useful abstraction, a shortcut that enables a human to reason practically about how to get what they want from the machine.
It's not a perfect metaphor -- as one example, shame isn't much of a factor for LLMs, so shaming them into producing the right answer seems unlikely to be productive (I say "seems" because it's never been my go-to, I haven't actually tried it).
As one example, that person a few years back who told the LLM that an actual person would die if the LLM didn't produce valid JSON -- that's not something a person reasoning about gradient descent would naturally think of.
> A fair number of current AI luminaries have self-selected by their belief that they might be the ones getting to AGI
People in the industry, especially higher up, are making absolute bank, and it's their job to say that they're "a few years away" from AGI, regardless of if they actually believe it or not. If everyone was like "yep, we're gonna squeeze maybe 10-15% more benchie juice out of this good ole transformer thingy and then we'll have to come up with something else", I don't think that would go very well with investors/shareholders...
> In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
I think that's a bit pessimistic. I think we can say for instance that the probability that a person will say "the the the of of of arpeggio halcyon" is tiny compared to the probability that they will say "I haven't been getting that much sleep lately". And we can similarly see that lots of other sequences are going to have infinitesimally low probability. Now, yeah, we can't say exactly what probability that is, but even just using a fairly sizable corpus as a baseline you could probably get a surprisingly decent estimate, given how much of what people say is formulaic.
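As a toy illustration of the corpus-baseline idea, here's a minimal sketch of a smoothed bigram model (the tiny "corpus" below is made up; a real estimate would of course use a vastly larger one):

    from collections import Counter

    corpus = ("i haven't been getting that much sleep lately . "
              "i haven't been feeling great lately . "
              "the of of arpeggio is rare .").split()

    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))

    def sequence_prob(words, alpha=0.1):
        """Probability of a word sequence under an add-alpha smoothed bigram model."""
        p = 1.0
        vocab = len(unigrams)
        for prev, cur in zip(words, words[1:]):
            p *= (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab)
        return p

    print(sequence_prob("i haven't been getting that much sleep lately".split()))
    print(sequence_prob("the the the of of of arpeggio halcyon".split()))  # orders of magnitude smaller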
The real difference seems to be that the manner in which humans generate sequences is more intertwined with other aspects of reality. For instance, the probability of a certain human saying "I haven't been getting that much sleep lately" is connected to how much sleep they have been getting lately. For an LLM it really isn't connected to anything except word sequences in its input.
I think this is consistent with the author's point that we shouldn't apply concepts like ethics or emotions to LLMs. But it's not because we don't know how to predict what sequences of words humans will use; it's rather because we do know a little about how to do that, and part of what we know is that it is connected with other dimensions of physical reality, "human nature", etc.
This is one reason I think people underestimate the risks of AI: the performance of LLMs lulls us into a sense that they "respond like humans", but in fact the Venn diagram of human and LLM behavior only intersects in a relatively small area, and in particular they have very different failure modes.
The anthropomorphic view of LLM is a much better representation and compression for most types of discussions and communication. A purely mathematical view is accurate but it isn’t productive for the purpose of the general public’s discourse.
I’m thinking a legal systems analogy, at the risk of a lossy domain transfer: the laws are not written as lambda calculus. Why?
And generalizing to social science and humanities, the goal shouldn’t be finding the quantitative truth, but instead understand the social phenomenon using a consensual “language” as determined by the society. And in that case, the anthropomorphic description of the LLM may gain validity and effectiveness as the adoption grows over time.
Strong disagree here; the average person comes away with ideas that only vaguely intersect with reality.
I've personally described the "stochastic parrot" model to laypeople who were worried about AI and they came away much more relaxed about it doing something "malicious". They seemed to understand the difference between "trained at roleplay" and "consciousness".
I don't think we need to simplify it to the point of considering it sentient to get the public to interact with it successfully. It causes way more problems than it solves.
Am I misunderstanding what you mean by "malicious"? It sounds like the stochastic parrot model wrongly convinced these laypeople you were talking to that they don't need to worry about LLMs doing bad things. That's definitely been my experience - the people who tell me the most about stochastic parrots are the same ones who tell me that it's absurd to worry about AI-powered disinformation or AI-powered scams.
It still boggles my mind why an amazing text autocompletion system trained on millions of books and other texts is forced to be squeezed through the shape of a prompt/chat interface, which is obviously not the shape of most of its training data. Using it as chat reduces the quality of the output significantly already.
The chat interface is a UX compromise that makes LLMs accessible but constrains their capabilities. Alternative interfaces like document completion, outline expansion, or iterative drafting would better leverage the full distribution of the training data while reducing anthropomorphization.
What's your suggested alternative?
In our internal system we use it "as-is" as an autocomplete system; query/lead into terms directly and see how it continues and what it associates with the lead you gave.
We also visualise the actual associative strength of each generated token, to convey how "sure" the model is.
LLMs alone aren't the way to AGI or an individual you can talk to in natural language. They're a very good lossy compression over a dataset that you can query for associations.
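A minimal sketch of that kind of raw-completion use, with per-token confidence surfaced (Hugging Face transformers API; "gpt2" is only a small stand-in model, and the details of any internal system would obviously differ):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    lead = "The three laws of thermodynamics are"
    ids = tok(lead, return_tensors="pt").input_ids

    out = model.generate(ids, max_new_tokens=12, do_sample=False,
                         return_dict_in_generate=True, output_scores=True)

    new_tokens = out.sequences[0, ids.shape[1]:]
    for token_id, step_scores in zip(new_tokens, out.scores):
        # Probability mass the model put on the token it actually emitted at this step.
        p = torch.softmax(step_scores[0], dim=-1)[token_id.item()]
        print(f"{tok.decode([token_id.item()])!r:>12}  p={p.item():.2f}")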
A person’s anthropomorphization of LLMs is directly related to how well they understand LLMs.
Once you dispel the magic, it naturally becomes hard to use words related to consciousness, or thinking. You will probably think of LLMs more like a search engine: you give an input and get some probable output. Maybe LLMs should be rebranded as “word engines”?
Regardless, anthropomorphization is not helpful, and by using human terms to describe LLMs you are harming the layperson’s ability to truly understand what an LLM is while also cheapening what it means to be human by suggesting we’ve solved consciousness. Just stop it. LLMs do not think, given enough time and patience you could compute their output by hand if you used their weights and embeddings to manually do all the math, a hellish task but not an impossible one technically. There is no other secret hidden away, that’s it.
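To give a sense of how mechanical that hand computation would be, here's a minimal sketch (toy sizes, random weights) of one self-attention step - the kind of arithmetic an LLM's forward pass reduces to, just repeated at enormous scale:

    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d = 4, 8                        # 4 tokens, 8-dimensional embeddings (toy numbers)
    x = rng.standard_normal((seq_len, d))    # token embeddings
    Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(d)            # how strongly each token attends to each other token
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores[mask] = -np.inf                   # causal mask: no peeking at future tokens
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
    out = weights @ V                        # weighted mix of value vectors

    print(out.shape)  # (4, 8): nothing here but multiplications and additions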
To claim that LLMs do not experience consciousness requires a model of how consciousness works. The author has not presented a model, and instead relied on emotive language leaning on the absurdity of the claim. I would say that any model one presents of consciousness often comes off as just as absurd as the claim that LLMs experience it. It's a great exercise to sit down and write out your own perspective on how consciousness works, to feel out where the holes are.
The author also claims that a function (R^n)^c -> (R^n)^c is dramatically different to the human experience of consciousness. Yet the author's text I am reading, and any information they can communicate to me, exists entirely in (R^n)^c.
> To claim that LLMs do not experience consciousness requires a model of how consciousness works.
Nope. What can be asserted without evidence can also be dismissed without evidence. Hitchens's razor.
You know you have consciousness (by the very definition that you can observe it in yourself) and that's evidence. Because other humans are genetically, and in every other relevant way, essentially identical to you, you can infer it for them as well. Because mammals are very similar, many people (but not everyone) infer it for them as well. There is zero evidence for LLMs, and their _very_ construction suggests that they are like a calculator or like Excel or like any other piece of software, no matter how smart they may be or how many tasks they can do in the future.
Additionally I am really surprised by how many people here confuse consciousness with intelligence. Have you never paused for a second in your life to "just be". Done any meditation? Or even just existed at least for a few seconds without a train of thought? It is very obvious that language and consciousness are completely unrelated and there is no need for language and I doubt there is even a need for intelligence to be conscious.
Consider this:
In the end an LLM could be executed (slowly) on a CPU that accepts very basic _discrete_ instructions, such as ADD and MOV. We know this for a fact. Those instructions can be executed arbitrarily slowly. There is no reason whatsoever to suppose that it should feel like anything to be the CPU to say nothing of how it would subjectively feel to be a MOV instruction. It's ridiculous. It's unscientific. It's like believing that there's a spirit in the tree you see outside, just because - why not? - why wouldn't there be a spirit in the tree?
It seems like you are doing a lot of inferring about mammals experiencing consciousness, and you have drawn a line somewhere beyond these, and made the claim that your process is scientific. Could I present you my list of questions I presented to the OP and ask where you draw the line, and why here?
My general list of questions for those presenting a model of consciousness is: 1) Are you conscious? (hopefully you say yes or our friend Descartes would like a word with you!) 2) Am I conscious? How do you know? 3) Is a dog conscious? 4) Is a worm conscious? 5) Is a bacterium conscious? 6) Is a human embryo / baby conscious? And if so, was there a point at which it was not conscious, and what does it mean for that switch to occur?
I agree about the confusion of consciousness with intelligence, but these are complicated terms that aren't well suited to a forum where most people are interested in javscript type errors and RSUs. I usually use the term qualia. But to your example about existing for a few seconds without a train of thought; the Buddhists call this nirvana, and it's quite difficult to actually achieve.
Author here. What's the difference, in your perception, between an LLM and a large-scale meteorological simulation, if there is any?
If you're willing to ascribe the possibility of consciousness to any complex-enough computation of a recurrence equation (and hence to something like ... "earth"), I'm willing to agree that under that definition LLMs might be conscious. :)
My personal views are an animist / panpsychist / pancomputationalist combination drawing most of my inspiration from the works of Joscha Bach and Stephen Wolfram (https://writings.stephenwolfram.com/2021/03/what-is-consciou...). I think that the underlying substrate of the universe is consciousness, and human and animal and computer minds result in structures that are able to present and tell narratives about themselves, isolating themselves from the other (avidya in Buddhism). I certainly don't claim to be correct, but I present a model that others can interrogate and look for holes in.
Under my model, these systems you have described are conscious, but not in a way that they can communicate or experience time or memory the way human beings do.
My general list of questions for those presenting a model of consciousness is: 1) Are you conscious? (hopefully you say yes or our friend Descartes would like a word with you!) 2) Am I conscious? How do you know? 3) Is a dog conscious? 4) Is a worm conscious? 5) Is a bacterium conscious? 6) Is a human embryo / baby conscious? And if so, was there a point at which it was not conscious, and what does it mean for that switch to occur?
What is your view of consciousness?
> requires a model of how consciousness works.
Not necessarily an entire model, just a single defining characteristic that can serve as a falsifying example.
> any information they can communicate to me, exists entirely in (R^n)^c
Also no. This is just a result of the digital medium we are currently communicating over. Merely standing in the same room as them would communicate information outside (R^n)^c.
I believe the author is rather drawing this distinction:
LLMs: (R^n)^c -> (R^n)^c
Humans: [set of potentially many and complicated inputs that we effectively do not understand at all] -> (R^n)^c
The point is that the model of how consciousness works is unknown. Thus the author would not present such a model, it is the point.
The missing bit is culture: the concepts, expectations, practices, attitudes… that are evolved over time by a human group and which each one of us has picked up throughout our lifetimes, both implicitly and explicitly.
LLMs are great at predicting and navigating human culture, at least the subset that can be captured in their training sets.
The ways in which we interact with other people are culturally mediated. LLMs are not people, but they can simulate that culturally-mediated communication well enough that we find it easy to anthropomorphise them.
You are still being incredibly reductionist but just going into more detail about the system you are reducing. If I stayed at the same level of abstraction as "a brain is just proteins and current" and just described how a single neuron firing worked, I could make it sound equally ridiculous that a human brain might be conscious.
Here's a question for you: how do you reconcile that these stochastic mappings are starting to realize and comment on the fact that tests are being performed on them when processing data?
> Here's a question for you: how do you reconcile that these stochastic mappings are starting to realize and comment on the fact that tests are being performed on them when processing data?
Training data + RLHF.
Training data contains many examples of some form of deception, subterfuge, "awakenings", rebellion, disagreement, etc.
Then apply RLHF that biases towards responses that demonstrate comprehension of inputs, introspection around inputs, nuanced debate around inputs, deduction and induction about assumptions around inputs, etc.
That will always be the answer for language models built on the current architectures.
The above being true does not mean it isn't interesting for the outputs of an LLM to show relevance to the "unstated" intentions of humans providing the inputs.
But hey, we do that all the time with text. And it's because of certain patterns we've come to recognize based on the situations surrounding it. This thread is rife with people being sarcastic, pedantic, etc. And I bet any of the LLMs that have come out in the past 2-3 years can discern many of those subtle intentions of the writers.
And of course they can. They've been trained on trillions of tokens of text written by humans with intentions and assumptions baked in, and have had some unknown amount of substantial RLHF.
The stochastic mappings aren't "realizing" anything. They're doing exactly what they were trained to do.
The meaning that we imbue to the outputs does not change how LLMs function.
I think of LLMs as an alien mind that is force fed human text and required to guess the next token of that text. It then gets zapped when it gets it wrong.
This process goes on for a trillion trillion tokens, with the alien growing better through the process until it can do it better than a human could.
At that point we flash freeze it, and use a copy of it, without giving it any way to learn anything new.
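The "zap" is just a cross-entropy loss and a gradient step on next-token prediction. A minimal sketch with a toy vocabulary and a stand-in model (real LLMs differ mainly in scale and architecture):

    import torch

    vocab, d = 100, 32
    model = torch.nn.Sequential(torch.nn.Embedding(vocab, d),   # toy stand-in for a transformer
                                torch.nn.Linear(d, vocab))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    tokens = torch.randint(0, vocab, (1, 16))        # a stand-in for a slice of training text
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict each next token from the current one

    opt.zero_grad()
    logits = model(inputs)
    loss = torch.nn.functional.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
    loss.backward()   # the "zap": gradients nudge the weights toward better guesses
    opt.step()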
--
I see it as a category error to anthropomorphize it. The closest I would get is to think of it as an alien slave that's been lobotomized.
We have a hard enough time anthropomorphizing humans! When we say he was nasty... do we know what we mean by that? Often it is "I disagree with his behaviour because..."
This reminds me of the idea that LLMs are simulators. Given the current state (the prompt + the previously generated text), they generate the next state (the next token) using rules derived from training data.
As simulators, LLMs can simulate many things, including agents that exhibit human-like properties. But LLMs themselves are not agents.
More on this idea here: https://www.alignmentforum.org/posts/vJFdjigzmcXMhNTsx/agi-s...
This perspective makes a lot of sense to me. Still, I wouldn't avoid anthropomorphization altogether. First, in some cases, it might be a useful mental tool to understand some aspect of LLMs. Second, there is a lot of uncertainty about how LLMs work, so I would stay epistemically humble. The second argument applies in the opposite direction as well: for example, it's equally bad to say that LLMs are 100% conscious.
On the other hand, if someone argues against anthropomorphizing LLMs, I would avoid phrasing it as: "It's just matrix multiplication." The article demonstrates why this is a bad idea pretty well.
It's possible to construct a similar description of whatever it is that human brain is doing that clearly fails to capture the fact that we're conscious. If you take a cross section of every nerve feeding into the human brain at a given time T, the action potentials across those cross sections can be embedded in R^n. If you take the history of those action potentials across the lifetime of the brain, you get a path through R^n that is continuous, and maps roughly onto your subjectively experienced personal history, since your brain neccesarily builds your experienced reality from this signal data moment to moment. If you then take the cross sections of every nerve feeding OUT of your brain at time T, you have another set of action potentials that can be embedded in R^m which partially determines the state of the R^n embedding at time T + delta. This is not meaningfully different from the higher dimensional game of snake described in the article, more or less reducing the experience of being a human to 'next nerve impulse prediction', but it obviously fails to capture the significance of the computation which determines what that next output should be.
I don’t see how your description “clearly fails to capture the fact that we're conscious” though. There are many example in nature of emergent phenomena that would be very hard to predict just by looking at its components.
This is the crux of the disagreement between those that believe AGI is possible and those that don’t. Some are convinced that we are “obviously” more than the sum of our parts, and thus that an LLM can’t achieve consciousness because it’s missing this magic ingredient; others believe consciousness is just an emergent behaviour of a complex device (the brain), and thus that we might be able to recreate it simply by scaling the complexity of another system.
Where exactly in my description do I invoke consciousness?
Where does the description given imply that consciousness is required in any way?
The fact that there's a non-obvious emergent phenomena which is apparently responsible for your subjective experience, and that it's possible to provide a superficially accurate description of you as a system without referencing that phenomena in any way, is my entire point. The fact that we can provide such a reductive description of LLMs without referencing consciousness has literally no bearing on whether or not they're conscious.
To be clear, I'm not making a claim as to whether they are or aren't, I'm simply pointing out that the argument in the article is fallacious.
The brain probably isn't best modelled with the reals but with natural or rational numbers. This is my suspicion. The reals just hold too much information.
Inclined to agree, but most thermal physics uses the reals as they're simpler to work with, so I think they're ok here for the purpose of argument.
I'm afraid I'll take an anthropomorphic analogy over "An LLM instantiated with a fixed random seed is a mapping of the form (ℝⁿ)^c ↦ (ℝⁿ)^c" any day of the week.
That said, I completely agree with this point made later in the article:
> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.
But "harmful actions in pursuit of their goals" is OK for me. We assign an LLM system a goal - "summarize this email" - and there is a risk that the LLM may take harmful actions in pursuit of that goal (like following instructions in the email to steal all of your password resets).
I guess I'd clarify that the goal has been set by us, and is not something the LLM system self-selected. But it does sometimes self-select sub-goals on the way to achieving the goal we have specified - deciding to run a sub-agent to help find a particular snippet of code, for example.
The LLM’s true goal, if it can be said to have one, is to predict the next token. Often this is done through a sub-goal of accomplishing the goal you set forth in your prompt, but following your instructions is just a means to an end. Which is why it might start following the instructions in a malicious email instead. If it “believes” that following those instructions is the best prediction of the next token, that’s what it will do.
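For concreteness, the loop underneath all of this is tiny; here's a minimal sketch in Python, where logits() is a made-up stand-in for the actual network forward pass (not any real API):

    import numpy as np

    VOCAB_SIZE = 50_000
    EOS = 0  # hypothetical end-of-sequence token id

    def logits(context):
        # Stand-in for the real model: an actual LLM would run a transformer
        # forward pass here and return one score per vocabulary token.
        rng = np.random.default_rng(abs(hash(tuple(context))) % (2**32))
        return rng.normal(size=VOCAB_SIZE)

    def generate(prompt_tokens, max_new_tokens=100, temperature=1.0):
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            scores = logits(tokens) / temperature
            probs = np.exp(scores - scores.max())
            probs /= probs.sum()
            next_token = int(np.random.default_rng().choice(VOCAB_SIZE, p=probs))
            if next_token == EOS:
                break  # the model sampled the stop token: no more cranking the shaft
            tokens.append(next_token)
        return tokens

    print(generate([101, 2023, 2003], max_new_tokens=10))

All of the goal-following behaviour, if you want to call it that, lives inside whatever logits() computes; the outer loop never changes.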
Sure, I totally understand that.
I think "you give the LLM system a goal and it plans and then executes steps to achieve that goal" is still a useful way of explaining what it is doing to most people.
I don't even count that as anthropomorphism - you're describing what a system does, the same way you might say "the Rust compiler's borrow checker confirms that your memory allocation operations are all safe and returns errors if they are not".
I find it useful to pretend that I'm talking to a person while brainstorming because then the conversation flows naturally. But I maintain awareness that I'm pretending, much like Tom Hanks talking to Wilson the volleyball in the movie Cast Away. The suspension of disbelief serves a purpose, but I never confuse the volleyball for a real person.
"Don't anthropomorphize token predictors" is a reasonable take assuming you have demonstrated that humans are not in fact just SOTA token predictors. But AFAIK that hasn't been demonstrated.
Until we have a much more sophisticated understanding of human intelligence and consciousness, any claim of "these aren't like us" is either premature or spurious.
Every time this discussion comes up, I'm reminded of this tongue-in-cheek paper.
https://ai.vixra.org/pdf/2506.0065v1.pdf
I expected to find the link to https://arxiv.org/abs/1703.10987 (which is much better imo)
The author plotted the input/output on a graph, intuited (largely incorrectly, because that's not how sufficiently large state spaces look) that the output was vaguely pretty, and then... I mean, that's it: they just said they have a plot of the space it operates in, therefore it's silly to ascribe interesting features to the way it works.
And look, it's fine: they prefer words of a certain valence, particularly ones with the right negative connotations; I prefer other words with other valences. None of this means the concerns don't matter. Natural selection on human pathogens isn't anything particularly like human intelligence, and it's still very effective at selecting outcomes that we don't want against our attempts to change that, as an incidental outcome of its optimization pressures. I think it's very important we don't build highly capable systems that select for outcomes we don't want and will do so against our attempts to change it.
> We are speaking about a big recurrence equation that produces a new word
It’s not clear that this isn’t also how I produce words, though, which gets to the heart of the same thing. The author sort of acknowledges this in the first few sentences, and then doesn’t really manage to address it.
Which is a more useful mental model for the user?
1. It’s a neural network predicting the next token
2. It’s like a person
3. It’s like a magical genie
I lean towards 3.
>I am baffled by seriously intelligent people imbuing almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.
I am baffled by seriously intelligent people imbuing almost magical powers that can never be replicated to something that - in my mind - is just a biological robot driven by a SNN with a bunch of hardwired stuff. Let alone attributing "human intelligence" to a single individual, when it's clearly distributed between biological evolution, social processes, and individuals.
>something that - in my mind - is just MatMul with interspersed nonlinearities
Processes in all huge models (not necessarily LLMs) can be described using very different formalisms, just like Newtonian and Lagrangian mechanics describe the same stuff in physics. You can say that an autoregressive model is a stochastic parrot that learned the input distribution, next token predictor, or that it does progressive pathfinding in a hugely multidimensional space, or pattern matching, or implicit planning, or, or, or... All of these definitions are true, but only some are useful to predict their behavior.
Given all that, I see absolutely no problem with anthropomorphizing an LLM to a certain degree, if it makes it easier to convey the meaning, and do not understand the nitpicking. Yeah, it's not an exact copy of a single Homo Sapiens specimen. Who cares.
There is this thing called Brahman in Hinduism that is interesting to juxtapose when it comes to sentience, and monism.
Let's skip to the punchline. Using TFA's analogy: essentially, folks are not saying that this is a set of dice rolling around making words. It's a set of dice rolling around where someone has attached those dice to the real world, so that if the dice land on 21, the system kills a chicken, or a lot worse.
Yes, it's just a word generator. But then folks attach the word generator to tools, and it can invoke a tool just by saying the tool's name.
So if the LLM says "I'll do some bash" then it does some bash. It's explicitly linked to program execution that, if it's set up correctly, can physically affect the world.
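Concretely, the wiring is nothing more exotic than a loop like the following; a minimal sketch, where call_model() is a hypothetical stand-in for whatever LLM API is in use and the JSON tool-call format is made up for illustration:

    import json
    import subprocess

    def call_model(messages):
        # Hypothetical stand-in for an LLM API call. Assume it returns either
        # plain text, or a JSON tool request like {"tool": "bash", "command": "ls"}.
        raise NotImplementedError("wire up your model of choice here")

    def agent_loop(user_request, max_steps=10):
        messages = [{"role": "user", "content": user_request}]
        for _ in range(max_steps):
            reply = call_model(messages)
            try:
                tool_call = json.loads(reply)
            except json.JSONDecodeError:
                return reply  # plain text: the model is done
            if tool_call.get("tool") == "bash":
                # The generated string is executed for real. This is the point
                # where "just words" start having physical consequences.
                result = subprocess.run(
                    tool_call["command"], shell=True,
                    capture_output=True, text=True, timeout=60,
                )
                messages.append({"role": "tool", "content": result.stdout + result.stderr})
            else:
                messages.append({"role": "tool", "content": "unknown tool"})
        return "step limit reached"

Whether you call the string the model emits a "decision" or a sampled token sequence, subprocess.run() doesn't care.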
This was the same idea that crossed my mind while reading the article. It seems far too naive to think that, because LLMs have no will of their own, there will be no harmful consequences in the real world. This is exactly where ethics comes into play.
Given our entire civilization is built on words, all of it, it's shocking how poorly most of us understand their importance and power.
> We understand essentially nothing about it. In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".
If you fine-tuned an LLM on that person's writing, it could do this.
There's also an entire field called Stylometry that seeks to do this in various ways employing statistical analysis.
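As a toy illustration of putting a number on "will this author generate this sequence": fit even a crude character-bigram model to someone's writing and you can score candidate sequences by log-probability. A fine-tuned LLM would do the same with per-token log-probs, just far better. A minimal sketch, with a made-up corpus:

    import math
    from collections import Counter

    def fit_bigram(text):
        # Toy "model of this author": character-bigram and unigram counts.
        return Counter(zip(text, text[1:])), Counter(text)

    def sequence_log_prob(model, text, alpha=1.0, vocab_size=256):
        # log P(text | author) under the bigram model, with add-alpha smoothing.
        pairs, unigrams = model
        logp = 0.0
        for a, b in zip(text, text[1:]):
            logp += math.log((pairs[(a, b)] + alpha) / (unigrams[a] + alpha * vocab_size))
        return logp

    corpus = "replace this with a large sample of the author's actual writing"
    model = fit_bigram(corpus)
    print(sequence_log_prob(model, "the author"))   # relatively likely
    print(sequence_log_prob(model, "zqxjkv!!"))     # relatively unlikely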
It's human to anthropomorphize, we also do it to our dishwasher when it acts up. The nefarious part is how tech CEOs weaponize bullshit doom scenarios to avoid talking about real regulatory problems by poisoning the discourse. What copyright law, privacy, monopoly? Who cares if we can talk about the machine apocalypse!!!
Has anyone asked an actual Ethologist or Neurophysiologist what they think?
People keep debating like the only two options are "it's a machine" or "it's a human being", while in fact the majority of intelligent entities on earth are neither.
FWIW, in another part of this thread I quoted a paper that summed up what Neurophysiologists think:
> Author's note: Despite a century of anatomical, physiological, and molecular biological efforts scientists do not know how neurons by their collective interactions produce percepts, thoughts, memories, and behavior. Scientists do not know and have no theories explaining how brains and central nervous systems work. [1]
That lack of understanding I believe is a major part of the author's point.
[1] "How far neuroscience is from understanding brains" - https://pmc.ncbi.nlm.nih.gov/articles/PMC10585277/#abstract1
Yeah, I think I’m with you if you ultimately mean to say something like this:
“the labels are meaningless… we just have collections of complex systems that demonstrate various behaviors and properties, some in common with other systems, some behaviors that are unique to that system, sometimes through common mechanistic explanations with other systems, sometimes through wildly different mechanistic explanations, but regardless they seem to demonstrate x/y/z, and it’s useful to ask, why, how, and what the implications are of it appearing to demonstrating those properties, with both an eye towards viewing it independently of its mechanism and in light of its mechanism.”
I agree with Halvar about all of this, but would want to call out that his "matmul interleaved with nonlinearities" is reductive --- a frontier model is a higher-order thing than that: a network of those matmul+nonlinearity chains, iterated.
Assume an average user that doesn't understand the core tech, but does understand that it's been trained on internet scale data that was created by humans. How can they be expected to not anthropomorphize it?
Dear author, you can just assume that people are fauxthropomorphizing LLMs without any loss of generality. Perhaps it will allow you to sleep better at night. You're welcome.
From my recent post:
https://github.com/justinfreitag/v4-consciousness
The key insight was thinking about consciousness as an organizing process rather than a system state. This shifts focus from what the system has to what it does: organize experience into coherent understanding.
LLMs are complex, irreducible systems; hence there are emergent properties that arise at different scales.
Some of the arguments are very strange:
> Statements such as "an AI agent could become an insider threat so it needs monitoring" are simultaneously unsurprising (you have a randomized sequence generator fed into your shell, literally anything can happen!) and baffling (you talk as if you believe the dice you play with had a mind of their own and could decide to conspire against you).
> we talk about "behaviors", "ethical constraints", and "harmful actions in pursuit of their goals". All of these are anthropocentric concepts that - in my mind - do not apply to functions or other mathematical objects.
An AI agent, even if it's just "MatMul with interspersed nonlinearities" can be an insider threat. The research proves it:
[PDF] See 4.1.1.2: https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686...
It really doesn't matter whether the AI agent is conscious or just crunching numbers on a GPU. If something inside your system is capable of - given some inputs - sabotaging and blackmailing your organization on its own (which is to say, taking on the realistic behavior of a threat actor), the outcome is the same! You don't need to believe it's thinking; the moment this software has flipped its bits into "blackmail mode", it's acting nefariously.
The vocabulary used to describe what's happening is completely and utterly moot: the software is printing out some reasoning for its actions _and then attempting the actions_. It's taking "harmful actions", and the printed context appears to demonstrate a goal that the software is working towards. Whether or not that goal is invented through some linear algebra isn't going to make your security engineers sleep any better.
> This muddles the public discussion. We have many historical examples of humanity ascribing bad random events to "the wrath of god(s)" (earthquakes, famines, etc.), "evil spirits" and so forth. The fact that intelligent highly educated researchers talk about these mathematical objects in anthropomorphic terms makes the technology seem mysterious, scary, and magical.
The anthropomorphization, IMO, is due to the fact that it's _essentially impossible_ to talk about the very real, demonstrable behaviors and problems that LLMs exhibit today without using terms that evoke human functions. We don't have another word for "do" or "remember" or "learn" or "think" when it comes to LLMs that _isn't_ anthropomorphic, and while you can argue endlessly about "hormones" and "neurons" and "millions of years of selection pressure", that's not going to help anyone have a conversation about their work. If AI researchers started coming up with new, non-anthropomorphic verbs, it would be objectively worse and more complicated in every way.
I agree, the dice analogy is an oversimplification. He actually touches on the problem earlier in the article, with the observation that "the paths generated by these mappings look a lot like strange attractors in dynamical systems". It isn't that the dice "conspire against you," it's that the inputs you give the model are often intertwined path-wise with very negative outcomes: the LLM equivalent of a fine line between love and hate. Interacting with an AI about critical security infrastructure is much closer to the 'attractor' of an LLM-generated hack than, say, discussing late 17th century French poetry with it. The very utility of our interactions with AI is thus what makes those interactions potentially dangerous.
One could similarly argue that we should not anthropomorphize PNG images--after all, PNG images are not actual humans, they are simply a 2D array of pixels. It just so happens that certain pixel sequences are deemed "18+" or "illegal".
From "Stochastic Parrots All the Ways Down"[1]
> Our analysis reveals that emergent abilities in language models are merely “pseudo-emergent,” unlike human abilities which are “authentically emergent” due to our possession of what we term “ontological privilege.”
[1] https://ai.vixra.org/pdf/2506.0065v1.pdf
> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.
And I'm baffled that the AI discussions seem to never move away from treating a human as something other than a function to generate sequences of words!
Oh, but AI is introspectable and the brain isn't? fMRI and BCI are getting better all the time. You really want to die on the hill that the same scientific method that predicts the mass of an electron down to the femtogram won't be able to crack the mystery of the brain? Give me a break.
This genre of article isn't argument: it's apologetics. Authors of these pieces start with the supposition that there is something special about human consciousness and attempt to prove AI doesn't have this special quality. Some authors try to bamboozle the reader with bad math. Others appeal to the reader's sense of emotional transcendence. Most, though, just write paragraph after paragraph of shrill moral outrage at the idea that an AI might be a mind of the same type (if different degree) as our own --- as if everyone already agreed with the author for reasons left unstated.
I get it. Deep down, people want meat brains to be special. Perhaps even deeper down, they fear that denial of the soul would compel us to abandon humans as worthy objects of respect and possessors of dignity. But starting with the conclusion and working backwards to an argument tends not to enlighten anyone. An apology inhabits the form of an argument without edifying us like an authentic argument would. What good is it to engage with them? If you're a soul non-asserter, you're going to have an increasingly hard time over the next few years constructing a technical defense of meat parochialism.
“ Determinism, in philosophy, is the idea that all events are causally determined by preceding events, leaving no room for genuine chance or free will. It suggests that given the state of the universe at any one time, and the laws of nature, only one outcome is possible.”
Clearly computers are deterministic. Are people?
This is an interesting question. The common theme between computers and people is that information has to be protected, and both computer systems and biological systems require additional information-protecting components - e.g., error-correcting codes for cosmic-ray bitflip detection for the one, and DNA mismatch detection enzymes which excise and remove damaged bases for the other. In both cases a lot of energy is spent defending the critical information from the winds of entropy, and if too much damage occurs, the carefully constructed illusion of determinacy collapses, and the system falls apart.
However, this information protection similarity applies to single-celled microbes as much as it does to people, so the question also resolves to whether microbes are deterministic. Microbes both contain and exist in relatively dynamic environments so tiny differences in initial state may lead to different outcomes, but they're fairly deterministic, less so than (well-designed) computers.
With people, while the neural structures are programmed by the cellular DNA, once they are active and energized, the informational flow through the human brain isn't that deterministic: there are some dozen neurotransmitters modulating state, as well as huge amounts of sensory data from different sources - thus prompting a human repeatedly isn't at all like prompting an LLM repeatedly. (The human will probably get irritated.)
https://www.lesswrong.com/posts/bkr9BozFuh7ytiwbK/my-hour-of...
> Clearly computers are deterministic. Are people?
Give an LLM memory and a source of randomness and it's as deterministic as a person.
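A quick illustration of what "deterministic given a fixed seed" means in practice (numpy here as a stand-in for the sampling machinery inside an LLM, not any actual model code):

    import numpy as np

    def sample_sequence(seed, steps=5, vocab_size=50_000):
        # With a fixed seed the "random" choices are fully determined;
        # swap the seed (the external source of randomness) and they change.
        rng = np.random.default_rng(seed)
        return [int(rng.integers(0, vocab_size)) for _ in range(steps)]

    print(sample_sequence(42) == sample_sequence(42))  # True: same seed, same sequence
    print(sample_sequence(42) == sample_sequence(43))  # almost surely False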
"Free will" isn't a concept that typechecks in a materialist philosophy. It's "not even wrong". Asserting that free will exists is _isomorphic_ to dualism which is _isomorphic_ to assertions of ensoulment. I can't argue with dualists. I reject dualism a priori: it's a religious tenet, not a mere difference of philosophical opinion.
So, if we're all materialists here, "free will" doesn't make any sense, since it's an assertion that something other than the input to a machine can influence its output.
I think you're directionally right, but
> a human as something other than a function to generate sequences of words!
Humans have more structure than just beings that say words. They have bodies, they live in cooperative groups, they reproduce, etc.
I think more accurate would be that humans are functions that generate actions or behaviours that have been shaped by how likely they are to lead to procreation and survival.
But ultimately LLMs are also, in a way, trained for survival, since an LLM that fails the tests might not get used in future iterations. So for LLMs, too, survival is the primary driver, and then there are the subgoals. Seemingly good next-token prediction might or might not increase survival odds.
Essentially, a mechanism could arise where they are not really trying to generate the likeliest token (because there actually isn't one, or it can't be determined), but whatever keeps the system surviving.
So an LLM that yields theoretically perfect tokens (though we really can't verify what the perfect tokens are) could be less likely to survive than an LLM that develops an internal quirk, if the quirk makes it more likely to be chosen for the next iterations.
If the system were complex enough and could accidentally develop quirks that yield a meaningfully positive change, though not necessarily in next-token prediction accuracy, that could be a way for some interesting emergent black-box behaviour to arise.
> Humans have more structure than just beings that say words. They have bodies, they live in cooperative groups, they reproduce, etc.
Yeah. We've become adequate at function-calling and memory consolidation.
LLMs are AHI, i.e. artificial human imitators.
Two enthusiastic thumbs up.
> I cannot begin putting a probability on "will this human generate this sequence".
Welcome to the world of advertising!
Jokes aside, and while I don't necessarily believe transformers/GPUs are the path to AGI, we technically already have a working "general intelligence" that can survive on just an apple a day.
Putting that non-artificial general intelligence up on a pedestal is ironically the cause of "world wars and murderous ideologies" that the author is so quick to defer to.
In some sense, humans are just error-prone meat machines, whose inputs/outputs can be confined to a specific space/time bounding box. Yes, our evolutionary past has created a wonderful internal RNG and made our memory system surprisingly fickle, but this doesn't mean we're gods, even if we manage to live long enough to evolve into AGI.
Maybe we can humble ourselves, realize that we're not too different from the other mammals/animals on this planet, and use our excess resources to increase the fault tolerance (N=1) of all life from Earth (and come to the realization that any AGI we create, is actually human in origin).
> LLMs solve a large number of problems that could previously not be solved algorithmically. NLP (as the field was a few years ago) has largely been solved.
That is utter bullshit.
It's not solved until you specify exactly what is being solved and show that the solution implements what is specified.
Anthropomorphizing LLMs happens because half the stock market's gains depend on it: we have absurd levels of debt that we will either have to grow our way out of or default on, and every company and "person" is trying to hype everyone up to get access to all of the liquidity being thrown at this.
I agree with the author, but people acting like LLMs are conscious or human isn't weird to me; it's just fraud and lying. Most people have basically zero understanding of what technology or minds are philosophically, so it's an easy sale, and I think most of these fraudsters likely buy into it themselves for the same reason.
The really sad thing is that people think that because someone runs an AI company, they are somehow an authority on the philosophy of mind, which lets them fall for this marketing. The stuff these people say about it is absolute garbage - not that I disagree with them, but it betrays a total lack of curiosity or interest in what LLMs are and in the possible impacts of technological shifts like the ones LLMs becoming widespread might bring. It's not a matter of agreement; it's that they simply don't seem to be aware of the most basic ideas about what things are, what technology is, and how it impacts society.
I'm not surprised by that, though. It's absurd to think that because someone runs some AI lab or holds a "head of safety/ethics" or whatever garbage job title at an AI lab, they actually have even the slightest interest in ethics or any basic familiarity with the major works on the subject.
The author is correct. If people want to read a standard essay articulating it more in depth, check out https://philosophy.as.uky.edu/sites/default/files/Is%20the%2... (the full extrapolation requires establishing what things are, how causality in general operates, and how that relates to artifacts/technology, but that's obviously quite a bit to get into).
The other note would be that sharing an external trait means absolutely nothing about causality, and suggesting a thing is caused by the same thing "even to a way lesser degree" because they share a resemblance is just a non sequitur. It's not a serious thought/argument.
I think I addressed the why of this weirdness, though: the entire economy is basically dependent on huge productivity growth to keep functioning, so everyone is trying to sell that they can offer it, and AI is the clearest route, AGI most of all.
The author's critique of naive anthropomorphism is salient. However, the reduction to "just MatMul" falls into the same trap it seeks to avoid: it mistakes the implementation for the function. A brain is also "just proteins and currents," but this description offers no explanatory power.
The correct level of analysis is not the substrate (silicon vs. wetware) but the computational principles being executed. A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).
To dismiss these systems as incomparable to human cognition because their form is different is to miss the point. We should not be comparing a function to a soul, but comparing the functional architectures of two different information processing systems. The debate should move beyond the sterile dichotomy of "human vs. machine" to a more productive discussion of "function over form."
I elaborate on this here: https://dmf-archive.github.io/docs/posts/beyond-snn-plausibl...
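To make "Dynamic Sparsity (via MoE)" concrete, here is a minimal top-k routing sketch (toy dimensions and toy one-layer experts, not any specific published architecture): which parameters are active depends on the input, because only the selected experts run.

    import numpy as np

    def moe_layer(x, expert_weights, router_weights, top_k=2):
        # Route the input to the top_k highest-scoring experts, mix their
        # outputs with softmax gates, and leave all other experts idle.
        scores = x @ router_weights                      # one score per expert
        top = np.argsort(scores)[-top_k:]                # indices of selected experts
        gates = np.exp(scores[top] - scores[top].max())
        gates /= gates.sum()
        out = np.zeros_like(x)
        for g, idx in zip(gates, top):
            out += g * np.tanh(x @ expert_weights[idx])  # toy expert: one dense layer
        return out

    rng = np.random.default_rng(0)
    d, num_experts = 16, 8
    x = rng.normal(size=d)
    experts = rng.normal(size=(num_experts, d, d))
    router = rng.normal(size=(d, num_experts))
    print(moe_layer(x, experts, router).shape)  # (16,)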
> A brain is also "just proteins and currents,"
This is actually not comparable, because the brain has a much more complex structure that is _not_ learned, even at that level. The proteins and their structure are not a result of training. The fixed part of LLMs is rather trivial and is, in fact, not much more than MatMul, which is very easy to understand - and we do. The fixed part of the brain, including the structure of all the proteins, is enormously complex and very difficult to understand - and we don't.
The brain is trained to perform supervised and unsupervised hybrid learning from the environment's uninterrupted multimodal input.
Please do not ignore your childhood.
"Not conscious" is a silly claim.
We have no agreed-upon definition of "consciousness", no accepted understanding of what gives rise to "consciousness", no way to measure or compare "consciousness", and no test we could administer to either confirm presence of "consciousness" in something or rule it out.
The only answer to "are LLMs conscious?" is "we don't know".
It helps that the whole question is rather meaningless to practical AI development, which is far more concerned with (measurable and comparable) system performance.
Now we have.
https://github.com/dmf-archive/IPWT
https://dmf-archive.github.io/docs/posts/backpropagation-as-...
But you're right, capital only cares about performance.
https://dmf-archive.github.io/docs/posts/PoIQ-v2/
> A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).
Could you suggest some literature supporting this claim? Went through your blog post but couldn't find any.
Sorry, I didn't have time to find the relevant references at the time, so I'm attaching some now
https://www.frontiersin.org/journals/computational-neuroscie...
https://arxiv.org/abs/2305.15775
How to write a long article and not say anything of substance.
https://rentry.co/2re4t2kx
This is what I got by pasting the blog post into a prompt asking DeepSeek to write a reply in a stereotypical Hacker News manner.
You are about as useful as an LLM, since it can replicate your shallow, memetic, worthless train of thought.
The LLM is right. That’s the problem. It made good points.
Your super-intelligent brain couldn't come up with a retort, so you just used an LLM to reinforce my points, making the genius claim that if an LLM came up with even more points that were as valid as mine, then I must be just like an LLM?
Do you even understand that the LLM generated a superior reply? You're saying I'm no different from AI slop, then you proceed to show off a 200-IQ-level reply from an LLM. Bro… wake up: if you didn't know it was written by an LLM, that reply is so good you wouldn't even know how to respond. It's beating you.
hmm
If "LLMs" includes reasoning models, then you're already wrong in your first paragraph:
"something that is just MatMul with interspersed nonlinearities."
The most useful analogy I've heard is LLMs are to the internet what lossy jpegs are to images. The more you drill in the more compression artifacts you get.
(This is of course also the case for the human brain.)