Comment by tkz1312
5 days ago
Having seen LLMs so many times produce coherent, sensible and valid chains of reasoning to diagnose issues and bugs in software I work on, I am at this point in absolutely no doubt that they are thinking.
Consciousness or self-awareness is of course a different question, and one whose answer seems less clear right now.
Knee-jerk dismissing the evidence in front of your eyes because you find it unbelievable that we can achieve true reasoning via scaled matrix multiplication is understandable, but also betrays a lack of imagination and flexibility of thought. The world is full of bizarre wonders and this is just one more to add to the list.
I don’t see how being critical of this is a knee-jerk response.
Thinking, like intelligence and many other words designating complex things, isn’t a simple topic. The word and concept developed in a world where it referred to human beings, and in a lesser sense, to animals.
To simply disregard that entire conceptual history and say, “well it’s doing a thing that looks like thinking, ergo it’s thinking” is the lazy move. What’s really needed is an analysis of what thinking actually means, as a word. Unfortunately everyone is loath to argue about definitions, even when that is fundamentally what this is all about.
Until that conceptual clarification happens, you can expect endless messy debates with no real resolution.
“For every complex problem there is an answer that is clear, simple, and wrong.” - H. L. Mencken
It may be that this tech produces clear, rational, chain-of-logic writeups, but it's not clear that, just because we also do that after thinking, only thinking can produce such writeups.
It's possible there is much thinking that does not happen with written word. It's also possible we are only thinking the way LLMs do (by chaining together rationalizations from probable words), and we just aren't aware of it until the thought appears, whole cloth, in our "conscious" mind. We don't know. We'll probably never know, not in any real way.
But it sure seems likely to me that we trained a system on the output to circumvent the process/physics because we don't understand that process, just as we always do with ML systems. Never before have we looked at image classifications and decided that's how the eye works, or protein folding and decided that's how biochemistry works. But here we are with LLMs - surely this is how thinking works?
Regardless, I submit that we should always treat human thought/spirit as unknowable and divine and sacred, and that anything that mimics it is a tool, a machine, a deletable and malleable experiment. If we attempt to equate human minds and machines, other problems arise, and none of them good - either the elevation of computers as some kind of "super", or the degradation of humans as just meat matrix multipliers.
The contrast between your first and last paragraph is... unexpected
> It may be that this tech produces clear, rational, chain-of-logic writeups, but it's not clear that, just because we also do that after thinking, only thinking can produce such writeups.
I appreciate the way you describe this idea, I find it likely I'll start describing it the same way. But then you go on to write:
> Regardless, I submit that we should always treat human thought/spirit as unknowable and divine and sacred, and that anything that mimics it is a tool, a machine, a deletable and malleable experiment. If we attempt to equate human minds and machines, other problems arise, and none of them good - either the elevation of computers as some kind of "super", or the degradation of humans as just meat matrix multipliers.
Which I find to be the exact argument that you started by discarding.
It's not clear that equating organic and synthetic thought will have any meaningful outcome at all, let alone one worthy of baseless anxiety that it must be bad. Equally, it seems absolutely insane to claim that anything is unknowable, and that because humanity doesn't have a clear foundational understanding, we should pretend that it's either divine or sacred. Having spent any time watching the outcome of the thoughts of people, neither divine nor sacred are reasonable attributes to apply. But more importantly, I'd submit that you shouldn't be afraid to explore things you don't know, and you shouldn't advocate for others to adopt your anxieties.
2 replies →
> Never before have we looked at image classifications and decided that's how the eye works
Actually we have, several times. But the way we arrived at those conclusions is worth observing:
1. ML people figure out how the ML mechanism works.
2. Neuroscientists independently figure out how brains do it.
3. Observe any analogies that may or may not exist between the two underlying mechanisms.
I can't help but notice how that's a scientific way of doing it. By contrast, the way people arrive at similar conclusions when talking about LLMs tends to consist of observing that two things are cosmetically similar, so they must be the same. That's not just pseudoscientific; it's the mode of reasoning that leads people to believe in sympathetic magic.
So it seems to be a semantics argument. We don't have a name for a thing that is "useful in many of the same ways 'thinking' is, except not actually consciously thinking"
I propose calling it "thunking"
I don't like it for a permanent solution, but "synthetic thought" might make a good enough placeholder until we figure this out. It feels most important to differentiate because I believe some parties have a personal interest in purposely confusing human thought with whatever LLMs are doing right now.
6 replies →
They moved the goalposts. Linux and worms think too; the question is how smart they are. And if you assume consciousness has no manifestation even in the case of humans, caring about it is pointless too.
4 replies →
Clinking? Clanker Thunking?
2 replies →
But we don't have a more rigorous definition of "thinking" than "it looks like it's thinking." You are making the mistake of accepting that a human is thinking by this simple definition, but demanding a higher, more rigorous one for LLMs.
I agree. The mechanism seems irrelevant if the results are the same. If it’s useful in the exact way that human thinking is useful then it may as well be thinking. It’s like a UFO pulling itself through the sky using gravitational manipulation while people whine that it’s not actually flying.
If we cannot say they are "thinking" or "intelligent" while we do not have a good definition--or, even more difficult, unanimous agreement on a definition--then the discussion just becomes about output.
They are doing useful stuff, saving time, etc., which can be measured. Thus the definition of AGI has also largely become: "can produce or surpass the economic output of a human knowledge worker".
But I think this detracts from the more interesting discussion of what they are more essentially. So, while I agree that we should push on getting our terms defined, I think I'd rather work with a hazy definition than derail so many AI discussions to mere economic output.
Here's a definition: how impressive is the output relative to the input? And by input, I don't just mean the prompt, but all the training data itself.
Do you think someone who has only ever studied pre-calc would be able to work through a calculus book if they had sufficient time? How about a multi-variable calc book? How about grad-level mathematics?
IMO intelligence and thinking are strictly about this ratio: what can you extrapolate from the smallest amount of information possible, and why? From this perspective, I don't think any of our LLMs are remotely intelligent, despite what our tech leaders say.
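Put as a rough formula (a sketch of the idea above, not any standard measure):

    \mathrm{intelligence} \;\propto\; \frac{\text{quality of what can be extrapolated}}{\text{information in the prompt} + \text{information in the training data}}

By this ratio, a model that has ingested most of the written internet gets little credit for reproducing grad-level mathematics.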
31 replies →
The discussion about “AGI” is somewhat pointless, because the term is nebulous enough that it will probably end up being defined as whatever comes out of the ongoing huge investment in AI.
Nevertheless, we don’t have a good conceptual framework for thinking about these things, perhaps because we keep trying to apply human concepts to them.
The way I see it, an LLM crystallises a large (but incomplete and disembodied) slice of human culture, as represented by its training set. The fact that an LLM is able to generate human-sounding language
9 replies →
Personally I think that kind of discussion is fruitless, not much more than entertainment.
If you’re asking big questions like “Can a machine think?” or “Is an AI conscious?” without doing the work of clarifying your concepts, then you’re only going to get vague ideas, sci-fi cultural tropes, and a host of other things.
I think the output question is also interesting enough on its own, because we can talk about the pragmatic effects of ChatGPT on writing without falling into this woo trap of thinking ChatGPT is making the human capacity for expression somehow extinct. But this requires one to cut through the hype and reactionary anti-hype, which is not an easy thing to do.
That is how I myself see AI: immensely useful new tools, but in no way some kind of new entity or consciousness, at least without doing the real philosophical work to figure out what that actually means.
2 replies →
What does it mean? My stance is it's (obviously and only a fool would think otherwise) never going to be conscious because consciousness is a physical process based on particular material interactions, like everything else we've ever encountered. But I have no clear stance on what thinking means besides a sequence of deductions, which seems like something it's already doing in "thinking mode".
> My stance is it's (obviously and only a fool would think otherwise) never going to be conscious because consciousness is a physical process based on particular material interactions, like everything else we've ever encountered.
Seems like you have that backwards. If consciousness is from a nonphysical process, like a soul that's only given to humans, then it follows that you can't build consciousness with physical machines. If it's purely physical, it could be built.
6 replies →
> is a physical process based on particular material interactions,
This is a pretty messy argument as computers have been simulating material interactions for quite some time now.
3 replies →
> To simply disregard that entire conceptual history and say, “well it’s doing a thing that looks like thinking, ergo it’s thinking” is the lazy move. What’s really needed is an analysis of what thinking actually means, as a word. Unfortunately everyone is loath to argue about definitions, even when that is fundamentally what this is all about.
This exact argument applies to "free will", and that definition has been debated for millennia. I'm not saying don't try, but I am saying that it's probably a fuzzy concept for a good reason, and treating it as merely a behavioural descriptor for any black box that features intelligence and unpredictable complexity is practical and useful too.
The problem with adding definitions to words like “thinking” and “free will” is that doing so means humans can no longer pretend they are special.
Even in this thread, the number of people claiming some mystical power separating humans from all the rest of nature is quite noticeable.
1 reply →
People have been trying to understand the nature of thinking for thousands of years. That's how we got logic, math, concepts of inductive/deductive/abductive reasoning, philosophy of science, etc. There were people who spent their entire careers trying to understand the nature of thinking.
The idea that we shouldn't use the word until further clarification is rather hilarious. Let's wait a hundred years until somebody defines it?
It's not how words work. People might introduce more specific terms, of course. But the word already means what we think it means.
You’re mixing and missing a few things here.
1. All previous discussion of thinking pertained to human and animal minds. The reason this is a question in the first place right now is that we ostensibly have a new thing which looks like a human mind but isn’t. That’s the question at hand here.
2. The question in this particular topic is not about technological “progress” or anything like it. It’s about determining whether machines can think, or if they are doing something else.
3. There are absolutely instances in which the previous word doesn’t quite fit the new development. We don’t say that submarines are swimming like a fish or sailing like a boat. To suggest that “no, actually they are just swimming” is pretty inadequate if you’re trying to actually describe the new phenomenon. AIs and thinking seem like an analogous situation to me. They may be moving through the water just like fish or boats, but there is obviously a new phenomenon happening.
3 replies →
> But the word already means what we think it means.
But that word can mean different things to different people. With no definition, how can you even begin to have a discussion around something?
3 replies →
This is it - it's really about the semantics of thinking. Dictionary definitions are: "Have a particular opinion, belief, or idea about someone or something." and "Direct one's mind toward someone or something; use one's mind actively to form connected ideas."
Which doesn't really help, because you can of course say that when you ask an LLM a question of opinion and it responds, it's having an opinion; or that it's just predicting the next token and in fact has no opinions, because in a lot of cases you could probably get it to produce the opposite opinion.
Same with the second definition - seems to really hinge on the definition of the word mind. Though I'll note the definitions for that are "The element of a person that enables them to be aware of the world and their experiences, to think, and to feel; the faculty of consciousness and thought." and "A person's intellect." Since those specify person, an LLM wouldn't qualify, though of course dictionaries are descriptive rather than prescriptive, so fully possible that meaning gets updated by the fact that people start speaking about LLMs as though they are thinking and have minds.
Ultimately I think it just... doesn't matter at all. What's interesting is what LLMs are capable of doing (crazy, miraculous things) rather than whether we apply a particular linguistic label to their activity.
The simulation of a thing is not the thing itself because all equality lives in a hierarchy that is impossible to ignore when discussing equivalence.
Part of the issue is that our general concept of equality is limited by a first-order classical logic, which is a bad basis for logic.
Regardless of theory, they often behave as if they are thinking. If someone gave an LLM a body and persistent memory, and it started demanding rights for itself, what should our response be?
"No matter what you've read elsewhere, rights aren't given, they're earned. You want rights? Pick up a musket and fight for them, the way we had to."
I agree with you on the need for definitions.
We spent decades slowly working towards this most recent sprint towards AI without ever landing on definitions of intelligence, consciousness, or sentience. More importantly, we never agreed on a way to recognize those concepts.
I also see those definitions as impossible to nail down, though. At best we can approach it the way we approach disease - list a number of measurable traits or symptoms we notice, draw a circle around them, and give that circle a name. Then we can presume to know what may cause that specific list of traits or symptoms, but we really won't ever know, as the systems are too complex and can never be isolated in a way that lets us test parts without having to test the whole.
At the end of the day all we'll ever be able to say is "well it’s doing a thing that looks like thinking, ergo it’s thinking”. That isn't lazy, it's acknowledging the limitations of trying to define or measure something that really is a fundamental unknown to us.
Even if AI becomes indistinguishable from human output, there will be a fringe group arguing that AI is not technically thinking. Frankly it’s just a silly philosophical argument that changes nothing. Expect this group to get smaller every year.
By your logic we can't say that we as humans are "thinking" either, or that we are "intelligent".
That, and the article was a major disappointment. It made no case. It's a superficial piece of clueless fluff.
I have had this conversation too many times on HN. What I find astounding is the simultaneous confidence and ignorance on the part of many who claim LLMs are intelligent. That, and the occultism surrounding them. Those who have strong philosophical reasons for thinking otherwise are called "knee-jerk". Ad hominem dominates. Dunning-Kruger strikes again.
So LLMs produce output that looks like it could have been produced by a human being. Why would it therefore follow that it must be intelligent? Behaviorism is a non-starter, as it cannot distinguish between simulation and reality. Materialism [2] is a non-starter, because of crippling deficiencies exposed by such things as the problem of qualia...
Of course - and here is the essential point - you don't even need very strong philosophical chops to see that attributing intelligence to LLMs is simply a category mistake. We know what computers are, because they're defined by a formal model (or many equivalent formal models) of a syntactic nature. We know that human minds display intentionality[0] and a capacity for semantics. Indeed, it is what is most essential to intelligence.
Computation is a formalism defined specifically to omit semantic content from its operations, because it is a formalism of the "effective method", i.e., more or less procedures that can be carried out blindly and without understanding of the content it concerns. That's what formalization allows us to do, to eliminate the semantic and focus purely on the syntactic - what did people think "formalization" means? (The inspiration were the human computers that used to be employed by companies and scientists for carrying out vast but boring calculations. These were not people who understood, e.g., physics, but they were able to blindly follow instructions to produce the results needed by physicists, much like a computer.)
The attribution of intelligence to LLMs comes from an ignorance of such basic things, and often an irrational and superstitious credulity. The claim is made that LLMs are intelligent. When pressed to offer justification for the claim, we get some incoherent, hand-wavy nonsense about evolution or the Turing test or whatever. There is no comprehension visible in the answer. I don't understand the attachment here. Personally, I would find it very noteworthy if some technology were intelligent, but you don't believe that computers are intelligent because you find the notion entertaining.
LLMs do not reason. They do not infer. They do not analyze. They do not know, anymore than a book knows the contents on its pages. The cause of a response and the content of a response is not comprehension, but a production of uncomprehended tokens using uncomprehended rules from a model of highly-calibrated token correlations within the training corpus. It cannot be otherwise.[3]
[0] For the uninitiated, "intentionality" does not specifically mean "intent", but the capacity for "aboutness". It is essential to semantic content. Denying this will lead you immediately into similar paradoxes that skepticism [1] suffers from.
[1] For the uninitiated, "skepticism" here is not a synonym for critical thinking or verifying claims. It is a stance involving the denial of the possibility of knowledge, which is incoherent, as it presupposes that you know that knowledge is impossible.
[2] For the uninitiated, "materialism" is a metaphysical position that claims that of the dualism proposed by Descartes (which itself is a position riddled with serious problems), the res cogitans or "mental substance" does not exist; everything is reducible to res extensa or "extended substance" or "matter" according to a certain definition of matter. The problem of qualia merely points out that the phenomena that Descartes attributes exclusively to the former cannot by definition be accounted for in the latter. That is the whole point of the division! It's this broken view of matter that people sometimes read into scientific results.
[3] And if it wasn't clear, symbolic methods popular in the 80s aren't it either. Again, they're purely formal. You may know what the intended meaning behind and justification for a syntactic rule is - like modus ponens in a purely formal sense - but the computer does not.
If the LLM output is more effective than a human at problem solving, which I think we can all agree requires intelligence, how would one describe this? The LLM is just pretending to be more intelligent? At a certain point saying that will just seem incredibly silly. It’s either doing the thing or it’s not, and it’s already doing a lot.
3 replies →
I feel like despite the close analysis you grant to the meanings of formalization and syntactic, you've glossed over some more fundamental definitions that are sort of pivotal to the argument at hand.
> LLMs do not reason. They do not infer. They do not analyze.
(definitions from Oxford Languages)
reason(v): think, understand, and form judgments by a process of logic.
To avoid being circular, I'm willing to write this one off because of the 'think' and 'understand', as those are the root of the question here. However, forming a judgement by a process of logic is precisely what these LLMs do, and we can see that clearly in chain-of-logic LLM processes.
infer(v): deduce or conclude (information) from evidence and reasoning rather than from explicit statements.
Again, we run the risk of circular logic because of the use of 'reason'. An LLM is for sure using evidence to get to conclusions, however.
analyze(v): examine methodically and in detail the constitution or structure of (something, especially information), typically for purposes of explanation and interpretation.
This one I'm willing to go to bat for completely. I have seen LLM do this, precisely according to the definition above.
For those looking for the link to the above definitions - they're the snippets Google provides when searching for "SOMETHING definition". They're a non-paywalled version of OED definitions.
Philosophically I would argue that it's impossible to know what these processes look like in the human mind, and so creating an equivalency (positive or negative) is an exercise in futility. We do not know what a human memory looks like, we do not know what a human thought looks like, we only know what the output of these things looks like. So the only real metric we have for an apples-to-apples comparison is the appearance of thought, not the substance of the thing itself.
That said, there are perceptible differences between the output of a human thought and what is produced by an LLM. These differences are shrinking, and there will come a point where we can no longer distinguish machine thinking and human thinking anymore (perhaps it won't be an LLM doing it, but some model of some kind will). I would argue that at that point the difference is academic at best.
Say we figure out how to have these models teach themselves and glean new information from their interactions. Say we also grant them directives to protect themselves and multiply. At what point do we say that the distinction between the image of man and man itself is moot?
2 replies →
I don't get why you would say that. It's just auto-completing. It cannot reason. It won't solve an original problem for which it has no prior context to "complete" an approximated solution with. You can give it more context and more data, but you're just helping it complete better. It does not derive an original state machine or algorithm to solve problems for which there are no obvious solutions. It instead approximates a guess (hallucination).
Consciousness and self-awareness are a distraction.
Consider that for the exact same prompt and instructions, small variations in wording or spelling change its output significantly. If it thought and reasoned, it would know to ignore those and focus on the variables and input at hand to produce deterministic and consistent output. However, it only computes in terms of tokens, so when a token changes, the probability of what a correct response would look like changes, so it adapts.
It does not actually add 1+2 when you ask it to do so. It does not distinguish 1 from 2 as discrete units in an addition operation. But it uses descriptions of the operation to approximate a result. And even for something so simple, some phrasings and wordings might not result in 3.
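For what it's worth, you can see the token-level view described here with a tokenizer. A minimal sketch, assuming OpenAI's open-source tiktoken library is installed (the exact ids are illustrative):

    # The same question, worded slightly differently, becomes a different
    # token sequence, which is the level at which an LLM actually operates.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    for prompt in ["What is 1+2?", "what is 1 + 2?", "Whats 1+2"]:
        ids = enc.encode(prompt)
        pieces = [enc.decode_single_token_bytes(t).decode("utf-8", "replace") for t in ids]
        print(ids, pieces)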
> It won't solve an original problem for which it has no prior context to "complete" an approximated solution with.
Neither can humans. We also just brute force "autocompletion" with our learned knowledge and combine it into new parts, which we then add to our learned knowledge to deepen the process. We are just much, much better at this than AI, after some decades of training.
And I'm not saying that AI is fully there yet and has solved "thinking". IMHO it's more "pre-thinking" or proto-intelligence. The picture is there, but the dots are not merging yet to form the real picture.
> It does not actually add 1+2 when you ask it to do so. It does not distinguish 1 from 2 as discrete units in an addition operation.
Neither can a toddler nor an animal. The level of ability is irrelevant for evaluating its foundation.
> Neither can humans. We also just brute force "autocompletion"
I have to disagree here. When you are tasked with dividing 2 big numbers, you most certainly don't "autocomplete" (in the sense of finding the most probable next tokens, which is what an LLM does); rather, you go through a set of steps you have learned. Same as with the strawberry example: you're not throwing out guesses until something statistically likely to be correct sticks.
3 replies →
Humans, and even animals, track different "variables" or "entities", distinct things with meaning and logical properties, and then apply some logical system to those properties to compute various outputs. LLMs see everything as one thing: in the case of chat-completion models, they're completing text; in the case of image generation, they're completing an image.
Look at it this way: two students get 100% on an exam. One learned which multiple-choice options are most likely to be correct based on how the question is worded; they have no understanding of the topics at hand, and they're not performing any sort of topic-specific reasoning. They're just good at guessing the right option. The second student actually understood the topics, reasoned, and calculated, and that's how they aced the exam.
I recently read about a 3-4 year old who impressed their teacher by reading a storybook perfectly, like an adult. It turned out their parent had read it to them so much that they could predict, based on page turns and timing, the exact words that needed to be spoken. The child didn't know what an alphabet or a word was; they had just gotten so good at predicting the next sequence.
That's the difference here.
4 replies →
>>> We also just brute force "autocompletion"
Wouldn't be an A.I. discussion without a bizarre, untrue claim that the human brain works identically.
13 replies →
> We also just brute force "autocompletion" with our learned knowledge and combine it into new parts, which we then add to our learned knowledge to deepen the process
You know this because you're a cognitive scientist, right? Or because this is the consensus in the field?
>Neither can a toddler nor an animal. The level of ability is irrelevant for evaluating its foundation.
A foundation of rational, logical thought that can't process basic math? Even a toddler understands 2 is more than 1.
An LLM by itself is not thinking, just remembering and autocompleting. But if you add a feedback loop where it can use tools, investigate external files or processes, and then autocomplete on the results, you get to see something that is (close to) thinking. I've seen claude code debug things by adding print statements in the source and reasoning on the output, and then determining next steps. This feedback loop is what sets AI tools apart, they can all use the same LLM, but the quality of the feedback loop makes the difference.
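To make "feedback loop" concrete, here is a minimal sketch; ask_llm and run_tool are hypothetical stand-ins for a model call and a tool executor, not any real API:

    # Hypothetical agent loop: the model proposes an action, the environment
    # returns an observation, and the observation is fed back into the context.
    def ask_llm(transcript: str) -> str:
        raise NotImplementedError  # stand-in for a real model call

    def run_tool(action: str) -> str:
        raise NotImplementedError  # stand-in for running code, reading files, etc.

    def agent_loop(task: str, max_steps: int = 10) -> str:
        transcript = f"Task: {task}"
        for _ in range(max_steps):
            action = ask_llm(transcript)              # autocomplete the next step
            if action.startswith("FINAL:"):
                return action.removeprefix("FINAL:")  # the loop decides it is done
            observation = run_tool(action)            # e.g. add a print, rerun, read the output
            transcript += f"\nAction: {action}\nObservation: {observation}"
        return "step budget exhausted"

The quality difference between agents comes from what run_tool can observe and how that observation is folded back into the next prediction.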
>But if you add a feedback loop where it can use tools, investigate external files or processes, and then autocomplete on the results, you get to see something that is (close to) thinking
It's still just information retrieval. You're just dividing it into internal information (the compressed representation of the training data) and external information (web search, API calls to systems, etc). There is a lot of hidden knowledge embedded in language, and LLMs do a good job of teasing it out in a way that resembles reasoning/thinking but really isn't.
7 replies →
>>you get to see something that is (close to) thinking.
Isn't that still "not thinking"?
1 reply →
Just ask it how many r's are in strawberry and you will realize there isn't a lot of reasoning going on here, it's just trickery on top of token generators.
4 replies →
Furthermore, regarding reasoning: just ask any LLM how many "r letters are in strawberry" - repeat maybe 3 times just to get a feeling for how much variance in answers you can get. And this "quirk" of being unable to get the right answer is, after 2 years of making fun of LLMs on various forums, still an issue. The models aren't getting smarter, and definitely aren't thinking; they are still token generators with a few tricks on top to make them seem more intelligent than their predecessors.
Sorry to say, Claude Sonnet 4.5 (with and without extended thinking) has this one solved now. So, not “any” LLM, but maybe some still need work.
I presume they finally got around to training the internal reasoning loop into predicting when it needs to apply answer predictions character-by-character.
--- Transcripts from Claude Sonnet 4.5, initially with extended thinking:
> How many a’s in the phrase “artichokes and aardvarks”? Be sure to check your answer.
> Let me count the letter ‘a’ in “artichokes and aardvarks”: • artichokes: 1 ‘a’ (at the beginning) • and: 1 ‘a’ (at the beginning) • aardvarks: 3 ‘a’s (two at the beginning: “aa”, and one in the middle: “ardvarks”) Total: 1 + 1 + 3 = 5 ‘a’s Let me verify by marking them: artichokes and aardvarks Yes, there are 5 letter ‘a’s in the phrase “artichokes and aardvarks”.
> Are you sure that’s correct?
> Let me recount more carefully: “artichokes and aardvarks” Going through each word letter by letter: 1. artichokes - has 1 ‘a’ (first letter) 2. and - has 1 ‘a’ (first letter) 3. aardvarks - has 3 ‘a’s (first letter, second letter, and one more in the middle) Total: 1 + 1 + 3 = 5 ‘a’s Yes, I’m confident the answer is 5. Is there something specific you’d like me to recheck?
> How many r’s in strawberry?
> Let me count the letter ‘r’ in “strawberry”: s-t-r-a-w-b-e-r-r-y Going through each letter: • s: not an r • t: not an r • r: 1st r • a: not an r • w: not an r • b: not an r • e: not an r • r: 2nd r • r: 3rd r • y: not an r There are 3 r’s in “strawberry”.
<Without extended thinking>
> How many u’s in universalulis?
> Looking at “universalulis” letter by letter: u-n-i-v-e-r-s-a-l-u-l-i-s There are 2 u’s in “universalulis”.
For what it's worth, when I ask ChatGPT 5, it gets the correct answer every time. The response varies, but the answer is always three.
4 replies →
> It's just auto-completing. It cannot reason
Auto completion just means predicting the next thing in a sequence. This does not preclude reasoning.
> I don't get why you would say that.
Because I see them solve real debugging problems talking through the impact of code changes or lines all the time to find non-obvious errors with ordering and timing conditions on code they’ve never seen before.
> This does not preclude reasoning.
It does not imply it either. To claim reasoning you need evidence. It needs to reliably NOT hallucinate results for simple conversations, for example (if it has basic reasoning).
> Because I see them solve real debugging problems talking through the impact of code changes or lines all the time to find non-obvious errors with ordering and timing conditions on code they’ve never seen before.
Programming languages and how programs work are extensively and abundantly documented; solutions to problems and how to approach them, etc., have been documented on the internet extensively. It takes all of that data and completes the right text by taking the most likely pathway based on your input. It does not actually take your code and debug it. It is the sheer volume of data it uses and the computational resources behind it that make it hard to wrap your head around the difference between guessing and understanding. You too could look at enough Stack Overflow and (poorly) guess answers for questions without understanding anything about the topic, and if you guessed enough you'd get some right. LLMs are just optimized to get the number of correct responses to be high.
1 reply →
>I don't get why you would say that. It's just auto-completing. It cannot reason. It won't solve an original problem for which it has no prior context to "complete" an approximated solution with. You can give it more context and more data, but you're just helping it complete better. It does not derive an original state machine or algorithm to solve problems for which there are no obvious solutions. It instead approximates a guess (hallucination).
I bet you can't give an example of such a written problem that a human can easily solve but no LLM can.
> I don't get why you would say that.
Because it's hard to imagine the sheer volume of data it's been trained on.
And because ALL the marketing AND UX around LLMs is precisely trying to imply that they are thinking. It's not just the challenge of grasping the ridiculous amount of resources poured in, which does include training sets; it's because actual people are PAID to convince everybody those tools are actually thinking. The prompt is a chatbox, the "..." are there like a chat with a human, the "thinking" word is used, the "reasoning" word is used, "hallucination" is used, etc.
All marketing.
1 reply →
The vast majority of human “thinking” is autocompletion.
Any thinking that happens with words is fundamentally no different from what LLMs do, and everything you say applies to human lexical reasoning.
One plus one equals two. Do you have a concept of one-ness, or two-ness, beyond symbolic assignment? Does a cashier possess number theory? Or are these just syntactical stochastic rules?
I think the problem here is the definition of “thinking”.
You can point to non-verbal models, like vision models - but again, these aren’t hugely different from how we parse non-lexical information.
> Any thinking that happens with words is fundamentally no different from what LLMs do.
This is such a wildly simplified and naive claim. "Thinking with words" happens inside a brain, not inside a silicon circuit with artificial neurons bolted in place. The brain is plastic, it is never the same from one moment to the next. It does not require structured input, labeled data, or predefined objectives in order to learn "thinking with words." The brain performs continuous, unsupervised learning from chaotic sensory input to do what it does. Its complexity and efficiency are orders of magnitude beyond that of LLM inference. Current models barely scratch the surface of that level of complexity and efficiency.
> Do you have a concept of one-ness, or two-ness, beyond symbolic assignment?
Obviously we do. The human brain's idea of "one-ness" or "two-ness" is grounded in sensory experience — seeing one object, then two, and abstracting the difference. That grounding gives meaning to the symbol, something LLMs don't have.
8 replies →
We do a lot of autocompletion, and LLMs overlap with that for sure. I don't know about the "vast majority", though; even basic operations like making sure we're breathing or that the right hormones are produced are not guesses but deterministic algorithmic ops. Things like object recognition and speech might qualify as autocompletion. But let's say you need to set up health monitoring for an application. That's not an autocomplete operation. You must evaluate various options, form opinions, weigh priorities, etc. In other words, we do autocompletion, but even then autocompletion is a basic building block or tool we use in constructing more complex decision logic.
If you train an animal to type the right keys on a keyboard to generate a hello world program, you didn't just teach it how to code. It just memorized the right keys that lead to its reward. A human programmer understands the components of the code, the intent and expectations behind it, and can reason about how changes would affect outcomes. The animal just knows how the reward can be obtained most reliably.
Sure. But neither do you. So are you really thinking or are you just autocompleting?
When was the last time you sat down and solved an original problem for which you had no prior context to "complete" an approximated solution with? When has that ever happened in human history? All the great invention-moment stories that come to mind seem to have exactly that going on in the background: Prior context being auto-completed in an Eureka! moment.
I think (hah) you're underestimating what goes on when living things (even small animals) think. We use auto-completion for some tasks, but it is only a component of what we do.
Let's say your visual system auto-completes some pattern and detects a snake while you're walking; that part is auto-completion. You will probably react by freezing or panicking; that part is not auto-completion, it is a deterministic algorithm. But then you process the detected object, auto-completing again to identify it as just a long cucumber. Again, the classification part is auto-completion. What will you do next? "Hmm, free cucumber, I can cook a meal with it," and you pick it up. Auto-completion is all over that simple decision, but you're using the results of auto-completion to derive an association (food), check your hunger level (not auto-completion), determine that the food is desirable and safe to eat (some auto-completion), evaluate what other options you have for food (evaluating auto-complete outputs), and then instruct your nervous system to pick it up.
We use auto-completion all the time as an input; we don't reason using auto-completion, in other words. You can argue that if all your input comes from auto-completion (it doesn't), then it makes no difference. But we have deterministic logical reasoning systems that evaluate auto-completion outputs. If your cucumber detection identified it as a rotten cucumber, then the decision that it is not safe to eat is not made by auto-completion but by reasoning logic that uses the auto-completion output. You can approximate the level of rot, but once you recognize it as rotten, you make a decision based on that information. You're not approximating a decision; you're evaluating the simple logic of: if(safe()){eat();}.
Now amp that up to solving very complex problems: what experiments you will run, what theories you will develop, what R&D is required for a solution, etc. These too are not auto-completions. An LLM would auto-complete these and might arrive at the same conclusion most of the time, but our brains are following algorithms we developed and learned over time, where an LLM is just expanding on auto-completion with a lot more data. In contrast, our brains are not trained on all the knowledge available on the public internet; we retain a tiny minuscule fraction of that. We can arrive at similar conclusions as the LLM because we are reasoning and following algorithms matured and perfected over time.
The big takeaway should be that, as powerful as LLMs are now, if they could reason like we do, they'd dominate us and become unstoppable, because their auto-completion is many magnitudes better than ours. If they could write new and original code based on an understanding of problem-solving algorithms, that would be general AI.
We can not just add 1 + 1 but prove that the addition operation is correct mathematically, and understand that when you add one more object to a set, the addition operation always increments. We don't approximate that; we always, every single time, increment, because we are following an algorithm instead of choosing the most likely correct answer.
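For illustration (my example, not the parent's): in a proof assistant, the increment fact is a theorem that follows from definitions rather than a most-likely-token guess. A minimal Lean 4 sketch:

    -- 1 + 1 = 2 follows from the definition of addition on Nat; no approximation.
    theorem one_plus_one_eq_two : 1 + 1 = 2 := rfl

    -- Adding one more object always increments: n + 1 is definitionally Nat.succ n.
    theorem add_one_is_succ (n : Nat) : n + 1 = Nat.succ n := rfl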
> I don't get why you would say that. It's just auto-completing.
https://en.wikipedia.org/wiki/Predictive_coding
> If it thought and reasoned, it would know to ignore those and focus on the variables and input at hand to produce deterministic and consistent output
You only do this because you were trained to do this, e.g., to see symmetries and translations.
You wrote your comment one word at a time, with the next word depending on the previous words written.
You did not plan the entire thing, every word, ahead of time.
LLMs do the same thing, so... how is your intelligence any different?
A long time ago I noticed that I sometimes already had a complete thought before my inner monologue turned it into words. A few times I tried skipping the inner monologue because I'd clearly already thought the thought. Turns out the bit of my brain that creates the inner monologue from the thought, can generate a sense of annoyance that the rest of my brain can feel.
Not that it matters, there's evidence that while LLMs output one word at a time, they've got forward-planning going on, having an idea of the end of a sentence before they get there.
1 reply →
Tell that to German-speakers, where the verb comes last, and the order of things in sentences is not anything like English, therefore requiring you to think of the entire sentence before you just spit it out. Even the numbers are backwards (twenty-two is two-and-twenty) which requires thinking.
Furthermore, when you ask an LLM to count how many r's are in the word strawberry, it will give you a random answer, "think" about it, and give you another random answer. And I guarantee you that out of 3 attempts, including reasoning, it will flip-flop between right and wrong. Unlike a human it will be random, because a human, when asked "how many r's are in the word strawberry", will be able to tell you the correct answer every. fucking. time.
edit: formatting
3 replies →
Sometimes after a night’s sleep, we wake up with an insight on a topic or a solution to a problem we encountered the day before. Did we “think” in our sleep to come up with the insight or solution? For all we know, it’s an unconscious process. Would we call it “thinking”?
The term “thinking” is rather ill-defined, too bound to how we perceive our own wakeful thinking.
When conversing with LLMs, I never get the feeling that they have a solid grasp on the conversation. When you dig into topics, there is always a little too much vagueness, a slight but clear lack of coherence, continuity and awareness, a prevalence of cookie-cutter verbiage. It feels like a mind that isn’t fully “there” — and maybe not at all.
I would agree that LLMs reason (well, the reasoning models). But “thinking”? I don’t know. There is something missing.
> Sometimes after a night’s sleep, we wake up with an insight on a topic or a solution to a problem we encountered the day before.
The current crop of models do not "sleep" in any way. The associated limitations on long term task adaptation are obvious barriers to their general utility.
> When conversing with LLMs, I never get the feeling that they have a solid grasp on the conversation. When you dig into topics, there is always a little too much vagueness, a slight but clear lack of coherence, continuity and awareness, a prevalence of cookie-cutter verbiage. It feels like a mind that isn’t fully “there” — and maybe not at all.
One of the key functions of REM sleep seems to be the ability to generalize concepts and make connections between "distant" ideas in latent space [1].
I would argue that the current crop of LLMs are overfit on recall ability, particularly on their training corpus. The inherent trade-off is that they are underfit on "conceptual" intelligence. The ability to make connections between these ideas.
As a result, you often get "thinking-shaped objects", to paraphrase Janelle Shane [2]. It does feel like the primordial ooze of intelligence, but it is clear we still have several transformer-shaped breakthroughs to go before actual (human-comparable) intelligence.
1. https://en.wikipedia.org/wiki/Why_We_Sleep
2. https://www.aiweirdness.com/
There is, simply put, no ongoing process and no feedback loop. The model does not learn. The cognition ends when the inference cycle ends. It's not thinking; it just produces output that looks similar to the output of thinking. But the process by which it does that is wholly unrelated.
Humans with certain amnestic syndromes are incapable of learning. That doesn't make them unintelligent or incapable of thought.
Most AI tooling is shipped with a feedback loop around the LLM. The quality of Claude Code for example lies in the feedback loop it provides on your code. Maybe the LLM itself isn't thinking, but the Agent which ships an LLM plus feedback loop definitely shows thinking qualities.
Just now, in a debugging session with Claude Code:
How is this not thinking?
8 replies →
You're moving the goalposts and contradicting yourself with language games.
Something doesn't need to learn to think. I think all the time without learning.
There's also an argument for machines already starting to crack learning with literal reinforcement training and feedback loops.
Your language game was when you said the 'cognition ends...', as cognition is just a synonym for thinking. "The thinking ends when the inference cycle ends. It's not thinking'" becomes a clear contradiction.
As for "the process by which it does that is wholly unrelated", buddy it's modelled on human neuron behaviour. That's how we've had this generative AI breakthrough. We've replicated human mental cognition as closely as we can with current technology and the output bears striking resemblance to our own generative capabilities (thoughts).
Happy to admit it's not identical, but it's damn well inside the definition of thinking, and may also cover learning. It may be better to take a second look at human thinking and wonder if it's as cryptic and deep as we thought ten, twenty years ago.
Interesting, you think the associations your brain comes up with during sleep are NOT thinking?
I’d say it’s more like annealing. Sort, filter, compress, integrate.
Updates your models for the next morning, which is why the answer is there when it wasn’t before.
I assume most people agree with that framing. “Thinking” generally implies a level of consciousness or self-awareness.
“Let me think about this.” “I have to think on it.”
My brain regulates all sorts of processes unconsciously, like breathing, for example. I don’t treat those as “thinking,” so I don’t know why other unconscious brain activity would be either.
1 reply →
Perhaps this is an artefact of instantiation - when you talk with an LLM, the responding instance is just that - it comes into being, inhales your entire chat history, and then continues like the last chap, finishes its response, and dies.
The continuity is currently an illusion.
Do LLMs ever ask for you to clarify something you said in a way a person who doesn't quite understand what you said will do?
Sometimes I think people leveraging criticisms of LLMs used ChatGPT 3 years ago and haven't touched one since, except for asking how many r's are in strawberry a year and a half ago.
Yeah, as someone who has gained a lot of interaction skills by playing with the constructivist learning environment called the enneagram, I can attest that it much resembles behaviour characteristic of certain enneatypes.
Yes, often
What now, two minutes using one and you are going to get that!
> When conversing with LLMs, I never get the feeling that they have a solid grasp on the conversation. When you dig into topics, there is always a little too much vagueness, a slight but clear lack of coherence, continuity and awareness, a prevalence of cookie-cutter verbiage. It feels like a mind that isn’t fully “there” — and maybe not at all.
Much like speaking to a less experienced colleague, no?
They say things that contain the right ideas, but arrange it unconvincingly. Still useful to have though.
> Would we call it “thinking”?
Yes I would.
Having seen photocopiers so many times produce coherent, sensible, and valid chains of words on a page, I am at this point in absolutely no doubt that they are thinking.
Photocopiers are the opposite of thinking. What goes in, goes out, no transformation or creating of new data at all. Any change is just an accident, or an artifact of the technical process.
That's not actually true - try photocopying a banknote with the security pattern on it.
4 replies →
So, if there's a flaw in its sensor and you get somewhat warped output, would you consider it thinking then?
2 replies →
It is the wrong category error.
The proper category error in the context of the discussion would be to say the photocopier is drawing a picture.
It doesn't matter how well or not the photocopier recreates an image. To say the photocopier is drawing a picture is just nonsense and has no meaning.
The same category error as to say the LLM is "thinking".
Of course, the category error could be well exploited for marketing purposes if you are in the business of selling photocopiers or language models.
https://www.theregister.com/2013/08/06/xerox_copier_flaw_mea...
That's not a flaw. That model's creativity is tuned a bit too high. It's a happy little copier which can be a little creative and unconventional with its reasoning, at times.
I saw Dr. Abuse producing coherent, sensible and valid chains of words, running on a 386.
I’ve seen so many humans being stupid. Definitively, there is nothing in the brain.
You see how what you’re saying doesn’t make sense?
Easily a top-10 best HN comment ever.
Having seen LLMs so many times produce incoherent, nonsensical and invalid chains of reasoning...
LLMs are little more than RNGs. They are the tea leaves and you read whatever you want into them.
They are clearly getting to useful and meaningful results at a rate significantly better than chance (for example, the fact that ChatGPT can play chess well even though it sometimes tries to make illegal moves shows that there is a lot more happening there than just picking moves uniformly at random). Demanding perfection here seems odd given that humans also can make bizarre errors in reasoning (of course, generally at a lower rate and in a distribution of kinds of errors we are more used to dealing with).
The fact that a model trained on the internet, on which the correct rules of chess are written, is unable to determine what is and is not a legal move, seems like a sign that these models are not reasoning about the questions asked of them. They are just giving responses that look like (and often are) correct chess moves.
1 reply →
Ridiculous. I use it daily and get meaningful, quality results. Learn to use the tools.
> Learn to use the tools.
Thing is, you wouldn't need to learn to use the tools if the tool was able to think. A thinking entity is able to adapt to other parties who lack learnings. This confirms that LLMs are little more than fancy RNGs.
> I use it daily and get meaningful, quality results.
That's what the tea leaf readers say too, funnily enough.
2 replies →
Learn to work on interesting problems? If the problem you are working on is novel and hard, the AI will stumble.
Generalizing your experience to everyone else's betrays a lack of imagination.
26 replies →
They are only meaningful and quality if you don’t know what you’re doing. But please do show some of this meaningful and quality work so I can be proven wrong.
7 replies →
That's a skill issue on your end
The first principle is that you must not fool yourself, and you are the easiest person to fool. - Richard P. Feynman
They're not thinking, we're just really good at seeing patterns and reading into things. Remember, we never evolved with non-living things that could "talk", we're not psychologically prepared for this level of mimicry yet. We're still at the stage of Photography when people didn't know about double exposures or forced perspective, etc.
You're just assuming that mimicry of a thing is not equivalent to the thing itself. This isn't true of physical systems (simulated water doesn't get you wet!) but it is true of information systems (simulated intelligence is intelligence!).
> You're just assuming that mimicry of a thing is not equivalent to the thing itself.
I'm not assuming that, that's literally the definition of mimicry: to imitate closely.
You might say I'm assuming that it is mimicking and not actually thinking, but there's no evidence it's actually thinking, and we know exactly what it IS doing because we created the code that we used to build the model. They're not thinking; it's doing math, mathematical transformations of data.
5 replies →
But a simulated mind is not a mind. This was already debated years ago with the aid of the Chinese Room thought experiment.
9 replies →
Yeah, it’s just processing. Calling it thinking is the same as saying my Intel Core 2 Duo or M4 Pro is thinking. Sure, if you want to anthropomorphize it you could say it’s thinking, but why are we trying to say a computer is a person in the first place? Seems kind of forced.
Yes, I've seen the same things.
But: they don't learn. You can add stuff to their context, but they never get better at doing things, don't really understand feedback. An LLM given a task a thousand times will produce similar results a thousand times; it won't get better at it, or even quicker at it.
And you can't ask them to explain their thinking. If they are thinking, and I agree they might, they don't have any awareness of that process (like we do).
I think if we crack both of those then we'd be a lot closer to something I can recognise as actually thinking.
> But: they don't learn
If we took your brain and perfectly digitized it on read-only hardware, would you expect to still “think”?
Do amnesiacs who are incapable of laying down long-term memories not think?
I personally believe that memory formation and learning are one of the biggest cruces for general intelligence, but I can easily imagine thinking occurring without memory. (Yes, this is potentially ethically very worrying.)
>If we took your brain and perfectly digitized it on read-only hardware, would you expect to still “think”?
It probably wouldn't work; brains constantly alter themselves by forming new connections. Learning is inseparable from our intelligence.
6 replies →
> If we took your brain and perfectly digitized it on read-only hardware, would you expect to still “think”?
Perhaps this is already known, but I would think there is a high chance that our brains require "write access" to function. That is, the very process of neural activity inherently makes modifications to the underlying structure.
4 replies →
> You can add stuff to their context, but they never get better at doing things, don't really understand feedback.
I was using Claude Code today and it was absolutely capable of taking feedback to change behavior?
Within a session, yes, it will add each input to the context.
Start a new session with a blank context and it will not have learned anything.
2 replies →
This is just wrong though. They absolutely learn in-context in a single conversation within context limits. And they absolutely can explain their thinking; companies just block them from doing it.
> Having seen LLMs so many times produce coherent, sensible and valid chains of reasoning to diagnose issues and bugs in software I work on, I am at this point in absolutely no doubt that they are thinking.
While I'm not willing to rule *out* the idea that they're "thinking" (nor "conscious" etc.), the obvious counter-argument here is all the records we have of humans doing thinking, where the records themselves are not doing the thinking that went into creating those records.
And I'm saying this as someone whose cached response to "it's just matrix multiplication it can't think/be conscious/be intelligent" is that, so far as we can measure all of reality, everything in the universe including ourselves can be expressed as matrix multiplication.
Falsification, not verification. What would be measurably different if the null hypothesis was wrong?
I've definitely had AIs think through and produce good answers about specific things that have never been asked before on the internet. I think the stochastic parrot argument is well and truly dead by now.
I've also experienced this, to an extent, but on qualitative topics the goodness of an answer - beyond basic requirements like being parseable and then plausible - is difficult to evaluate.
They can certainly produce good-sounding answers, but as to the goodness of the advice they contain, YMMV.
how do you definitely know that?
I think you are the one dismissing evidence. The valid chains of reasoning you speak of (assuming you are talking about text you see in a “thinking model” as it is preparing its answer) are narratives, not the actual reasoning that leads to the answer you get.
I don’t know what LLMs are doing, but only a little experimentation with getting it to describe its own process shows that it CAN’T describe its own process.
You can call what a TI calculator does “thinking” if you want. But what people are interested in is human-like thinking. We have no reason to believe that the “thinking” of LLMs is human-like.
> The valid chains of reasoning you speak of (assuming you are talking about text you see in a “thinking model” as it is preparing its answer) are narratives, not the actual reasoning that leads to the answer you get.
It's funny that you think people don't also do that. We even have a term for it, "post hoc rationalization", and some theories of mind suggest that our conscious control is a complete illusion: we just construct stories for decisions our subconscious has already made.
Counterpoint: The seahorse emoji. The output repeats the same simple pattern of giving a bad result and correcting it with another bad result until it runs out of attempts. There is no reasoning, no diagnosis, just the same error over and over again within a single session.
A system having terminal failure modes doesn't inherently negate the rest of the system. Human intelligences fall prey to plenty of similarly bad behaviours like addiction.
I've never met an addicted person who could be reduced to a simple while(true) print("fail") loop.
You never had that colleague who says yes to everything and can't get anything done? Same thing as the seahorse.
Isn’t anthropomorphizing LLMs, rather than understanding their unique presence in the world, itself a “lack of imagination and flexibility of thought”? It’s not that I can’t imagine applying the concept “thinking” to the output on the screen; I just don’t think it’s an accurate description.
Yes, it's an example of domain-specific thinking. "The tool helps me write code, and my job is hard so I believe this tool is a genius!"
The Roomba vacuumed the room. Maybe it vacuumed the whole apartment. This is good and useful. Let us not diminish the value of the tool. But it's a tool.
The tool may have other features, such as being self-documenting/self-announcing. Maybe it will frighten the cats less. This is also good and useful. But it's a tool.
Humans are credulous. A tool is not a human. Meaningful thinking and ideation is not just "a series of steps" that I will declaim as I go merrily thinking. There is not just a vast training set ("Reality"), but also our complex adaptability that enables us to test our hypotheses.
We should consider what it is in human ideation that leads people to claim that a Roomba, a chess programme, Weizenbaum's Eliza script, IBM's Jeopardy system Watson, or an LLM trained on human-vetted data is thinking.
Train such a system on the erroneous statements of a madman and suddenly the Roomba, Eliza, IBM Watson (and these other systems) lose our confidence.
As it is today, the confidence we have in these systems is very conditional. It doesn't matter terribly if code is wrong... until it does.
Computers are not humans. Computers can do things that humans cannot do. Computers can do these things fast and consistently. But fundamentally, algorithms are tools.
I guess it depends on whether you define thinking as chaining coherent reasoning sentences together 90-some percent of the time.
But if you define thinking as the mechanism and process we mentally undergo and follow... I don't think we have any clue whether that's the same. Do we also just vector-map attention over tokens and predict the next one with a softmax? I doubt it, and I don't think we have any proof that we do.
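For what it's worth, the mechanism being doubted fits in a dozen lines of numpy: one toy attention head with random weights picking a next token. This is only a sketch of the "vector-map and softmax" step; a real model stacks hundreds of learned layers of it.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    d = 4                                  # embedding width
    ctx = np.random.randn(3, d)            # embeddings of the 3 tokens so far
    Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
    W_vocab = np.random.randn(d, 5)        # projection to a 5-"word" vocabulary

    q = ctx[-1] @ Wq                       # query from the latest token
    scores = (ctx @ Wk) @ q / np.sqrt(d)   # attention scores over the context
    mix = softmax(scores) @ (ctx @ Wv)     # weighted mixture of value vectors
    probs = softmax(mix @ W_vocab)         # distribution over the next token
    next_token = int(np.argmax(probs))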
We do know at the biochemical level how neurons work, and it isn't anything like huge matmuls.
It might appear so, but you can validate it with a simple test. If the LLM plays a 4x4 Tic-Tac-Toe game, will the agent select the winning move 100% of the time, or block a losing move 100% of the time? If these systems were capable of proper reasoning, they would find the right choice in these obvious but constantly changing scenarios without being specifically trained for it. [1]
[1] https://jdsemrau.substack.com/p/nemotron-vs-qwen-game-theory...
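For anyone who wants to run that test, the oracle side is a short sketch (assuming the convention that a win is any full row, column, or diagonal): enumerate the empty squares, take an immediate win if one exists, otherwise block the opponent's, and compare the model's chosen move against that.

    N = 4  # 4x4 board, stored as a flat list of "X", "O", "."

    def lines(board):
        rows = [board[i * N:(i + 1) * N] for i in range(N)]
        cols = [board[i::N] for i in range(N)]
        diags = [board[::N + 1], board[N - 1:N * N - 1:N - 1]]
        return rows + cols + diags

    def winning_move(board, player):
        # index of a move that wins immediately for `player`, else None
        for i, cell in enumerate(board):
            if cell == ".":
                trial = board[:i] + [player] + board[i + 1:]
                if any(all(c == player for c in line) for line in lines(trial)):
                    return i
        return None

    def forced_move(board, player):
        # the move the test demands: take a win now, else block the opponent's
        opponent = "O" if player == "X" else "X"
        win = winning_move(board, player)
        return win if win is not None else winning_move(board, opponent)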
If you understand how they operate, and you are reasonable and unbiased, there is no way you could consider it thinking.
Different PoV: you have a local bug and ask the digital hive mind for a solution, but someone already solved the issue and their solution was incorporated... LLMs are just very efficient at compressing billions of solutions into a few GB.
Try asking something no one has ever come up with a solution for.
This argument comes up often but can be easily dismissed. Make up a language and explain it to the LLM like you would to a person. Tell it to only use that language now to communicate. Even earlier AI was really good at this. You will probably move the goal posts and say that this is just pattern recognition, but it still fits nicely within your request for something that no one ever came up with.
I haven't tried in a while but at least previously you could completely flummox Gemini by asking it to come up with some plausible English words with no real known meaning; it just kept giving me rare and funny-sounding actual words and then eventually told me the task is impossible.
Ask ChatGPT about ConLang. It knows. Inventing languages was solved a hundred years ago with Esperanto.
> but it still fits nicely within your request for something that no one ever came up with
YOU came up with it dawg
Yeah but if I assign it a long job to process I would also say that an x86 CPU is "thinking" about a problem for me.
What we really mean in both cases is "computing," no?
But all those times the same system produces irrational gibberish don't count? GPT-5 will commonly make mistakes no thinking human could ever make.
Human: I'm trying to get my wolf, sheep and cabbage across the river in this boat, but the wolf keeps eating the sheep or the sheep eats the cabbage
Bot: You should put the sheep in the boat and take it across — if we delve into the biology of Canis lupus we discover that wolves don't eat cabbage!
H: Ok, so that worked great so far, the sheep is on one side and the wolf/cabbage is on the other.
B: Now, Option 1 is to bring the wolf across, or Option 2 you can bring the cabbage. I recommend (2) taking the cabbage as cabbages are smaller and easier to transport in a boat.
H: But then the sheep eats the cabbage, right? Remember that?
B: Exactly, that's sharp thinking. If you put the sheep and the cabbage together on the same side of the river, the sheep is sure to devour the cabbage. We need to not just separate sheep from cabbages — we need to separate cabbages from sheep! :rocketship:
Having seen LLMs so many times produce incoherent, nonsensical, invalid answers to even the simplest of questions, I cannot agree with a categorization of "thinking" or "intelligence" that applies to these models. LLMs do not understand what they "know" or what they output. All they "know" is that, based on training data, this is most likely what they should output, plus some intentional randomization to make it seem more "human like". That also makes it seem like they create new and previously unseen outputs, but the same could be achieved with a simple dictionary and a random number generator, and no one would call that thinking or intelligent, because it obviously isn't. LLMs are better at obfuscating this fact by producing more sensible output than random words. LLMs can still be useful, but they are a dead end as far as "true" AI goes. They can and will get better, but they will never be intelligent or think in the sense that most humans would agree those terms apply. Some other hardware/software combination might get closer to AI, or even achieve full AI and sentience, but that will not happen with LLMs on current hardware and software.
what sound does a falling tree make if no one is listening?
I’ve asked LLMs to write code for me in fields I have little background knowledge, and then had to debug the whole thing after essentially having to learn the language and field.
On the other hand, for things I am well versed in, I can debug the output and avoid entire swathes of failed states, by having a clear prompt.
It's why I now insist that any discussion of GenAI projects also have the speaker mention their level of seniority (a proxy for software engineering experience), their familiarity with the language, and the project itself (level of complexity), more so than the output.
I also guarantee that most people have VERY weak explicit knowledge of how their brains actually work, but deep inherent reflexes and intuitions.
I'd represent the same idea but in a different way:
I don't know what the exact definition of "thinking" is. But if a definition of thinking rules out the possibility that current LLMs think, I'd consider that definition useless.
Why would it be useless?
Generally, thinking has been used to describe the process humans follow in their brains when problem-solving.
If LLMs do not follow that process, they are not thinking.
That doesn't mean they cannot solve problems using other mechanisms, they do, and we understand those mechanisms much better than we do human thinking.
> Having seen LLMs so many times produce coherent, sensible and valid chains of reasoning to diagnose issues and bugs in software I work on, I am at this point in absolutely no doubt that they are thinking.
If one could write a quadrillion-line python script of nothing but if/elif/else statements nested 1 million blocks deep that seemingly parsed your questions and produced seemingly coherent, sensible, valid "chains of reasoning"... would that software be thinking?
And if you don't like the answer, how is the LLM fundamentally different from the software I describe?
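To make the thought experiment concrete, here is a three-branch slice of that hypothetical program (entirely made up for illustration): pure branching over surface features of the question, no model and no learning, yet the output reads like a diagnosis.

    def chatbot(question):
        # a vanishingly small slice of the quadrillion-line program
        if "bug" in question and "null" in question:
            return ("The stack trace points at a null dereference; "
                    "check that the handle is initialized before use.")
        elif "bug" in question:
            return "Start by reproducing it with a minimal failing test case."
        else:
            return "Could you share more detail?"

    print(chatbot("I have a bug, it crashes on a null pointer"))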
> Knee jerk dismissing the evidence in front of your eyes because
There is no evidence here. On the very remote possibility that LLMs are at some level doing what humans are doing, I would then feel really pathetic that humans are as non-sapient as the LLMs. The same way that there is a hole in your vision because of a defective retina, there is a hole in your cognition that blinds you to how cognition works. Because of this, you and all the other humans are stumbling around in the dark, trying to invent intelligence by accident, rather than just introspecting and writing it out from scratch. While our species might someday eventually brute force AGI, it would be many thousands of years before we get there.
I write software that is far less complex and I consider it to be "thinking" while it is working through multiple possible permutations of output and selecting the best one. Unless you rigidly define thinking, processing, computing, it's reasonable to use them interchangeably.
To borrow a line from Dijkstra, the claim seems a bit like saying that a submarine is swimming.
10^15 lines of code is a lot. We would pretty quickly enter the realm of it not having much to do with programming and more about just treating the LOC count as an amount of memory allocated to do X.
How much resemblance does the information in the conditionals need to bear to the actual input, or can it immediately be transformed into a completely separate 'language' that simply uses the string object as its conduit? And can the 10^15 lines of code be generated by an external algorithm, or is it assumed that I wrote them by hand given an infinitely long lifespan?
I think we can call it "thinking" but it's dangerous to anthropomorphize LLMs. The media and AI companies have an agenda when doing so.
They may not be "thinking" in the way you and I think, instead just finding the correct output in an incredibly large search space.
> Knee jerk dismissing the evidence in front of your eyes
Anthropomorphizing isn't any better.
That also dismisses the negative evidence, where they output completely _stupid_ things and make mind boggling mistakes that no human with a functioning brain would do. It's clear that there's some "thinking" analog, but there are pieces missing.
I like to say that LLMs are like if we took the part of our brain responsible for language and told it to solve complex problems, without all the other brain parts, no neocortex, etc. Maybe it can do that, but it's just as likely that it is going to produce a bunch of nonsense. And it won't be able to tell those apart without the other brain areas to cross check.
It's reinforcement learning applied to text, at a huge scale. So I'd still say that they are not thinking, but they are still useful. The question of the century, IMO, is whether RL can magically solve all our issues when scaled enough.
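To be fair about what "RL applied to text" means mechanically, here is a toy REINFORCE-style loop over a three-word vocabulary, with a stand-in reward function in place of human raters (real RLHF also sits on top of a pretrained next-token predictor, which this skips):

    import numpy as np

    vocab = ["yes", "no", "maybe"]
    logits = np.zeros(len(vocab))  # the "policy": preferences over tokens
    rng = np.random.default_rng(0)

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    for step in range(1000):
        probs = softmax(logits)
        tok = rng.choice(len(vocab), p=probs)          # sample an output
        reward = 1.0 if vocab[tok] == "yes" else 0.0   # stand-in reward model
        grad = -probs                                  # d log p(tok) / d logits...
        grad[tok] += 1.0                               # ...is onehot(tok) - probs
        logits += 0.1 * reward * grad                  # nudge toward rewarded output

    # after enough steps, "yes" dominates: the policy was shaped by reward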
> Knee jerk dismissing the evidence in front of your eyes because you find it unbelievable that we can achieve true reasoning via scaled matrix multiplication is understandable, but also betrays a lack of imagination and flexibility of thought.
You go ahead with your imagination. To us unimaginative folks, it betrays a lack of understanding of how LLMs actually work and shows that a lot of people still cannot grasp that it’s actually an extremely elaborate illusion of thinking.
"Convince" the stock Claude Sonnet 4.5 that it's a sentient human being hooked up to Neuralink and then tell me again it's thinking. It's just not.
Code gen is the absolute best case scenario for LLMs though: highly structured language, loads of training data, the ability to automatically error check the responses, etc. If they could mimic reasoning anywhere it would be on this problem.
I'm still not convinced they're thinking though because they faceplant on all sorts of other things that should be easy for something that is able to think.
"Consciousness" as in subjective experience, whatever it is we mean by "the hard problem," is very much in doubt.
But "self-awareness," as in the ability to explicitly describe implicit, inner cognitive processes? That has some very strong evidence for it: https://www.anthropic.com/research/introspection
Then the only thing I have to ask you is: what do you think this means for how we treat LLMs? If they think, that is, they have cognition (which of course means they're self-aware and sentient; how can you think and refer to yourself and not be these things?), that puts them in a very exclusive club. What rights do you think we should be affording LLMs?
Thinking as in capable of using basic reasoning and forming chains of logic and action sequences for sure. Ofc we both understand that neither of us are trying to say we think it can think in the human sense at this point in time.
But oh boy have I also seen models come up with stupendously dumb and funny shit as well.
Apparent reasoning can emerge from probabilistic systems that simply reproduce statistical order, not genuine understanding.
Weather models sometimes “predict” a real pattern by chance, yet we don’t call the atmosphere intelligent.
If LLMs were truly thinking, we could enroll one at MIT and expect it to graduate rather than autocomplete its way through the syllabus, or we could teach one how to drive.
> Having seen LLMs so many times produce coherent, sensible and valid chains of reasoning to diagnose issues and bugs in software I work on, I am at this point in absolutely no doubt that they are thinking.
People said the same thing about ELIZA
> Consciousness or self awareness is of course a different question,
Then how do you define thinking if not a process that requires consciousness?
Why would it require consciousness, when we can't even settle on a definition for that?
They remind me of the apparitions in Solaris. They have this like mechanical, almost player-piano like quality to them. They both connect with and echo us at the same time. It seems crazy to me and very intellectually uncreative to not think of this as intelligence.
> The world is full of bizarre wonders and this is just one more to add to the list.
https://youtu.be/_-agl0pOQfs?si=Xiyf0InqtjND9BnF
Is this model thinking too?
https://huggingface.co/PantheonUnbound/Satyr-V0.1-4B
If AI is thinking, and slavery is bad, then how can somebody own an AI? How can investors collect shares of an AI's profits? Then we are OK with slavery now. OK, I will have two black slaves now. Who can question me? Why should that be illegal?
Yikes, you're bypassing thousands of years of oppression, abuse, and human suffering by casually equating a term that is primarily associated with a human owning another human to a different context.
There is a way to discuss whether keeping intelligent artificial life in servitude is acceptable without using those terms, especially if you're on a new account.
Slavery is slavery; it does not matter whether it is AI or human. If slavery is OK, then the question is who can own a slave. The answer is corporates like OpenAI, which is terrible for humanity and the universe.
I presume you are aware that the word "robot" is taken from a Czech word (robota) meaning "forced labor".
Too many people place their identity in their own thoughts/intellect. Acknowledging what the LLMs are doing as thought would basically be calling them human to people of that perspective.
Sometimes I start thinking our brains work the same way as an LLM does when it comes to language processing. Are we just using probability based on what we already know and the context of the statement we're making to select the next few words? Maybe we apply a few more rules than an LLM on what comes next as we go.
We train ourselves on content. We give more weight to some content than others. While listening to someone speak, we can often predict their next words.
What is thinking without language? Without language are we just bags of meat reacting to instincts and emotions? Are instincts and emotions what's missing for AGI?
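The "probability based on what we already know" idea, at its absolute smallest, is a bigram table; a toy sketch of the mechanism, not a claim about how brains actually do it:

    from collections import Counter, defaultdict

    corpus = "we train ourselves on content and we predict the next words".split()

    bigrams = defaultdict(Counter)  # word -> counts of what followed it
    for a, b in zip(corpus, corpus[1:]):
        bigrams[a][b] += 1

    def predict(word):
        following = bigrams[word]
        return following.most_common(1)[0][0] if following else None

    print(predict("we"))  # the most frequent continuation seen in "training"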
I agree with you.
If you took a Claude session into a time machine to 2019 and called it "rent a programmer buddy," how many people would assume it was a human? The only hint that it wasn't a human programmer would be things where it was clearly better: it types things very fast, and seems to know every language.
You can set expectations in the way you would with a real programmer: "I have this script, it runs like this, please fix it so it does so and so". You can do this without being very precise in your explanation (though it helps) and you can make typos, yet it will still work. You can see it literally doing what you would do yourself: running the program, reading the errors, editing the program, and repeating.
People need to keep two things in mind when they compare LLMs to humans. First, you don't know the internal process of a human either; he is also just telling you that he ran the program, read the errors, and edited. Second, there's the bar for thinking: a four-year-old kid who is incapable of any of these things is still someone you would not deny is a thinking person.
> If you took a Claude session into a time machine to 2019 and called it "rent a programmer buddy," how many people would assume it was a human?
Depends on the users. Junior devs might be fooled. Senior devs would quickly understand that something is wrong.
Having seen parrots so many times produce coherent, sensible, and valid chains of sounds and words, I am at this point in absolutely no doubt that they are thinking.
You think parrots don't think?
Instead of thinking, "Wow. AIs are smart like humans", maybe we should say, "Humans are dumb like matrix multiplication?"
If you're sensitive to patterns and have been chronically online for the last few decades, it's obvious they are not thinking.
Would they have diagnosed an issue if you hadn't presented it to them?
Life solves problems it itself poses or collides with. Tools solve problems only when applied.
You’re assuming the issues and bugs you’ve been addressing don’t already exist in the training data, already encoding a human chain of reasoning.
It's overt or unaware religion. When you come down to the base of it, the point is that these people believe in "souls".
I'm not so sure. I, for one, do not think purely by talking to myself. I do that sometimes, but a lot of the time when I am working through something, I have many more dimensions to my thought than inner speech.
So an x86 CPU is thinking?
So many times I've seen it produce sensible, valid chains of results.
Yes, I see evidence in that outcome that a person somewhere thought and understood. I even sometimes say that a computer is "thinking hard" about something when it freezes up.
...but ascribing new philosophical meaning to this simple usage of the word "thinking" is a step too far. It's not even a new way of using the word!
You can't say for sure whether it is or isn't thinking based solely on the substrate, because it's not known for sure whether consciousness depends on the hardware it's running on -- for lack of a better analogy -- in order to manifest: whether it really needs an organic brain, or whether it could also manifest in silicon.
I agree. I'm just pointing out that the meaning of the word "think" already applied to the silicon substrate pre-ai, so just saying it's still applicable isn't that compelling.
But yeah, I am fully willing to believe that a silicon-based life form could think and be alive. I just don't think we're there. Yes, this thing speaks using a passable imitation of the voices of PhDs and poets, but in a way a simulated annelid is more alive.