
Comment by andrewmcwatters

2 months ago

No, it's the entire architecture of the model. There's no real reasoning. It seems that reasoning is just a feedback loop on top of existing autocompletion.

It's really disingenuous for the industry to call warming tokens for output, "reasoning," as if some autocomplete before more autocomplete is all we needed to solve the issue of consciousness.

Edit: Letter frequency apparently has just become another scripted output, like doing arithmetic. LLMs don't have the ability to do this sort of work inherently, so they're trained to offload the task.
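
As a rough, hypothetical sketch of what "offloading" can look like in practice (the tool name and dispatcher here are invented for illustration, not any particular vendor's API): a model trained for tool use emits a structured call, and ordinary code does the counting.

    # Sketch of the "offload" idea: rather than counting characters in token
    # space, a tool-using model can emit a call to a deterministic helper.
    # The tool name and the dispatcher below are hypothetical.
    def count_letter(word: str, letter: str) -> int:
        """Deterministic letter count that plain code does trivially."""
        return word.lower().count(letter.lower())

    # A structured tool call the model might emit instead of a direct answer:
    tool_call = {"name": "count_letter",
                 "arguments": {"word": "strawberry", "letter": "r"}}

    TOOLS = {"count_letter": count_letter}
    result = TOOLS[tool_call["name"]](**tool_call["arguments"])
    print(result)  # 3 -- computed by ordinary code, then fed back to the model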

Edit: This comment appears to be wildly upvoted and downvoted. If you have anything to add besides reactionary voting, please contribute to the discussion.

In ten years' time an LLM lawyer will lose a legal case for someone who can no longer afford a real lawyer because there are so few left. And it'll be because the layers of bodges in the model caused it to go crazy, insult the judge and threaten to burn down the courthouse.

There will be a series of analytical articles in the mainstream press, and the tech industry will write it off as a known problem with tokenisation that they can't fix because nobody really writes code anymore.

The LLM megacorp will just add a disclaimer: the software should not be used in legal actions concerning fruit companies and they disclaim all losses.

  • I glumly predict LLMs will end up a bit like asbestos: powerful in some circumstances, but over- and mis-used, hurting people in a way that will be difficult to fix later.

> Edit: Letter frequency apparently has just become another scripted output, like doing arithmetic. LLMs don't have the ability to do this sort of work inherently, so they're trained to offload the task.

Mechanistic research at the leading labs has shown that LLMs actually do math in token form up to a certain scale of difficulty.

> This is a real-time, unedited research walkthrough investigating how GPT-J (a 6 billion parameter LLM) can do addition.

https://youtu.be/OI1we2bUseI

Please define “real reasoning”. Where does the distinction come from?

  • Can we not downvote this, please? It's a good question.

    There's prior art for formal logic and knowledge representation systems dating back several decades, but transformers don't use those designs. A transformer is more like a search algorithm by comparison, not a logic one.

    That's one issue; the other is that reasoning comes from logic, and the act of reasoning is considered a qualifier of consciousness. But various definitions of consciousness require awareness, which large language models do not have.

    Their window of awareness, if you can call it that, begins and ends with processing tokens and outputting them. As if a conscious thing could be conscious for moments, then dormant again.

    That is to say, conscious reasoning comes from awareness. But in tech, severing the humanities here would allow one to suggest that one, or a thing, can reason without consciousness.

    • There is no model of consciousness or reasoning.

      The hard truth is we have no idea. None. We've got ideas and conjectures, maybes and probablys, overconfident researchers writing books while hand-waving away obvious holes, and endless self-introspective monologues.

      Don't waste your time here if you know what reasoning and consciousness are; go get your Nobel Prize.

  • Athenian wisdom suggests that fallacious thought is "unreasonable". So reason is the opposite of that.

I had a fun experience recently. I asked one of my daughters how many r's there are in strawberry. Her answer? Two ...

Of course, then you ask her to write it out and things get fixed. But strange.

  • I think that's supposed to be the idea of reasoning functionality, but in practice it just seems to let responses continue longer than they would have otherwise, by splitting the output into a warm-up pass and then using what we might consider cached tokens to assist with further contextual lookups.

    That is to say, you can obtain the same process by talking to "non-reasoning" models.

  • To be honest, if a kid asked me how many r's in strawberry, I would assume they were asking how many r's at the end and say 2.

  • I hate to break it to you but I think your child might actually have gotten swapped in the hospital with an LLM.

> There's no real reasoning. It seems that reasoning is just a feedback loop on top of existing autocompletion.

I like to say that if regular LLM "chats" are actually movie scripts being incrementally built and selectively acted out, then "reasoning" models are a stereotypical film noir twist, where the protagonist-detective narrates hidden things to himself.

> No, it's the entire architecture of the model.

Wrong, it's an artifact of tokenization. The model doesn't have access to the individual letters, only to the tokens. Reasoning models can usually do this task well - they can spell out the word in the reasoning buffer - so the fact that GPT-5 fails here is likely a result of the question being answered by a non-reasoning version of the model.
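
To make the tokenization point concrete, here's a minimal sketch using the tiktoken library; cl100k_base is used purely as an illustrative encoding (GPT-5's actual tokenizer isn't public, so treat the exact splits as an assumption).

    # Minimal sketch: what a subword tokenizer hands to the model.
    # Requires the `tiktoken` package; cl100k_base is illustrative only.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    token_ids = enc.encode("strawberry")
    pieces = [enc.decode_single_token_bytes(t) for t in token_ids]

    print(token_ids)  # a handful of integer IDs, not ten letters
    print(pieces)     # subword chunks, e.g. b'str', b'aw', b'berry' (encoding-dependent)

    # The model only ever sees the integer IDs; counting the r's requires
    # spelling knowledge that is buried inside each chunk.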

> There's no real reasoning.

This seems like a meaningless statement unless you give a clear definition of "real" reasoning as opposed to other kinds of reasoning that are only apparent.

> It seems that reasoning is just a feedback loop on top of existing autocompletion.

The word "just" is doing a lot of work here - what exactly is your criticism here? The bitter lesson of the past years is that relatively simple architectures that scale with compute work surprisingly well.

> It's really disingenuous for the industry to call warming tokens for output, "reasoning," as if some autocomplete before more autocomplete is all we needed to solve the issue of consciousness.

Reasoning and consciousness are seperate concepts. If I showed the output of an LLM 'reasoning' (you can call it something else if you like) to somebody 10 years ago they would agree without any doubt that reasoning was taking place there. You are free to provide a definition of reasoning which an LLM does not meet of course - but it is not enough to just say it is so. Using the word autocomplete is rather meaningless name-calling.

> Edit: Letter frequency apparently has just become another scripted output, like doing arithmetic. LLMs don't have the ability to do this sort of work inherently, so they're trained to offload the task.

Not sure why this is bad. The implicit assumption seems to be that an LLM is only valuable if it literally does everything perfectly?

> Edit: This comment appears to be wildly upvoted and downvoted. If you have anything to add besides reactionary voting, please contribute to the discussion.

Probably because of the wild assertions, charged language, and rather superficial descriptions of actual mechanics.

  • These aren't wild assertions. I'm not using charged language.

    > Reasoning and consciousness are seperate(sic) concepts

    No, they're not. In tech we seem to have a culture of severing the humanities for utilitarian purposes, but classical reasoning uses consciousness and awareness as elements of processing.

    It's only meaningless if you don't know what the philosophical or epistemological definitions of reasoning are. Which is to say, you don't know what reasoning is. So you'd think it was a meaningless statement.

    Do computers think, or do they compute?

    Is that a meaningless question to you? I'm sure, given your position, it's irrelevant and meaningless.

    And this sort of thinking is why we have people claiming software can think and reason.

    • > > > Reasoning and consciousness are seperate(sic) concepts

      > No, they're not. But, in tech, we seem to have a culture of severing the humanities for utilitarian purposes [...] It's only meaningless if you don't know what the philosophical or epistemological definitions of reasoning are.

      As far as I'm aware, in philosophy they'd generally be considered different concepts with no consensus on whether or not one requires the other. I don't think it can be appealed to as if it's a settled matter.

      Personally I think people put "learning", "reasoning", "memory", etc. on a bit too much of a pedestal. I'm fine with saying, for instance, that if something changes to refine its future behavior in response to its experiences (touch hot stove, get hurt, avoid in future) beyond the immediate/direct effect (withdrawing hand) then it can "learn" - even for small microorganisms.

    • You have again answered with your customary condescension. Is that really necessary? Everything you write is just dripping with patronizing superiority and combative sarcasm.

      > "classical reasoning uses consciousness and awareness as elements of processing"

      They are not the _same_ concept then.

      > It's only meaningless if you don't know what the philosophical or epistemological definitions of reasoning are. Which is to say, you don't know what reasoning is. So you'd think it was a meaningless statement.

      The problem is the only information we have is internal. So we may claim those things exist in us. But we have no way to establish if they are happening in another person, let alone in a computer.

      > Do computers think, or do they compute?

      Do humans think? How do you tell?

> It's really disingenuous for the industry to call warming tokens for output, "reasoning," as if some autocomplete before more autocomplete is all we needed to solve the issue of consciousness.

There's no obvious connection between reasoning and consciousness. It seems perfectly possible to have a model that can reason without being conscious.

Also, dismissing what these models do as "autocomplete" is extremely disingenuous. At best it implies you're completely unfamiliar with the state of the art; at worst it implies a dishonest agenda.

In terms of functional ability to reason, these models can beat a majority of humans in many scenarios.

  • Understanding is always functional: we don't study medicine before going to the doctor; we trust the expert. We do the same with almost every topic or system. How do you "understand" a company or a complex technological or biological system? Probably nobody does, end to end. We can only approximate it with abstractions and reasoning. Not even a piece of code can be understood - without executing it we can't tell whether it will halt.

  • It would require you to change the definition of reasoning, or it would require you to believe computers can think.

    A locally trained text-based foundation model is indistinguishable from autocompletion and outputs very erratic text; the further you train its ability to diminish irrelevant tokens, or guide it to produce specifically formatted output, the more you've just moved its ability to curve-fit specific requirements.

    So it may be disingenuous to you, but it does behave very much like a curve-fitting search algorithm.
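
    As a rough sketch of what "indistinguishable from autocompletion" means mechanically, here's a minimal greedy next-token loop over a small public base model via the Hugging Face transformers library (gpt2 is chosen only because it's a convenient, locally runnable checkpoint; the point is the loop, not the model).

        # Minimal sketch of a base model as autocomplete: predict the next
        # token, append it, repeat. Uses the public `gpt2` checkpoint via
        # `transformers` purely as a convenient example.
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tok = AutoTokenizer.from_pretrained("gpt2")
        model = AutoModelForCausalLM.from_pretrained("gpt2")
        model.eval()

        ids = tok("The number of r's in strawberry is", return_tensors="pt").input_ids
        with torch.no_grad():
            for _ in range(10):
                logits = model(ids).logits        # scores over the whole vocabulary
                next_id = logits[0, -1].argmax()  # greedy pick of the likeliest token
                ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)

        print(tok.decode(ids[0]))  # a fluent continuation, not a reliable count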

    • > It would require you to change the definition of reasoning

      What matters here is a functional definition of reasoning: something that can be measured. A computer can reason if it can pass the same tests of reasoning ability that humans can pass. LLMs blew past that milestone quite a while back.

      If you believe that "thinking" and "reasoning" have some sort of mystical aspect that's not captured by such tests, it's up to you to define it. But you'll quickly run into the limits of such claims: if you attribute to reasoning or thinking some non-functional properties that can't be measured, then you also can't prove that they exist. You quickly get into an intractable area of philosophy, which isn't really relevant to the question of what AI models can actually do - and that is what matters.

      > it does behave very much like a curve fitting search algorithm.

      This is just silly. I can have an hours-long coding session with an LLM in which it exhibits a strong functional understanding of the codebase it's working on, a strong grasp of the programming language and tools it's working with, and writes hundreds or thousands of lines of working code.

      Please plot the curve that it's fitting in a case like this.

      If you really want to stick to this claim, then you also have to acknowledge that what humans do is also "behave very much like a curve fitting search algorithm." If you disagree, please explain the functional difference.

    • > or it would require you to believe computers can think.

      Unless you can show us that humans can calculate functions outside the Turing computable, it is logical to conclude that computers can be made to think, due to Turing equivalence and the Church-Turing thesis.

      Given we have zero evidence to suggest we can exceed the Turing computable, to suggest we can is an extraordinary claim that requires extraordinary evidence.

      A single example of a function that exceeds the Turing computable that humans can compute, will do.

      Until you come up with that example, I'll assume computers can be made to think.