
Comment by libraryofbabel

6 days ago

This is the 2023 take on LLMs. It still gets repeated a lot. But it doesn’t really hold up anymore - it’s more complicated than that. Don’t let some factoid about how they are pretrained on autocomplete-like next token prediction fool you into thinking you understand what is going on in that trillion parameter neural network.

Sure, LLMs do not think like humans and they may not have human-level creativity. Sometimes they hallucinate. But they can absolutely solve new problems that aren’t in their training set, e.g. some rather difficult problems on the last Mathematical Olympiad. They don’t just regurgitate remixes of their training data. If you don’t believe this, you really need to spend more time with the latest SotA models like Opus 4.5 or Gemini 3.

Nontrivial emergent behavior is a thing. It will only get more impressive. That doesn’t make LLMs like humans (and we shouldn’t anthropomorphize them) but they are not “autocomplete on steroids” anymore either.

> Don’t let some factoid about how they are pretrained on autocomplete-like next token prediction fool you into thinking you understand what is going on in that trillion parameter neural network.

This is just an appeal to complexity, not a rebuttal to the critique of likening an LLM to a human brain.

> they are not “autocomplete on steroids” anymore either.

Yes, they are. The steroids are just even more powerful. By refining training data quality, increasing parameter size, and increasing context length we can squeeze more utility out of LLMs than ever before, but ultimately, Opus 4.5 is the same thing as GPT2, it's only that coherence lasts a few pages rather than a few sentences.
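
To make the "autocomplete" framing concrete: at inference time, generation really is a loop of next-token predictions, whatever the model size. A minimal sketch, assuming the Hugging Face transformers API, with GPT-2 only because it is small and public (greedy decoding for brevity; sampling, top-p, etc. omitted):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Any causal LM works here; generation is just repeated next-token prediction.
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The detective opened the door and saw", return_tensors="pt").input_ids
    for _ in range(20):
        logits = model(ids).logits                               # [batch, seq, vocab]
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy next token
        ids = torch.cat([ids, next_id], dim=-1)                  # append and repeat
    print(tok.decode(ids[0]))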

  • > ultimately, Opus 4.5 is the same thing as GPT2, it's only that coherence lasts a few pages rather than a few sentences.

    This tells me that you haven't really used Opus 4.5 at all.

  • First, this is completely ignoring text diffusion and nano banana.

    Second, to autocomplete the name of the killer in a detective novel that isn't in the training set requires following the plot and having at least some understanding of it.

  • This would be true if all training were based on sentence completion. But training involving RLHF and RLAIF is increasingly important, isn't it?

    • Reinforcement learning is a technique for adjusting weights, but it does not alter the architecture of the model. No matter how much RL you do, you still retain all the fundamental limitations of next-token prediction (e.g. context exhaustion, hallucinations, prompt injection vulnerability, etc.).
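
      A toy sketch of that point (names and sizes are made up; a real RLHF pipeline such as PPO is more involved, but the shape is the same): both the pretraining objective and the RL objective backpropagate into the same parameters of the same architecture, and only the training signal differs.

          import torch
          import torch.nn.functional as F

          # Stand-in "LM": an embedding plus a linear head instead of a full transformer.
          vocab, dim = 100, 32
          model = torch.nn.Sequential(torch.nn.Embedding(vocab, dim),
                                      torch.nn.Linear(dim, vocab))
          opt = torch.optim.SGD(model.parameters(), lr=1e-2)
          tokens = torch.randint(0, vocab, (1, 16))

          # Pretraining-style step: cross-entropy on the next token.
          logits = model(tokens[:, :-1])
          loss_ce = F.cross_entropy(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
          opt.zero_grad(); loss_ce.backward(); opt.step()

          # RL-style step (REINFORCE): reward-weighted log-prob of sampled tokens.
          logits = model(tokens[:, :-1])
          dist = torch.distributions.Categorical(logits=logits)
          sample = dist.sample()
          reward = 1.0                       # e.g. a score from a preference model
          loss_rl = -(reward * dist.log_prob(sample)).mean()
          opt.zero_grad(); loss_rl.backward(); opt.step()
          # Same weights, same architecture; only the objective changed.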


  • > This is just an appeal to complexity, not a rebuttal to the critique of likening an LLM to a human brain

    I wasn’t arguing that LLMs are like a human brain. Of course they aren’t. I said twice in my original post that they aren’t like humans. But “like a human brain” and “autocomplete on steroids” aren’t the only two choices here.

    As for appealing to complexity, well, let’s call it more like an appeal to humility in the face of complexity. My basic claim is this:

    1) It is a trap to reason from model architecture alone to make claims about what LLMs can and can’t do.

    2) The specific version of this in the GP that I was objecting to was: LLMs are just transformers that do next token prediction, therefore they cannot solve novel problems and just regurgitate their training data. That is a testable claim, if we agree on a reasonable definition of novel problems.

    The reason I believe this is that back in 2023 I (like many of us) used LLM architecture to argue that LLMs had all sorts of limitations around the kind of code they could write, the tasks they could do, the math problems they could solve. At the end of 2025, SotA LLMs have refuted most of these claims by being able to do the tasks I thought they’d never be able to do. That was a big surprise to a lot of us in the industry. It still surprises me every day. The facts changed, and I changed my opinion.

    So I would ask you: what kind of task do you think LLMs aren’t capable of doing, reasoning from their architecture?

    I was also going to mention RL, as I think that is the key differentiator that makes the “knowledge” in the SotA LLMs right now qualitatively different from GPT2. But other posters already made that point.

    This topic arouses strong reactions. I already had one poster (since apparently downvoted into oblivion) accuse me of “magical thinking” and “LLM-induced-psychosis”! And I thought I was just making the rather uncontroversial point that things may be more complicated than we all thought in 2023. For what it’s worth, I do believe LLMs probably have limitations (like they’re not going to lead to AGI and are never going to do mathematics like Terence Tao) and I also think we’re in a huge bubble and a lot of people are going to lose their shirts. But I think we all owe it to ourselves to take LLMs seriously as well. Saying “Opus 4.5 is the same thing as GPT2” isn’t really a pathway to do that; it’s just a convenient way to avoid grappling with the hard questions.

  • This ignores that reinforcement learning radically changes the training objective.

  • First: a selection mechanism is just a selection mechanism, and it shouldn't cloud the observation of emergent, tangential capabilities.

    You probably believe that humans have something called intelligence, but the pressure that produced it - the likelihood that specific genetic material gets replicated - is much more tangential to intelligence than next-token prediction is.

    I doubt many alien civilizations would look at us and say "not intelligent - they're just genetic information replication on steroids".

    Second: modern models also undergo a ton of post-training now (RLHF, mechanized fine-tuning on specific use cases, etc.). It's just not correct that the token-prediction loss function is "the whole thing".

    • > First: a selection mechanism is just a selection mechanism, and it shouldn't cloud the observation of emergent, tangential capabilities.

      Invoking terms like "selection mechanism" is begging the question because it implicitly likens next-token-prediction training to natural selection, but in reality the two are so fundamentally different that the analogy only has metaphorical meaning. Even at a conceptual level, gradient descent gradually homing in on a known target is comically trivial compared to the blind filter of natural selection sorting out the chaos of chemical biology. It's like comparing Legos to DNA.

      > Second: modern models also undergo a ton of post-training now (RLHF, mechanized fine-tuning on specific use cases, etc.). It's just not correct that the token-prediction loss function is "the whole thing".

      RL is still token prediction; it's just a technique for adjusting the weights to align with outputs that you can't write a loss function for in pre-training. When RL rewards good output, it's increasing the statistical strength of the model for an arbitrary purpose, but ultimately what is achieved is still a brute-force quadratic lookup for every token in the context.
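
      As for the "quadratic" part, this is the core of (single-head, causal) self-attention; the n x n score matrix is what grows quadratically with context length. Toy shapes only, not a full transformer layer:

          import math
          import torch

          n, d = 1024, 64                          # context length, head dimension
          q, k, v = (torch.randn(n, d) for _ in range(3))

          scores = q @ k.T / math.sqrt(d)          # [n, n]: a score for every token pair
          mask = torch.tril(torch.ones(n, n, dtype=torch.bool))
          scores = scores.masked_fill(~mask, float("-inf"))   # causal: no looking ahead
          out = torch.softmax(scores, dim=-1) @ v  # each position mixes earlier values
          print(scores.shape)                      # torch.Size([1024, 1024])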

I use an enterprise LLM provided by work, on a very proprietary codebase in a semi-esoteric language. My impression is that it is still a very big autocompletion machine.

You still need to hand-hold it all the way, as it is only capable of regurgitating the small number of code patterns it has seen in public code, as opposed to, say, a Python project.

  • What model is your “enterprise LLM”?

    But regardless, I don’t think anyone is claiming that LLMs can magically do things that aren’t in their training data or context window. Obviously not: they can’t learn on the job and the permanent knowledge they have is frozen in during training.

As someone who still might have a '2023 take on LLMs', even though I use them often at work, where would you recommend I look to learn more about what a '2025 LLM' is, and how they operate differently?

  • Papers on mechanistic interpretability and representation engineering, e.g. from Anthropic, would be a good start.

  • Don't bother. This bubble will pop in two years, you don't want to look back on your old comments in shame in three.

> it’s more complicated than that.

No it isn't.

> ...fool you into thinking you understand what is going on in that trillion parameter neural network.

It's just matrix multiplication and logistic regression, nothing more.
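
The "logistic regression" part is literal: the last step of every forward pass is a linear map of the hidden state followed by a softmax over the vocabulary, i.e. multinomial logistic regression over next tokens. A sketch with toy tensors (roughly GPT-2-sized, chosen only for illustration):

    import torch

    hidden_dim, vocab = 768, 50257          # roughly GPT-2 sized, for illustration
    h = torch.randn(hidden_dim)             # final hidden state at one position
    W = torch.randn(vocab, hidden_dim)      # "unembedding" / output projection
    probs = torch.softmax(W @ h, dim=-1)    # distribution over the next token
    print(probs.sum())                      # ~1.0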

  • LLMs are a general-purpose computing paradigm. LLMs are circuit builders: the converged parameters define pathways through the architecture that pick out specific programs. Or as Karpathy puts it, LLMs are a differentiable computer[1]. Training LLMs discovers programs that reproduce the input sequence well. Roughly the same architecture can generate passable images, music, or even video.

    The sequence of matrix multiplications is the high-level constraint on the space of discoverable programs. But the specific parameters discovered are what determine the specifics of information flow through the network and hence what program is defined. The complexity of the trained network is emergent, meaning the internal complexity far surpasses that of the coarse-grained description of the high-level matmul sequence. LLMs are not just matmuls and logits.

    [1] https://x.com/karpathy/status/1582807367988654081

>> Sometimes they hallucinate.

For someone speaking as if you knew everything, you appear to know very little. Every LLM completion is a "hallucination"; some of them just happen to be factually correct.