Comment by hdndjsbbs
18 hours ago
taps the "don't anthropomorphize the LLM" sign
They don't have time preference because they don't have intent or reasoning. They can't be "reincarnated" because they're not sentient, they're a series of weights for probable next tokens.
No. They don't have time preference like us, because (wall clock) time doesn't exist for them. An LLM only "exists" when it is actively processing a prompt or generating tokens. After it is done, it stops existing as an "entity".
A real world second doesn't mean anything to the LLM from its own perspective. A second is only relevant to them as it pertains to us.
Time for LLMs is measured in tokens. That's what ticks their clock forward.
I suppose you could make time relevant for an LLM by making the LLM run in a loop that constantly polls for information. Or maybe you can keep feeding it input so much that it's constantly running and has to start filtering some of it out to function.
You could put timestamps in the prompt.
Can we maybe make it "don't anthropoCENTRIZE the LLMs" .
The inverse of anthropomorphism isn't any more sane, you see. By analogy: just because a drone is not an airplane, doesn't mean it can't fly!
Instead, just look at what the thing is doing.
LLMs absolutely have some form of intent (their current task) and some form of reasoning (what else is step-by-step doing?) . Call it simulated intent and simulated reasoning if you must.
Meanwhile they also have the property where if they have the ability to destroy all your data, they absolutely will find a way. (Or: "the probability of catastrophic action approaches certainty if the capability exists" but people can get tired of talking like that).
> LLMs absolutely have intent (their current task)
That's like saying a 2000cc 4-Cylinder Engine "has the intent to move backward". Even with a very generous definition of "intent", the component is not the system, and we're operating in context where the distinction matters. The LLM's intent is to supply "good" appended text.
If it had that kind of intent, we wouldn't be able to make it jump the rails so easily with prompt injection.
> and reasoning (what else is step-by-step doing?) .
Oh, that's easy: "Reasoning" models are just tweaking the document style so that characters engage in film noir-style internal monologues, latent text that is not usually acted-out towards the real human user.
Each iteration leaves more co-generated clues for the next iteration to pick up, reducing weird jumps and bolstering the illusion that the ephemeral character has a consistent "mind."
> That's like saying a 2000cc 4-Cylinder Engine "has the intent to move backward". Even with a very generous definition of "intent", the component is not the system, and we're operating in context where the distinction matters. The LLM's intent is to supply "good" appended text.
Fair, but typically you use a 2000cc engine in a car. Without the gearbox, drive train, wheels, chassis, etc attached, the engine sits there and makes noise. When used in practice, it does in fact make the car go forward and backward.
Strictly the model itself doesn't have intent, ofc. But in practice you add a context, memory system, some form of prompting requiring "make a plan", and especially <Skills> . In practice there's definitely -well- a very strong directionality to the whole thing.
> and bolstering the illusion that the ephemeral character has a consistent "mind."
And here I thought it allowed a next token predictor to cycle back to the beginning of the process, so that now you can use tokens that were previously "in the future". Compare eg. multi pass assemblers which use the same trick.
> LLMs absolutely have some form of intent (their current task)
They have momentum, not intent. They don’t think, build a plan internally, and then start creating tokens to achieve the plan. Echoing tokens is all there is. It’s like an avalanche or a pachinko machine, not an animal.
> some form of reasoning (what else is step-by-step doing?)
I think they reflect the reasoning that is baked into language, but go no deeper. “I am a <noun>” is much more likely than “I am a <gibberish>”. I think reasoning is more involved than this advanced game of mad libs.
Apologies, I tend to use web chats and agent harnesses a lot more than raw LLMs.
Strictly for raw models, most now do train on chain-of-thought, but the planning step may need to be prompted in the harness or your own prompt. Since the model is autoregressive, once it generates a thing that looks like a plan it will then proceed to follow said plan, since now the best predicted next tokens are tokens that adhere to it.
Or, in plain english, it's fairly easy to have an AI with something that is the practical functional equivalent of intent, and many real world applications now do.
3 replies →
I don't know if they have intent. I know it's fairly straightforward to build a harness to cause a sequence of outputs that can often satisfy a user's intent, but that's pretty different. The bones of that were doable with GPT-3.5 over three years ago, even: just ask the model to produce text that includes plans or suggests additional steps, vs just asking for direct answers. And you can train a model to more-directly generate output that effectively "simulates" that harness, but it's likewise hard for me to call that intent.
I think it’s helpful to try to use words that more precisely describe how the LLM works. For instance, “intent” ascribes a will to the process. Instead I’d say an LLM has an “orientation”, in that through prompting you point it in a particular direction in which it’s most likely to continue.
An agent has more components than just an LLM, the same way a human brain has more components than just Broca's area.
That is not that strong an argument as it seems, because we too might very well be "a series of weights for probable next tokens".
The main difference is the training part and that it's always-on.
If you claim something might "very well" be something you state you need some better proof. Otherwise we might also "very well" be living in the matrix.
That is a silly point. We very clearly are not "a series of weights for probable next tokens", as we can reason based on prior data points. LLMs cannot.
Unless you're using some mystical conception of "reason", nothing about being able to "reason based on prior data points" translates to "we very clearly are not a series of weights for probable next tokens".
And in fact LLMs can very well "reason based on prior data points". That's what a chat session is. It's just that this is transient for cost reasons.
People always say this kind of thing. Human minds are not Turing machines or able to be simulated by Turing machines. When you go about your day doing your tasks, do you require terajoules of energy? I believe it is pretty clear human thinking is not at all like a computer as we know them.
>People always say this kind of thing. Human minds are not Turing machines or able to be simulated by Turing machines
That's just a claim. Why so? Who said that's the case?
>When you go about your day doing your tasks, do you require terajoules of energy?
That's the definition of irrelevant. ENIAC needed 150 kW to do about 5,000 additions per second. A modern high-end GPU uses about 450 W to do around 80 trillion floating-point operations per second. That’s roughly 16 billion times the operation rate at about 1/333 the power, or around 5 trillion times better energy efficiency per operation.
Given such increase being possible, one can expect a future computer being able to run our mental tasks level of calculation, with similar or better efficiency than us.
Furthermore, "turing machine" is an abstraction. Modern CPUs/GPUs aren't turing machines either, in a pragmatic sense, they have a totally different architecture. And our brains have yet another architecture (more efficient at the kind of calculations they need).
What's important is computational expressiveness, and nothing you wrote proves that the brains architecture can't me modelled algorithmically and run in an equally efficient machine.
Even equally efficient is a red herring. If it's 1/10000 less efficient would it matter for whether the brain can be modelled or not? No, it would just speak to the effectiveness of our architecture.
We are much more than weights which output probable next tokens.
You are a fool if you think otherwise. Are we conscious beings? Who knows, but we’re more than a neural network outputting tokens.
Firstly, and most obviously, we aren’t LLMs, for Pete’s sake.
There are parts of our brains which are understood (kinda) and there are parts which aren’t. Some parts are neural networks, yes. Are all? I don’t know, but the training humans get is coupled with the pain and embarrassment of mistakes, the ability to learn while training (since we never stop training, really), and our own desires to reach our own goals for our own reasons.
I’m not spiritual in any way, and I view all living beings as biological machines, so don’t assume that I am coming from some “higher purpose” point of view.
>We are much more than weights which output probable next tokens. You are a fool if you think otherwise. Are we conscious beings? Who knows, but we’re more than a neural network outputting tokens.
That's just stating a claim though. Why is that so?
Mine is reffering to the "brain as prediction machine" establised theory. Plus on all we know for the brain's operation (neurons, connections, firings, etc).
>There are parts of our brains which are understood (kinda) and there are parts which aren’t. Some parts are neural networks, yes. Are all?
What parts aren't? Can those parts still be algorithmically described and modelled as some information exchange/processing?
>but the training humans get is coupled with the pain and embarrassment of mistakes
Those are versions of negative feedback. We can do similar things to neural networks (including human preference feedback, penalties, and low scores).
>the ability to learn while training (since we never stop training, really)
I already covered that: "The main difference is the training part and that it's always-on."
We do have NNs that are continuously training and updating weights (even in production).
For big LLMs it's impractical because of the cost, otherwise totally doable. In fact, a chat session kind of does that too, but it's transient.
They're not artificial intelligence neural networks.
They're biological neural networks. Brains are made of neurons (which Do The Thing... mysteriously, somehow. Papers are inconclusive!) , Glia Cells (which support the neurons), and also several other tissues for (obvious?) things like blood vessels, which you need to power the whole thing, and other such management hardware.
Bioneurons are a bit more powerful than what artificial intelligence folks call 'neurons' these days. They have built in computation and learning capabilities. For some of them, you need hundreds of AI neurons to simulate their function even partially. And there's still bits people don't quite get about them.
But weights and prediction? That's the next emergence level up, we're not talking about hardware there. That said, the biological mechanisms aren't fully elucidated, so I bet there's still some surprises there.
We very obviously are not just a series of weights for probable next tokens. Like seriously, you can even ask an LLM and it will tell you our brains work differently to it, and that’s not even including the possibility that we have a soul or any other spiritual substrait.
>We very obviously are not just a series of weights for probable next tokens.
How exactly? Except via handwaving? I refer to the "brain as prediction machine theory" which is the dominant one atm.
>you can even ask an LLM and it will tell you our brains work differently to it
It will just tell me platitudes based on weights of the millions of books and articles and such on its training. Kind of like what a human would tell me.
>and that’s not even including the possibility that we have a soul or any other spiritual substrait.
That's good, because I wasn't including it either.
1 reply →
Its really just a matter of degrees. There are 1 million, 1 million, 1 trillion parameter LLMs... and you keep scaling those parameters and you eventually get to humans. But it's still probable next tokens (decisions) based on previous tokens (experience).
12 replies →
Our brains work differently, yes. What evidence do you have that our brains are not functionally equivalent to a series of weights being used to predict the next token?
I'm not claiming that to be the case, merely pointing out that you don't appear to have a reasonable claim to the contrary.
> not even including the possibility that we have a soul or any other spiritual substrait.
If we're going to veer off into mysticism then the LLM discussion is also going to get a lot weirder. Perhaps we ought to stick to a materialist scientific approach?
8 replies →