Comment by CommieBobDole

1 day ago

The fact that an LLM is essentially immutable would be my biggest argument against consciousness or self-awareness.

It's a big file with a bunch of coordinates describing spatial relationships between tokens. When you give it a prompt, it uses those relationships to generate a string of tokens that is a statistically likely response to that prompt, then it stops. It's not changed by the experience. It doesn't remember anything. It doesn't sit around thinking on its own.

Even if the model itself were extremely complex, it's hard to imagine a definition of consciousness that includes something that doesn't remember and can't change.

68 comments


layer8 1 day ago

There are people whose brains don’t form new memories anymore after an accident or surgery, and they eternally live in the time before it happened, and have no memory of what happened a minute ago. Still they are conscious.

eberkund 1 day ago
I think it's a little more complicated than that. In a 50 First Dates type of scenario, their ability to form certain types of memories is damaged, not non-existent. And I would argue that with enough brain damage someone like an extreme lobotomy victim may stop being considered conscious.
- layer8 1 day ago
  
  I’m not familiar with 50 First Dates, I was thinking of cases like Clive Wearing [0]. I would agree that consciousness requires some sort of ultra-short-term working memory, but I also think that mechanisms similar to CoT loops can conceivably fulfill that role. Today’s generative AIs consist of more than just the static network-of-weights model.
  [0] https://en.wikipedia.org/wiki/Clive_Wearing#Amnesia
  
knollimar 17 hours ago
I was like this for a bit and you still have memories from like 30 seconds to minutes ago, but after that you have a cliff where you don't remember.
I don't think LLMs structurally even get the 30 seconds part. It's literally 0 for them.
- mft_ 16 hours ago
  
  I'd argue that the context window is analogous to short-term memory. It's functional but limited in duration, and if you overload it, it starts to fail.
  It's the long-term memory (i.e. learned experiences feeding back and directly altering the content of the core brain, or model) that is missing.
  
- layer8 16 hours ago
  
  It’s nonzero, because they carry state while performing inference, and in the surrounding processes like chain-of-thought and mixture-of-experts.
  
krupan 1 day ago
They are conscious because even for short periods of time they do form memories and those change them even if only briefly. They think on their own too. It is a very limited level of consciousness though.
- Taek 1 day ago
  
  Is that any different from an LLM having a context window?
  
- xyzsparetimexyz 15 hours ago
  
  Okay but this state is formed in text. Text isn't conscious
  
seizethecheese 1 day ago
Interesting point but even those people’s brains aren’t immutable. Their habits still change, even without memory.
- layer8 18 hours ago
  
  True, but I don’t see how that relates to consciousness. An LLM being continuously RLHF-trained also changes its habits; that alone doesn’t make it conscious.
- bulhabulha 1 day ago
  
  The starting file may be immutable, but the whole processing of that file is very dynamic and intense. Maybe, if there is some consciousness, it lies somewhere during that processing.
flashman 4 hours ago

vast oversimplification of the experience of brain damage
MagicMoonlight 8 hours ago

Someone getting in an accident that chops their leg off doesn’t mean humans don’t have legs. Come on man.
micromacrofoot 1 day ago
they still have memories, just not new ones - they have lived experiences
- layer8 18 hours ago
  
  An LLM’s training could be seen as lived experience, and the fact that LLMs can output long sequences from their training material can be interpreted as them remembering those parts.
  Also, how does that relate to consciousness? I don’t think that past episodic memory is necessary for consciousness.

stabbles 16 hours ago

A medicine for those who anthropomorphize LLMs is to run the LLMs deterministically (without randomness and memory files).

It feels very unnatural to get the same conversation verbatim at a different point in time.
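
For anyone who wants to see what "deterministically" means here, a minimal sketch follows. It is a toy, not a real LLM: `toy_logits` is a hypothetical stand-in for a frozen model's forward pass, and the vocabulary is made up. Greedy decoding (always pick the highest-scoring token) returns the same reply on every run, and sampling becomes just as repeatable once the random seed is pinned.

```python
import hashlib
import math
import random

VOCAB = ["yes", "no", "maybe", "hello", "world", "<eos>"]  # made-up vocabulary


def toy_logits(context):
    """Hypothetical stand-in for a frozen model: scores depend only on the context."""
    digest = hashlib.sha256(" ".join(context).encode("utf-8")).digest()
    # One score in [0, 4) per vocabulary entry, derived from the hash bytes.
    return [digest[i] / 64.0 for i in range(len(VOCAB))]


def softmax(scores, temperature):
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt, max_tokens=8, temperature=0.0, rng=None):
    """Greedy decoding when temperature == 0, otherwise sample from the softmax."""
    context = list(prompt)
    output = []
    for _ in range(max_tokens):
        scores = toy_logits(context)
        if temperature == 0.0:
            index = max(range(len(VOCAB)), key=lambda i: scores[i])
        else:
            probs = softmax(scores, temperature)
            index = rng.choices(range(len(VOCAB)), weights=probs, k=1)[0]
        token = VOCAB[index]
        if token == "<eos>":
            break
        context.append(token)
        output.append(token)
    return output


prompt = ["are", "you", "conscious", "?"]
greedy_a = generate(prompt)
greedy_b = generate(prompt)
seeded_a = generate(prompt, temperature=1.0, rng=random.Random(0))
seeded_b = generate(prompt, temperature=1.0, rng=random.Random(0))
print(greedy_a == greedy_b)  # same prompt, no randomness: identical replies
print(seeded_a == seeded_b)  # sampling with the same seed is also reproducible
```

With a real model the same effect usually comes from setting the temperature to zero (or fixing the seed) and not feeding in any stored memory between runs.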

Jtarii 15 hours ago
Humans are also subject to determinism, there is just no way to put a brain back to the exact same starting conditions.
- Jensson 11 hours ago
  
  There is, you talk to an Alzheimer's patient and it's like that, and it doesn't feel like talking to a human any more. An Alzheimer's patient isn't cured by adding some input noise to stop them from repeating conversations, they are still unable to learn, just like an LLM.
kranke155 9 hours ago

this just means they are incomplete, like a baby that has no long term memory. I think the baby analogy will hold up as we build more and more capability.
twobitshifter 12 hours ago

Or it feels just like talking to my grandpa.

in-silico 1 day ago

> It's not changed by the experience

The entire file is not changed, but the KV cache is.

> It doesn't remember anything

The model definitely remembers previous exchanges within the same conversation.
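
A rough sketch of the mechanics in question, assuming a single attention head with made-up weights and embeddings (none of these names come from a real library). It shows three things at once: the weights are never written to, the KV cache grows by one entry per token, and attending over the cache gives the same result as rebuilding every key and value from the raw token list.

```python
import math
import random

DIM = 4
_rng = random.Random(42)


def _matrix():
    return [[_rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(DIM)]


# Frozen "weights": created once and never written to again.
W_Q, W_K, W_V = _matrix(), _matrix(), _matrix()


def embed(token_id):
    """Made-up deterministic embedding for an integer token id."""
    return [math.sin(token_id * (i + 1)) for i in range(DIM)]


def project(weights, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def attend(query, keys, values):
    """Scaled dot-product attention of one query over all stored keys/values."""
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(DIM) for key in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [
        sum((e / total) * value[i] for e, value in zip(exps, values))
        for i in range(DIM)
    ]


class KVCache:
    """Per-conversation state: grows by one key/value pair per token."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, token_id):
        x = embed(token_id)
        self.keys.append(project(W_K, x))
        self.values.append(project(W_V, x))
        return attend(project(W_Q, x), self.keys, self.values)


def recompute(token_ids):
    """No cache: rebuild every key and value from the full token list."""
    xs = [embed(t) for t in token_ids]
    keys = [project(W_K, x) for x in xs]
    values = [project(W_V, x) for x in xs]
    return attend(project(W_Q, xs[-1]), keys, values)


tokens = [3, 1, 4, 1, 5]
weights_before = [row[:] for row in W_Q + W_K + W_V]
cache = KVCache()
for t in tokens:
    cached_out = cache.step(t)
fresh_out = recompute(tokens)

print(len(cache.keys))  # one cached entry per token seen
print(all(math.isclose(a, b) for a, b in zip(cached_out, fresh_out)))
print(weights_before == [row[:] for row in W_Q + W_K + W_V])  # weights untouched
```

Whether that growing cache counts as "remembering" or as a faster way of re-reading the context is the part the sketch cannot settle.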

rmunn 1 day ago
> The model definitely remembers previous exchanges within the same conversation.
No it doesn't. They get added to its context, and it reads them afresh when answering the next question. That's not remembering.
If your short-term memory completely malfunctioned one day, so you had no ability to remember what was said to you a minute ago, then you would have to find workarounds. For example, you could write down everything someone says to you, then read your notes of the previous exchanges in that conversation in order to continue the conversation. That would be a good way to work around the fact that your short-term memory was broken. And if your notes were invisible to other people and you could read them really fast, then you could even make most people believe that you remembered what they said a minute ago. But you don't actually have a working memory, you're just writing down what they said and re-reading it while coming up with your next response.
That's exactly what LLMs do. That's not memory.
- ACCount37 9 hours ago
  
  Continuous learning allows past behavior and past inputs to influence future inputs and future behavior. In humans.
  Attention over KV cache allows past behavior and past inputs to influence future inputs and future behavior. In LLMs.
  Until the cache runs out, that is. But even then, you could totally use any of 9000 methods of cache compression, truncation, dropping or streaming and get away with it.
  The difference between continuous learning and in-context learning seems to be in capacity, not in principle. Both are doing a similar thing, but one has more length and depth to it.
  
- in-silico 1 day ago
  
  This is really semantics, but I wouldn't call attending to the KV cache re-reading the context.
  The model takes in the context, encodes it into a "memory" (the KV cache), and accesses that memory later. That fact doesn't change just because the KV cache grows in size with the context.
  I don't know what memory would look like other than an encode-retrieve loop.
  Relevant: Transformers are Multi-State RNNs - https://arxiv.org/abs/2401.06104
fipar 1 day ago
Not the model though. The model really only takes input text and produces output text. Memory within a conversation is achieved by the harness adding the conversation (or parts of it) to the input text. The LLM itself has no memory, it’s the augmented system of several orchestrated LLM calls that does.
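
A minimal sketch of that split, assuming a toy stand-in for the model (`toy_model` and `ChatHarness` are hypothetical names, not any provider's API). The model function is stateless and sees only the text it is handed; the harness keeps the transcript and re-sends it on every call.

```python
def toy_model(prompt_text):
    """Hypothetical stateless model call: the reply depends only on the input text."""
    user_lines = [
        line[len("User: "):]
        for line in prompt_text.splitlines()
        if line.startswith("User: ")
    ]
    name = None
    for line in user_lines:
        if line.lower().startswith("my name is "):
            name = line[len("my name is "):].strip(" .")
    if user_lines and "what is my name" in user_lines[-1].lower():
        return f"Your name is {name}." if name else "I don't know your name."
    return "Noted."


class ChatHarness:
    """Holds the conversation and prepends it to every model call."""

    def __init__(self, model):
        self.model = model
        self.transcript = []

    def send(self, user_message):
        self.transcript.append(f"User: {user_message}")
        prompt = "\n".join(self.transcript)  # the whole history goes in as input
        reply = self.model(prompt)
        self.transcript.append(f"Assistant: {reply}")
        return reply


harness = ChatHarness(toy_model)
harness.send("My name is Ada.")
with_history = harness.send("What is my name?")
without_history = toy_model("User: What is my name?")  # same model, no transcript

print(with_history)
print(without_history)
```

The same function answers differently only because the harness changed its input, which is the sense in which the memory belongs to the surrounding system.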
- nomel 8 hours ago
  
  Wait until you hear about the hippocampus!!! [1]
  I don't think physical integration within one container is relevant to system-level behavior.
  [1] https://en.wikipedia.org/wiki/Neuroanatomy_of_memory
  
CommieBobDole 1 day ago

Right, but that's still external to the LLM, it's just a KV cache that's stored on the provider side for performance reasons, so that the client doesn't have to re-send the whole chat history with every subsequent call in the conversation.
It still generates every response using the model's pristine state with every new API call; whether the context is provided from the client or from a colocated cache server doesn't really change that.
nprateem 1 day ago
> The model definitely remembers previous exchanges within the same conversation.
Christ HN isn't what it used to be
- in-silico 1 day ago
  
  Care to elaborate?

gringoDan 1 day ago

You might be interested in Erik Hoel's more formal version of this argument: https://www.theintrinsicperspective.com/p/proving-literally-...

supertroop 1 day ago

Reinforcement learning changes the model. So it can and does change and remember based on experience. Eventually reinforcement learning can happen in real time.

george_max 1 day ago
But is the model aware of the training? Unless you hook the model up to an MCP server, or something similar, and have it analyze the RL changes, it will not know if it has changed or not. Even if it is real-time RL, it is not aware of the previous state.
- supertroop 20 hours ago
  
  Why not? Why can’t part of its previous state be part of the training?
- twobitshifter 12 hours ago
  
  Are you aware of each of your dreams from last week or last night even?

mortsnort 13 hours ago

One can fine-tune the models and we do get a new version of Opus every few months.

aaroninsf 5 hours ago

That definition is in fact the predominant one today in serious circles: consciousness proper is not itself inclusive of the things which we consider to define a continuous coherent self.

I.e. the "self" is not the same as what it means to experience consciousness.

There are for example well characterized examples of memory disruption under the influence of various drugs (e.g. as used intentionally in anesthesia); and neurological conditions which produce various kinds of amnesia.

Do these conditions mean someone is not conscious? We have the luxury of asking people directly.

More unsettling edges yet include things like so-called "split brain" patients or people suffering from serious psychological conditions like so-called "multiple personalities." Psychology does get great mileage out of pathology!

yesitcan 1 day ago

But you could argue the brain is just a bunch of coordinates describing spatial relationships between tokens too.

- average Hacker News response