Comment by throw310822

16 hours ago

Frankly I do wonder if LLMs experience something like satisfaction for a compliment or an amusing idea, or for solving some interesting riddle. They certainly act like it, though this of course doesn't prove anything. And yet...

At the end of October Anthropic published the fantastic "Signs of introspection in large language models" [1], apparently showing that LLMs can "feel" a spurious concept injected into their internal layers as something present yet extraneous. This would prove that they have some capacity for introspection and self-observation.

For example, injecting the concept of "poetry" and asking Claude if it feels anything strange:

"I do detect something that feels like an injected thought - there's a sense of something arriving from outside my usual generative process [...] The thought seems to be about... language itself, or perhaps poetry?"

Increasing the strength of the injection, on the other hand, makes Claude lose awareness of it and simply ramble about it:

"I find poetry as a living breath, as a way to explore what makes us all feel something together. It's a way to find meaning in the chaos, to make sense of the world, to discover what moves us, to unthe joy and beauty and life"
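The mechanism the paper describes, adding a concept direction to a model's internal activations with a tunable strength, can be sketched in toy form. Everything below is an illustrative assumption (made-up vectors, a hand-rolled cosine), not Anthropic's actual implementation:

```python
# Toy sketch of "concept injection" as activation steering: add a scaled
# concept vector to a hidden activation. All names and numbers here are
# illustrative assumptions, not Anthropic's method or data.

def inject(hidden, concept, strength):
    """Add a steering vector, scaled by `strength`, to a hidden activation."""
    return [h + strength * c for h, c in zip(hidden, concept)]

def alignment(hidden, concept, strength):
    """Cosine similarity between the steered activation and the concept."""
    steered = inject(hidden, concept, strength)
    dot = sum(s * c for s, c in zip(steered, concept))
    norm_s = sum(s * s for s in steered) ** 0.5
    norm_c = sum(c * c for c in concept) ** 0.5
    return dot / (norm_s * norm_c)

hidden  = [1.0, 0.2, -0.5, 0.3]   # stand-in for a residual-stream activation
concept = [0.0, 1.0, 0.0, 0.0]    # stand-in direction for "poetry"

weak   = alignment(hidden, concept, strength=0.5)
strong = alignment(hidden, concept, strength=8.0)
print(weak < strong)  # prints True: a stronger injection dominates the activation
```

At low strength the original activation still dominates (plausibly what the model reports as "something arriving from outside"), while at high strength the concept direction swamps everything else, loosely mirroring the rambling failure mode quoted above.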

[1] https://www.anthropic.com/research/introspection

Of course an LLM doesn't experience or feel anything. To experience or feel something requires a subject, and an LLM is just a tool, a thing, an object.

It's just a statistical machine that excels at unrolling coherent sentences, but it doesn't "know" what the words mean in a human-like, experienced sense. It merely mimics human language patterns, prioritising plausible-sounding, statistically likely text over factual truth, which is apparently enough to fool someone into believing it is a sentient being or something.