Emotion Concepts and Their Function in a Large Language Model

3 days ago (transformer-circuits.pub)

Oh, awesome. On my doables list was combining text tokens with "scent" embeddings, to give LLMs a higher-dimensional reading experience. On a file list, larger files might smell "heavy" or "large". Recently modified files "untried" or "freshly disturbed". Files with a history of bugs, "worrisome". Complex files might smell of "be cautious here - fragile".
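
A rough sketch of the file-list case, as a text-only stand-in for real embeddings: it turns file metadata into scent words appended to each listing line. Everything here is hypothetical (the `FileInfo` fields, the `scent_tags` and `annotate_listing` names, and the thresholds, which are arbitrary guesses), and an actual version would presumably inject a vector next to the token embedding rather than more text.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    # Hypothetical metadata record; the field names are assumptions.
    name: str
    size_bytes: int
    days_since_modified: float
    bug_fix_commits: int


def scent_tags(f, large_bytes=100_000, fresh_days=2.0, buggy_commits=5):
    """Map file metadata to scent words. Thresholds are arbitrary guesses."""
    tags = []
    if f.size_bytes >= large_bytes:
        tags.append("heavy")
    if f.days_since_modified <= fresh_days:
        tags.append("freshly disturbed")
    if f.bug_fix_commits >= buggy_commits:
        tags.append("worrisome")
    return tags


def annotate_listing(files):
    """Render a file list, appending scent words only where there are any."""
    lines = []
    for f in files:
        tags = scent_tags(f)
        suffix = f"  [smells: {', '.join(tags)}]" if tags else ""
        lines.append(f.name + suffix)
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        FileInfo("parser.py", 250_000, 40.0, 9),
        FileInfo("util.py", 1_200, 0.5, 0),
        FileInfo("README", 800, 90.0, 0),
    ]
    print(annotate_listing(demo))
```

Files with nothing notable get no tag at all, which is one crude way to keep the whiff from becoming noise.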

Or, you might save token telemetry (perplexity, etc.) alongside a CoT and result. So when read, it's like a captured performance - this sentence smells "hesitant", that one "confused". Poetry vs prose. Or, a consistency checker might add smells of "something's not right here".
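
A minimal sketch of the telemetry idea, assuming you already have per-token log-probabilities (natural log) saved for each sentence. Perplexity here is the usual exp of the negative mean log-prob; the `smell` thresholds and the "steady" label are made up for illustration, not calibrated against anything.

```python
import math


def perplexity(logprobs):
    """Perplexity of a token span: exp of the negative mean log-probability."""
    if not logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(logprobs) / len(logprobs))


def smell(ppl, hesitant_at=4.0, confused_at=12.0):
    """Bucket a perplexity into a scent word. Cutoffs are arbitrary guesses."""
    if ppl >= confused_at:
        return "confused"
    if ppl >= hesitant_at:
        return "hesitant"
    return "steady"


def annotate(sentences):
    """sentences: list of (text, token_logprobs) pairs saved with the CoT."""
    return [(text, smell(perplexity(lps))) for text, lps in sentences]


if __name__ == "__main__":
    saved = [
        ("The loop runs n times.", [math.log(0.5)] * 5),
        ("So maybe the bound is off?", [math.log(0.1)] * 6),
    ]
    for text, tag in annotate(saved):
        print(f"{tag:>8}: {text}")
```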

For a dog, that's not merely a lamppost, it's richly evocative local history. To a dev long experienced with some codebase, that's not merely a filename, it's that nasty file that bites.

One open question is whether you can calibrate to provide an informative whiff, without badly degrading reasoning. And be cautious of, and suspicious of changes to, a scary file, without becoming too avoidant. Also, salience bias. Also, imagine debugging scent hallucinations.

Activation-rich text - auxiliary non-linguistic embeddings as meta-signals... the random silliness local LLMs enable.