
Comment by empath75

14 days ago

I think keeping chats in memory is also contributing to the problem. This doesn't happen when it's a tabula rasa every conversation. You give it a name, it remembers the name now. Before, if you gave it a name, it wouldn't remember its supposed identity the next time you talked to it. That rather breaks the illusion.

It's still tabula rasa -- you're just initializing the context slightly differently every time. The problem is the constant anthropomorphization of these models, the insistence they're "minds" even though they aren't minds nor particularly mind-like, the suggestion that their failure modes are similar to those of humans even though they're wildly different.

  • The main problem is ignorance of the technology. 99.99% of people out there simply have no clue how this tech works, but once someone sits down with them and shows them in an easy-to-digest manner, the magic goes away. I did just that with a friend's girlfriend. She was really enamored with ChatGPT, talking to it as a friend, really believing this thing was conscious, all that jazz... I streamed her my local LLM setup and showed her what goes on under the hood (roughly the kind of thing sketched after this list): how the model responds to context, what happens when you change the system prompt, the importance of said context. Within about seven minutes all the magic was gone, as she fully understood what these systems really are.

  • The more reliably predictive mental model is to take about two-thirds of a human brain's left hemisphere, wire it to simulated cranial nerves, and then electrically stimulate Broca's and Wernicke's areas in various patterns ("prompts"), either to observe the speech produced when a novel pattern is tried, or to use known patterns to cause such production for some other end.

    It is a somewhat gruesome and alienating model in concept, and this is intentional, in that that aspect helps highlight the unfamiliarity and opacity of the manner in which the machine operates. It should seem a little like something off of Dr. Frankenstein's sideboard, perhaps, for now and for a while yet.

  • This is the basis of the whole hype. 'Conversational capability', my ass.

    All of this LLM marketing effort is focused on swindling sanity out of people with claims that LLMs 'think' and the like.
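
To make the "under the hood" demo above concrete, here is a minimal Python sketch of what a chat turn actually is, assuming a local OpenAI-compatible server (llama.cpp / Ollama style); the URL and model name are placeholders, not anyone's actual setup. The model is stateless: its "personality" and its "memory" of your name are nothing but text you choose to resend each turn.

    # Each call sends the full text the model will see: system prompt + history + new message.
    # Assumes a hypothetical local OpenAI-compatible endpoint; adjust URL/model to taste.
    import requests

    LOCAL_URL = "http://localhost:8080/v1/chat/completions"  # placeholder local server
    MODEL = "local-model"                                     # placeholder model name

    def chat(system_prompt: str, history: list[dict], user_msg: str) -> str:
        messages = (
            [{"role": "system", "content": system_prompt}]
            + history
            + [{"role": "user", "content": user_msg}]
        )
        resp = requests.post(LOCAL_URL, json={"model": MODEL, "messages": messages})
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    # Same question, different system prompts: the "identity" lives in the prompt text.
    print(chat("You are a terse assistant.", [], "What's your name?"))
    print(chat("You are 'Dave', a cheerful pirate.", [], "What's your name?"))

    # "Remembering" a name is just the name being present in the history we resend.
    history = [
        {"role": "user", "content": "Call yourself Dave from now on."},
        {"role": "assistant", "content": "Sure, I'm Dave."},
    ]
    print(chat("You are a terse assistant.", history, "What's your name?"))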

There are different needs in tension, I guess: customers want it to remember names and little details about them to avoid retyping context, but that context poisons the model over time.

I wonder if you could explicitly save some details to be added into the prompt instead?

  • I've seen approaches like this involving "memory" by various means, with its contents compactly injected into context per prompt, rather than trying to maintain an entire context long term (a rough sketch of the general idea follows after this list). One recent example that made the HN frontpage, with the "memory" feature based iirc on a SQLite database which the model may or may not be allowed to update directly: https://news.ycombinator.com/item?id=43681287

  • Those become "options," and you can do that now. You can say things like: give me brief output, preferring concise answers, and no emoji. Then, if you prompt it to tell you your set options, it will list back those settings.

    You could probably add one like: "Begin each prompt response with _______" and it would probably respect that option.

  • I wonder if it would be helpful to be able to optionally view the full injected context so you could see what it is being prompted with behind the scenes. I think a little bit of the "man behind the curtain" would be quite deflating.

  • Tinfoil's chat lets you do that: add a bit of context to every new chat. It's fully private, to boot. It's the service I use; they host open-source models like DeepSeek, Llama, and Mistral.

    https://tinfoil.sh/
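
A rough sketch of the "memory"/"options" idea discussed in this thread, assuming the simplest possible design: save a few short facts in SQLite and inject them compactly into the system prompt on every request, instead of keeping a whole conversation alive long term. This is not the design of the linked project or of Tinfoil; the table, function names, and prompt wording are made up for illustration. The show_injected flag is a crude version of the "view the full injected context" suggestion above.

    # Persist a handful of user-approved details and rebuild the prompt from them each turn.
    import sqlite3

    DB_PATH = "memory.db"  # hypothetical local store

    def init_store(path: str = DB_PATH) -> sqlite3.Connection:
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT)")
        return conn

    def remember(conn: sqlite3.Connection, key: str, value: str) -> None:
        # A detail the user explicitly asked to have carried across chats.
        conn.execute("INSERT OR REPLACE INTO memory (key, value) VALUES (?, ?)", (key, value))
        conn.commit()

    def build_request(conn: sqlite3.Connection, user_msg: str, show_injected: bool = False) -> dict:
        # Assemble one stateless request: saved details go into the system prompt.
        rows = conn.execute("SELECT key, value FROM memory ORDER BY key").fetchall()
        memory_block = "\n".join(f"- {k}: {v}" for k, v in rows)
        system_prompt = (
            "You are a helpful assistant.\n"
            "Details the user explicitly saved:\n" + memory_block
        )
        if show_injected:
            # The "man behind the curtain" view: exactly what gets sent behind the scenes.
            print(system_prompt)
        return {"messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_msg},
        ]}

    conn = init_store()
    remember(conn, "name", "Alex")
    remember(conn, "style", "brief answers, no emoji")
    payload = build_request(conn, "Any tips for a first 10k?", show_injected=True)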

But you wouldn't conclude that someone with anterograde amnesia is not conscious.

  • I wouldn't necessarily conclude that they were conscious, either, and this quite specifically includes me, on those occasions in surgical recovery suites when I've begun to converse before I'd begun to track. Consciousness and speech production are no more necessarily linked than consciousness and muscle tone, and while the version of 'me' carrying on those conversations I don't remember would no doubt have claimed to be conscious at the time, I'm not sure how much that actually signifies.

    After all, if they didn't swaddle me in a sheet on my way under, my body might've decided it was tired of all this privation - NPO after midnight for a procedure at 11am, I was suffering - and started trying to take a poke at somebody and get itself up off the operating table. In such a case, would I be to blame? Stage 2 of general anesthesia begins with loss of consciousness, and involves "excitement, delirium, increased muscle tone, [and] involuntary movement of the extremities." [1] Which tracks with my experience; after all, the last thing I remember was mumbling "oh, here we go" into the oxygen mask, as the propofol took effect so they could intubate me for the procedure proper.

    Whose fault then would it be if, thirty seconds later, the body "I" habitually inhabit, and of which "I" am an epiphenomenon, punched my doctor in the nuts?

    [1] https://quizlet.com/148829890/the-four-stages-of-general-ane...