Comment by bluegatty
20 days ago
That's not how training works - adjusting model weights to memorize a single data item is not going to fly.
Model weights store abilities, not facts - generally.
Unless the fact is very widely used and widely known, with a ton of context around it.
The model can learn the day JFK died because there are millions of sparse examples of that fact scattered across the world's text, but when you're working on a problem, you might have one concern to 'memorize'.
That's going to be something different than adjusting model weights as we understand them today.
LLMs are not mammals either - it's a helpful analogy in terms of 'what a human might find useful', but not necessarily applicable in the context of actual LLM architecture.
The fact is - we don't have memory sorted out architecturally - it's either 'context or weights' and that's that.
Also, critically: humans do not remember the details of a face. Not remotely. They're able to associate it with a person and a name 'if they see it again' - but that's different from some kind of excellent recall. Ask them to describe the features in detail and they probably can't do it.
You can see in this instance that this may be related to a kind of 'soft lookup' - associating an input with other bits of information which 'rise to the fore' as possibly useful.
But overall, yes, it's fair to take the position that we'll have to 'learn from context in some way'.
Also, with regards to faces, that's kind of what I'm getting at - we don't have grid cells for faces; there seem to be discrete, functional, evolutionary structures and capabilities that combine in ways we're not consciously aware of to provide abilities. We're reflexively able to memorize faces, but bringing that to consciousness isn't automatic. There have been amnesia, lesion, and other injury studies where people with face blindness show stress, anxiety, or relief when recognizing a face, without being consciously aware of it. A doctor, or a person they didn't like, showing up caused stress spikes, but they couldn't tell you who they were or their name - and the same with family members: they get a physiological, hormonal response as if they recognized a friend or foe, but it never rises to the level of conscious recognition.
There do seem to be complex cells that allow association with a recognizable face, person, icon, object, or distinctive thing. Face cells apply equally to abstractions like logos or UI elements in an app as they do to people, famous animals, unique audio stings, etc. Split brain patients also demonstrate amazing strangeness with memory and subconscious responses.
There are all sorts of layers to human memory, beyond just short term, long term, REM, memory palaces, and so forth, and so there's no simple singular function of "memory" in biological brains, but a suite of different strategies and a pipeline that roughly slots into the fuzzy bucket words we use for them today.
It's not just faces. When recognizing objects in the environment, we normally filter out a great number of details going through the visual cortex - by the time information from our eyes hits the level of conscious awareness, it's more of a scene graph.
Table; chair behind and a little to the left of the table; plant on table
Most people won't really have conscious access to all the details that we use in recognizing objects - but that is a skill that can be consciously developed, as artists and painters do. A non-artist would be able to identify most of the details, but not all (I would be really bad compared to an actual artist with colors and spatial relationships), and I wouldn't be able to enumerate the important details in a way that makes any kind of sense for forming a recognizable scene.
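A tiny sketch of what that kind of "scene graph" representation might look like as a data structure - everything here (the class, the relation labels) is invented for illustration, but it makes the point: by the time a scene reaches this form, almost no visual detail survives.

```python
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    label: str                                      # coarse category: 'table', 'chair', ...
    relations: list = field(default_factory=list)   # (relation, other SceneNode) pairs

# The 'table; chair behind and a little to the left; plant on table' scene
# as a graph of objects and spatial relations - note what's missing:
# no colours, textures, or exact geometry, just labels and relations.
table = SceneNode("table")
chair = SceneNode("chair", relations=[("behind-left-of", table)])
plant = SceneNode("plant", relations=[("on", table)])
scene = [table, chair, plant]
```

Recognizing the scene again only requires matching this coarse graph, not recalling the discarded detail - which is roughly the asymmetry between recognition and recall described above.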
So it follows that our ability to recognize faces is not purely - or even primarily - an attribute of what we would normally call "memory", certainly not in the sense of conscious memory where we can recall details on demand. Like you alluded to re: mammals and spaces, we're really good at identifying, categorizing, and recognizing new forms of structure.
I suspect we're going to need hypernetworks of some sort - dynamically generated weights, with the hypernet weights getting the dream-like reconsolidation and mapping into the model at large, and layers or entire experts generated from the hypernets on the fly, a degree removed from the direct-from-weights inference being done now. I've been following some of the token-free latent reasoning and other discussions around CoT, other reasoning scaffolding, and so forth, and you just can't overcome the missing-puzzle-piece problem elegantly unless you have online memory. In the context of millions of concurrent users, that also becomes a nightmare. What's needed is a pipeline with a sort of intermediate memory - constructive and dynamic, to allow resolution of problems requiring integration into memorized concepts and functions, but held out for curation and stability.
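To make the hypernetwork idea concrete, here's a minimal numpy sketch - all names and sizes are invented - where a small 'hypernet' maps a context/memory embedding to the low-rank factors of a weight update, so different memories yield different effective layers without touching the frozen base weights:

```python
import numpy as np

rng = np.random.default_rng(0)

D, R = 16, 4    # hidden size and low-rank dimension of the generated update
CTX = 8         # size of the context/memory embedding fed to the hypernet

# Hypernetwork parameters: two linear maps that emit the factors of a
# low-rank weight update, conditioned on a context embedding.
H_a = rng.normal(0, 0.1, size=(CTX, D * R))
H_b = rng.normal(0, 0.1, size=(CTX, R * D))

def generate_layer(context):
    """Hypernet forward pass: context embedding -> (A, B) low-rank factors."""
    A = (context @ H_a).reshape(D, R)
    B = (context @ H_b).reshape(R, D)
    return A, B

# Frozen base weights of the target layer ('direct-from-weights' part).
W_base = rng.normal(0, 0.1, size=(D, D))

def forward(x, context):
    """Inference through base weights plus the dynamically generated update."""
    A, B = generate_layer(context)
    return x @ (W_base + A @ B)

# Two different 'memory' contexts yield two different effective layers
# for the same input - the layer is a function of the memory.
ctx1, ctx2 = rng.normal(size=CTX), rng.normal(size=CTX)
x = rng.normal(size=D)
y1, y2 = forward(x, ctx1), forward(x, ctx2)
```

The reconsolidation step described above would then amount to training the hypernet (H_a, H_b here) offline on curated memories, rather than editing W_base directly.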
It's an absolutely enormous problem, and I'm excited that it seems to be one of the primary research efforts kicking off this year. It could be a huge step change in capabilities.
Can I subscribe to your newsletter? You seem to be pretty plugged in to current research.
Yes, so I think that's a fine thought, but I don't think it fits into LLM architecture.
Also, weirdly, even LeCun et al. are barely talking about this; they're thinking about 'world models' etc.
I think what you're talking about is maybe 'the most important thing' right now, and frankly, it's almost like an issue of 'Engineering'.
Like - it's when you work very intently with the models that this 'issue' becomes much more prominent.
Your 'instinct' for this problem is probably an expression of 'very nuanced use' I'm going to guess!
So in a way, it's as much Engineering as it is theoretical?
Anyhow - so yes - but - probably not LLM weights. Probably.
I'll add a small thing: the way that Claude Code keeps the LLM 'on track' is by reminding it! Literally, it injects little 'TODO reminders' with some prompts, which is kind of ... simple!
I worked a bit with 'steering probes' ... and there's a related opportunity there - to 'inject' memory and control operations along those lines. Just as a starting point for at least one architectural motivation.
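For anyone unfamiliar with the steering idea, here's a minimal numpy sketch of the core operation - all values are invented; in practice the steering vector would come from contrasting activations on real prompts, but the injection itself is just vector addition at a chosen layer:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 32  # hidden-state width at the layer being steered

# A 'steering vector' - e.g. the difference of mean activations between
# prompts that exhibit some behaviour and prompts that don't.
steer = rng.normal(size=D)
steer /= np.linalg.norm(steer)  # normalize so alpha sets the strength

def steered(hidden, alpha=4.0):
    """Inject the steering direction into a layer's hidden state."""
    return hidden + alpha * steer

h = rng.normal(size=D)
h2 = steered(h)
# The edited state moves along the steering direction by exactly alpha,
# since steer is unit-norm.
shift = (h2 - h) @ steer
```

A 'memory injection' along these lines would mean deriving such vectors from stored facts and adding them during inference, rather than re-prompting - which is the architectural motivation hinted at above.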
Not to forget, we will need thousands of examples for the models to extract abilities - the sample efficiency of these models is quite poor.
> That's not how training works - adjusting model weights to memorize a single data item is not going to fly.
Apologies; I think I got us all kind of off-track in this comment thread by stretching the definition of the term "fine-tuning" in my ancestor comment above.
Actual fine-tuning of the base model's weights (as one would do to customize a base model into a domain-specific model) works the way you're talking about, yes. The backprop from an individual training document would be a drop in the ocean; a "memory" so weak that, unless it touched some bizarre part of the latent vector-space that no other training document has so far affected (and so is until then all-zero), it would be extremely unlikely to affect output, let alone create specific recall of the input.
And a shared, global incremental fine-tune of the model to "add memories" would be a hare-brained idea, anyway. Not even just that it wouldn't work, but that if it did work, it would be a security catastrophe, because now the model would be able to recall all this information gleaned from random tenant users' private chat transcripts, with nothing to differentiate that info from any other info to enable the model (or its inference framework) to compartmentalize it / prevent cross-tenant info leaks.
But let me rephrase what I was saying before:
> there's a way to take many transcripts of inference over a period, and convert/distil them together into an incremental-update training dataset (for memory, not for RLHF), that a model can be fine-tuned on as an offline batch process every day/week, such that a new version of the model can come out daily/weekly that hard-remembers everything you told it
As:
> for a given tenant user, there's a way to take all of their inference transcripts over a given period, and convert/distil them together into an incremental-update training dataset (for memory, not for RLHF), that a LoRA can be rebuilt (or itself fine-tuned) on. And that the work of all of these per-tenant LoRA rebuilds can occur asynchronously / "offline", on a batch-processing training cluster, gradually over the course of the day/week; such that at least once per day/week (presuming the tenant-user has any updated data to ingest), each tenant-user will get the effect of their own memory-LoRA being swapped out for a newer one.
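A toy numpy sketch of that per-tenant rebuild loop - everything here (the distillation step, the low-rank fit, the adapter store) is invented for illustration, and a real system would use a proper training stack, but it shows the shape of the pipeline: transcripts in, a small low-rank delta over frozen shared weights out, swapped atomically per tenant:

```python
import numpy as np

rng = np.random.default_rng(2)
D, R = 8, 2                                 # model width, LoRA rank
W_base = rng.normal(0, 0.3, size=(D, D))    # frozen shared model weights

def distill_transcripts(transcripts):
    """Stand-in for 'transcripts -> training pairs'. Here each transcript
    is already an (input, target) pair; a real system would summarize,
    dedupe, and reformat before this step."""
    return [(np.asarray(x), np.asarray(y)) for x, y in transcripts]

def rebuild_lora(pairs, steps=2000, lr=0.01):
    """Fit a low-rank delta (B @ A) so that x @ (W_base + B @ A) ~ y,
    leaving W_base untouched."""
    A = rng.normal(0, 0.01, size=(R, D))
    B = rng.normal(0, 0.01, size=(D, R))
    for _ in range(steps):
        for x, y in pairs:
            err = x @ (W_base + B @ A) - y      # squared-error residual
            gB = np.outer(x, err) @ A.T          # dL/dB
            gA = B.T @ np.outer(x, err)          # dL/dA
            B -= lr * gB
            A -= lr * gA
    return A, B

# Per-tenant store: the offline batch job replaces each tenant's adapter.
lora_store = {}
tenant_pairs = distill_transcripts([(rng.normal(size=D), rng.normal(size=D))])
lora_store["tenant-42"] = rebuild_lora(tenant_pairs)

# With the adapter applied, the 'memorized' pair is reproduced far more
# closely than by the base weights alone.
A, B = lora_store["tenant-42"]
x, y = tenant_pairs[0]
recall_err = np.linalg.norm(x @ (W_base + B @ A) - y)
```

The key property the sketch demonstrates is that the shared weights never change; the per-tenant "memory" lives entirely in the small (A, B) factors, which is what makes the asynchronous rebuild-and-swap cycle tractable.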
---
Note how this is essentially what Apple claimed they would be doing with Apple Intelligence, re: "personal context."
The idea (that I don't think has ever come to fruition as stated—correct me if I'm wrong?) is that Apple would:
1. have your macOS and iOS devices spend some of their idle-on-charge CPU power to extract and normalize training fulltexts from whatever would be considered the user's "documents" — notes, emails, photos, maybe random text files on disk, etc.; and shove these fulltexts into some kind of iCloud-persisted database, where the fulltexts are PKI-encrypted such that only Apple's Private Compute Cloud (PCC) can decode them;
2. have the PCC produce a new/updated memory LoRA (or rather, six of them, because they need to separately imbue each of their domain-specific model "adapter" LoRAs with your personal-context memories);
3. and, once ready, have all your iCloud-account-synced devices download the new versions of these memory-imbued adapter LoRAs.
---
And this is actually unnecessarily complex/circuitous for a cloud-hosted chat model. The ChatGPT/Claude/etc version of this architecture could be far simpler.
For a cloud-hosted chat model, you don't need a local agent to extract context from your devices; the context is just "past cloud-persisted chat transcripts." (But if you want "personal context" in the model, you could still get it, via an OpenClaw-style "personal agent"; such agents already essentially eat your files and spit them out as external memories/RAGs/etc; the only change would be spitting them out into plain-old hidden-session chat transcripts instead, so as to influence the memories of the model they're running on.)
And you don't need a special securely-oblivious cluster to process that data, since unlike "Apple looking at the data on your computer" (which would upset literally everybody), nobody has any kind of expectation that e.g. OpenAI staff can't look at your ChatGPT conversation transcripts.
And cloud-hosted chat models don't really "do" domain-specific adapters (thus the whole "GPT" thing); so you only need to train one memory-LoRA per model. (Though I suppose that might still lead to training several LoRAs per user, if you're relying on smart routing to different models within a model family to save costs.)
And you don't need to distribute the memory-LoRAs back to client devices; they can just live in an object store and get just-in-time loaded by the inference framework on a given node at the moment it begins an inference token-emission loop for a specific user. (Which might thus cause the inference cluster's routing to benefit from sticky sessions in a way it didn't before - but you don't strictly need them; the LoRAs would likely be small enough to fetch and load within the ~second of delay it takes these cloud-hosted models to allocate you a node.)
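The just-in-time loading side is mostly plumbing; here's a stdlib-only sketch (the store fetch is a fabricated placeholder, and the names are invented) of a node-local LRU cache for per-tenant adapters, which is also where sticky sessions would pay off:

```python
from collections import OrderedDict

def fetch_from_object_store(tenant_id):
    """Hypothetical object-store fetch; a real system would pull LoRA
    tensors from S3/GCS. Here it just fabricates a placeholder payload."""
    return {"tenant": tenant_id, "tensors": f"lora-weights-for-{tenant_id}"}

class LoraCache:
    """Node-local LRU cache: load a tenant's memory-LoRA just-in-time
    when an inference loop for that tenant starts on this node."""
    def __init__(self, capacity=2):
        self.capacity = capacity
        self._cache = OrderedDict()
        self.fetches = 0  # count of cache misses (object-store round-trips)

    def get(self, tenant_id):
        if tenant_id in self._cache:
            self._cache.move_to_end(tenant_id)   # hit: mark most-recently-used
        else:
            self.fetches += 1
            self._cache[tenant_id] = fetch_from_object_store(tenant_id)
            if len(self._cache) > self.capacity:
                self._cache.popitem(last=False)  # evict least-recently-used
        return self._cache[tenant_id]

cache = LoraCache(capacity=2)
# Repeated requests from the same tenant hit the cache - which is exactly
# the pattern sticky sessions would make more common on each node.
for t in ["alice", "bob", "alice", "carol", "alice"]:
    cache.get(t)
```

With this access pattern, only three of the five requests go to the object store; the two repeat "alice" requests are served from the node-local cache.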
This is a fine thought, but I'm reluctant about it. It could work, but I don't think it's obvious. It's very, very hard to know what to train for and what not to, and this still leaves the 'fact vs. skill' problem - even LoRA won't enable a model to remember your favourite lunch place.
This is kind of an existential problem with context, I think. Maybe we need new architectures.