Comment by rush86999
18 days ago
Basically, the conclusion is that LLMs don't have world models. For work that's mostly done on a screen, you can build world models. It's harder for other contexts, for example visual context.
For a screen (coding, writing emails, updating docs) -> you can create world models with episodic memories that can be used as background context before making a new move (action). Many professions rely partially on email or phone (voice), so LLMs can be trained for world models in these contexts. Just not every context.
The key is giving episodic memory to agents with visual context about the screen and conversation context. Multiple episodes of similar context can be used to make the next move. That's what I'm building on.
That's missing a big chunk of the post: it's not just about visible / invisible information, but also the game theory dynamics of a specific problem and the information within it. (Adversarial or not? Perfect information or asymmetrical?)
All the additional information in the world isn't going to help an LLM-based AI conceal its poker-betting strategy, because it fundamentally has no concept of its adversarial opponent's mind beyond past echoes written in word form.
It's the cliched allegory of the cave, but LLM vs. world model is about switching from training artificial intelligence on shadows to training it on the objects casting the shadows.
Sure, you have more data on shadows in trainable form, but it's an open question whether you can reliably materialize a useful concept of the object from enough shadows. (Likely yes for some problems, no for others.)
I do understand what you're saying, but that's hard to reconcile with real-world context. In the real world, each person not only plays politics but also, to a degree, follows their own internal world model for self-reflection, built from experience. It's highly specific and constrained to the context each person experiences.
You're missing the game theory forest (problems that are unlike in type) for the trees.
There are fundamentally different types of games that map to real world problems. See: https://en.wikipedia.org/wiki/Game_theory#Different_types_of...
The hypothesis from the post, to boil it down, is that LLMs are successful in some of these but architecturally ill-suited for others.
It's not about what or how an LLM knows, but the epistemological basis for its intelligence and what set of problems that can cover.
Game theory, at the end of the day, is also a set of teaching points that an expert can add to an LLM. You're cloning the expert's decision process by showing past decisions taken in a similar context. This is very specific but still has value in a business context.
That was the crux of the post for me: the assertion that there are classes of problems for which no amount of expert behavior cloning will result in dynamic expert decision making, because the cloning never trains a viable approach to deciding like an expert.
It is an awfully weak signal to pick up in data.
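A toy illustration of the adversarial case, sketched in Python under loud assumptions: the game is rock-paper-scissors instead of poker, the `expert_log` is made-up data, and "cloning" here just means copying the expert's logged move frequencies. It is not a claim about how any real LLM is trained. It only shows that a fixed, cloned distribution can be exploited by an opponent who models it, while the game's equilibrium mix cannot.

```python
from collections import Counter

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value


def payoff(mine: str, theirs: str) -> int:
    # +1 for a win, -1 for a loss, 0 for a tie, from "mine"'s point of view.
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def cloned_policy(expert_log: list[str]) -> dict[str, float]:
    # Behavior cloning in its simplest form: copy the logged move frequencies.
    counts = Counter(expert_log)
    total = len(expert_log)
    return {m: counts[m] / total for m in MOVES}


def best_response_value(policy: dict[str, float]) -> float:
    # Expected winnings per round for an adversary who knows the policy
    # and picks the single best reply to it.
    return max(
        sum(p * payoff(reply, move) for move, p in policy.items())
        for reply in MOVES
    )


# Hypothetical log: the expert leaned on rock because past opponents overplayed scissors.
expert_log = ["rock"] * 6 + ["paper"] * 2 + ["scissors"] * 2
clone = cloned_policy(expert_log)
uniform = {m: 1 / 3 for m in MOVES}

clone_exploit = best_response_value(clone)      # about 0.4 per round (always reply paper)
uniform_exploit = best_response_value(uniform)  # about 0: nothing to exploit
```

The expert's skew toward rock was a good decision against the opponents in the log. Copied into a new game against an adversary who adapts, the same skew is the weakness.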