Comment by slashdave
15 hours ago
This is a video model, not a world model. Start learning on this, and cascading errors will inevitably creep into all downstream products.
You cannot invent data.
Related: https://arxiv.org/abs/2601.03220
This paper got popular-ish recently and argues the counter to your viewpoint.
> Paradox 1: Information cannot be increased by deterministic processes. For both Shannon entropy and Kolmogorov complexity, deterministic transformations cannot meaningfully increase the information content of an object. And yet, we use pseudorandom number generators to produce randomness, synthetic data improves model capabilities, mathematicians can derive new knowledge by reasoning from axioms without external information, dynamical systems produce emergent phenomena, and self-play loops like AlphaZero learn sophisticated strategies from games
In theory yes, something like the rules of chess should be enough for these mythical perfect reasoners that show up in math riddles to deduce everything that *can* be known about the game. And similarly a math textbook is no more interesting than a book with the words true and false and a bunch of true => true statements in it.
But I don't think this is the case in practice. There is something about rolling things out and leveraging the results you see that seems to carry useful information, even if the rollout is fully characterizable.
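As a rough illustration of the PRNG case in the quoted paradox, here is a minimal Python sketch. The helper names `prng_bytes` and `compressed_size` are made up for this example, and zlib's compressed size is only a crude stand-in for information content, not Kolmogorov complexity. The whole stream is fixed by a short seed, yet a general-purpose compressor finds essentially nothing to squeeze out of it:

```python
import random
import zlib


def prng_bytes(seed: int, n: int) -> bytes:
    """Deterministically expand a small seed into n pseudorandom bytes."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


def compressed_size(data: bytes) -> int:
    """Crude proxy for 'apparent' information: size after zlib compression."""
    return len(zlib.compress(data, 9))


n = 100_000
stream = prng_bytes(seed=42, n=n)

# Same seed, same output: the stream is a deterministic function of the seed.
assert stream == prng_bytes(seed=42, n=n)

# A trivially regular input of the same length, for comparison.
zeros = bytes(n)

print("raw length:         ", n)
print("compressed PRNG:    ", compressed_size(stream))
print("compressed zeros:   ", compressed_size(zeros))
```

The PRNG output stays close to its raw length after compression while the all-zero input collapses, even though both are generated by very short programs. That is one way to see why "fully characterizable" and "nothing left to learn from the rollout" can come apart for an observer with limited compute.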
Interesting paper, thanks! But the authors escape the three paradoxes they present by introducing training limits (compute, factorization, distribution). Kind of a different problem here.
What I object to are the "scaling maximalists" who believe that, if enough training data were available, complicated concepts like a world model would just spontaneously emerge during training. Piling on synthetic data from a general-purpose generative model as a fix for the lack of training data is even more untenable.
They have a feature where you can take a photo and create a world from that.
If instead of a photo you have a video feed, this is one step closer to implementing subjective experience.
It's not a subjective experience. It's the mimicry of a subjective experience.
Given that the video is fully interactive and lets you move around (in a “world” if you will), I don’t think it’s a stretch to call it a world model. It must have at least some notion of physics, cause and effect, etc. in order to achieve what it does.
No, it actually needs none of that.
How would it do what it does without those things?