Comment by Rover222

4 months ago

I don't think you grasp what I'm saying? I'm talking about next token prediction to generate video frames.

4 comments

Rover222

Yeah, which is pretty slow, because every token of every frame has to be generated autoregressively, one after another. Leading diffusion models instead have to progressively denoise each frame. Both approaches are very expensive computationally. Generating the entire world with current techniques is incredibly expensive compared to rendering and rasterizing triangles, which is almost completely parallel.

NaomiLehman 4 months ago

in a few years it's possible that this will run locally in real time

Rover222 4 months ago

Okay, you clearly know 20x more than me about this, so I cannot logically argue. But the vague hunch remains that this is the future of video games. Within 3 to 4 years.
- rowanG077 4 months ago
  
  I don't think that will ever happen due to the extreme hardware requirements. What I do see happening is that only an extremely low-fidelity scene is rendered, with only basic shapes and few or no textures, and that is then filled in by AI. DLSS taken to the extreme: not just resolution but the whole stack.