Comment by WarmWash

17 hours ago

The actual breakthrough with Genie is being able to turn around, look back, and see the same scene that was there before. A few other labs have similar world simulators, but they all struggle badly to keep things that are out of view coherent. That's why their demos always walk forward and never look around.

Still amazed it took ML people so long to realize they needed an explicit representation to cache stuff.

  • Genie does not use an explicit representation:

    >Genie 3’s consistency is an emergent capability. Other methods such as NeRFs and Gaussian Splatting also allow consistent navigable 3D environments, but depend on the provision of an explicit 3D representation. By contrast, worlds generated by Genie 3 are far more dynamic and rich because they’re created frame by frame based on the world description and actions by the user.

What about Fei-Fei Li's lab? I think they are generating true 3D worlds rather than frames of a video?

Although that probably precludes her from having animations in those worlds...