Comment by ra7

7 hours ago

The novel aspect here seems to be 3D LiDAR output from 2D video using post-training. As far as I'm aware, no other video world models can do this.

IMO, access to DeepMind and Google infra is a hugely understated advantage Waymo has that no other competitor can replicate.

3D from moving 2D images has been a thing for decades.

  • This is 3D LiDAR output (multimodal) from 2D images.

    • LiDAR is the technology used to do spatial capture. The output is just a point cloud of surfaces. So they’re generating surface point clouds from video.
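
To make "point clouds of surfaces" concrete, here is a minimal sketch of the classic geometric step: back-projecting a per-pixel depth map into 3D points with a pinhole camera model. This is not Waymo's method, and it assumes the depth map is already available (estimating it from video is the hard part). The function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative choices, not anything from the thread.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_point_cloud(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map into camera-frame 3D points.

    depth[v][u] is the distance along the optical axis at pixel (u, v).
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    Pixels with non-positive depth are treated as "no return" and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid surface at this pixel
            # Invert the pinhole projection u = fx * x / z + cx (same for v).
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one missing pixel (hypothetical values).
depth = [
    [2.0, 2.0],
    [0.0, 4.0],
]
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Each surviving pixel becomes one `(x, y, z)` point, which is the same kind of output a LiDAR sweep produces, minus sensor-specific attributes such as intensity.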