Comment by eloisius

2 days ago

From a single picture it infers a hidden 3D representation, from which you can produce photorealistic images from slightly different vantage points (novel views).

There's nothing "hidden" about the 3d represenation. It's a point cloud (in meters) with colors, and a guess at the the "camera" that produced it.

(I am oversimplifying).

  • "Hidden" or "latent" in a context like this just means variables that the algo is trying to infer because it doesn't have direct access to them.

  • Hidden in the sense of neural net layers. I mean intermediary representation.

    • Right.

      I just want to emphasize that this is not a NERF where the model magically produces an image from an angle and then you ask "ok but how did you get this?" and it throws up its hands and says "I dunno, I ran some math and I got this image" :D.