Comment by Leptonmaniac
2 days ago
Can someone ELI5 what this does? I read the abstract and tried to find differences in the provided examples, but I don't understand (and don't see) what the "photorealistic" part is.
Imagine history documentaries where they take an old photo, free objects from the background, and move them around, giving the illusion of parallax movement. This software does that in less than a second, creating a 3D model that can be accurately moved (or the camera, for that matter) in your video editor. It's not new, but this one is fast and "sharp".
Gaussian splatting is pretty awesome.
Oh man. I never thought about how Ken Burns might use that.
You already sometimes see documentaries where someone has manually cut out a foreground person from the background and enlarged them a little to create a multi-layer 3D effect, but it's super-primitive and I find it gimmicky.
Bringing actual 3D to old photographs as the camera slowly pans or rotates slightly feels like it could be done really tastefully and well.
What are free objects?
The "free" in this case is a verb. The objects are freed from the background.
Takes a 2D image and lets you simulate moving the camera angle with a correct-ish parallax effect and proper subject isolation (it seems able to handle multiple subjects in the same scene as well).
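For intuition, here's a toy sketch of the parallax part (a forward warp from a depth map — not the paper's actual method, and every name here is illustrative; it assumes a per-pixel depth map is already available):

    import numpy as np

    def parallax_shift(image, depth, camera_dx):
        # image: (H, W, 3) array; depth: (H, W) in meters;
        # camera_dx: virtual sideways camera move.
        h, w = depth.shape
        out = np.zeros_like(image)
        # Disparity is inversely proportional to depth: near pixels shift more.
        disparity = camera_dx / np.maximum(depth, 1e-6)
        xs = np.arange(w)
        for y in range(h):
            new_x = np.clip((xs + disparity[y]).astype(int), 0, w - 1)
            out[y, new_x] = image[y, xs]  # holes stay black where background is revealed
        return out

The black holes this leaves behind foreground objects are exactly the disoccluded regions a real system has to inpaint.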
I guess this is what they use for the portrait mode effects.
It turns a single photo into a rough 3D scene so you can slightly move the camera and see new, realistic views. "Photorealistic" means it preserves real textures and lighting instead of a flat depth effect. Similar behavior can be seen with Apple's Spatial Scene feature in the Photos app: https://files.catbox.moe/93w7rw.mov
From a single picture it infers a hidden 3D representation, from which you can produce photorealistic images from slightly different vantage points (novel views).
There's nothing "hidden" about the 3D representation. It's a point cloud (in meters) with colors, and a guess at the "camera" that produced it.
(I am oversimplifying).
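To make "point cloud with colors" concrete, here's a toy unprojection assuming a pinhole camera with focal length f (in pixels) and the principal point at the image center — an assumption for illustration, since the actual camera guess will be more involved:

    import numpy as np

    def depth_to_point_cloud(image, depth, f):
        # Returns an (N, 6) array of [x, y, z, r, g, b] points in camera space.
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        z = depth
        x = (u - w / 2) * z / f  # pinhole model: x = (u - cx) * z / f
        y = (v - h / 2) * z / f
        points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
        colors = image.reshape(-1, 3).astype(float)
        return np.concatenate([points, colors], axis=1)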
"Hidden" or "latent" in a context like this just means variables that the algo is trying to infer because it doesn't have direct access to them.
Hidden in the sense of neural net layers. I mean intermediary representation.
Basically depth estimation to split the scene into various planes, then inpainting to fill in the obscured parts of those planes, and then free movement of the planes to allow for parallax. Think of 2D side-scrolling games that use several background layers at different depths to give the illusion of motion and depth.
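Here's that side-scroller analogy as a toy sketch, assuming the scene has already been split and inpainted into RGBA planes with known depths (the names and the wrap-around np.roll are simplifications, not how a real pipeline would do it):

    import numpy as np

    def composite_layers(layers, depths, camera_dx):
        # layers: list of (H, W, 4) float RGBA planes; depths: matching depths in meters.
        h, w, _ = layers[0].shape
        canvas = np.zeros((h, w, 3))
        # Painter's algorithm: draw far planes first so near planes occlude them.
        for layer, depth in sorted(zip(layers, depths), key=lambda p: -p[1]):
            shift = int(round(camera_dx / depth))  # nearer planes shift further
            moved = np.roll(layer, shift, axis=1)
            alpha = moved[..., 3:4]
            canvas = alpha * moved[..., :3] + (1 - alpha) * canvas
        return canvas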
Apple does something similar right now in their photos app, generating spatial views from 2d photos, where parallax is visible by moving your phone. This paper’s technique seems to produce them faster. They also use this same tech in their Vision Pro headset to generate unique views per eye, likewise on spatialized images from Photos.
It makes your picture 3D. The "photorealistic" part is "it's better than these other ways".
Black Mirror episode portraying what this could do: https://youtu.be/XJIq_Dy--VA?t=14. If Apple ran SHARP on this photo and compared it to the show, that would be incredible.
Or if you prefer Blade Runner: https://youtu.be/qHepKd38pr0?t=107
One more example from Star Trek Into Darkness https://youtu.be/p7Y4nXTANRQ?t=61
I was thinking Enemy of the State (1998) https://www.youtube.com/watch?v=3EwZQddc3kY
Agreed, this is a terrible presentation. The paper abstract is bordering on word salad, the demo images are meaningless and don’t show any clear difference to the previous SotA, the introduction talks about “nearby” views while the images appear to show zooming in, etc.