Comment by Someone
1 day ago
> it uses a world model that evolved over millions of years to model the environment. That's why we can get excellent 3D images from a 2D screen
That doesn’t require millions of years of evolution. We can ‘evolve’ it way faster on computers.
For an example, see https://depth-anything.github.io/.
I also think we don’t need good depth estimation to avoid collisions while walking around. The problem is scale-invariant, except that stopping distance grows superlinearly with speed (doubling your speed more than doubles your stopping distance), but at walking speed, that effect isn’t very large.
Decent depth estimation is needed for judging foot placement, but that’s at relatively close range.
At driving speed, that changes, but I think you can still get away with rough estimates.
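To put rough numbers on the walking versus driving difference, here is a minimal sketch using the textbook model: stopping distance = reaction distance + braking distance. The 1.0 s reaction time, the 5.0 m/s² deceleration, and the two speeds are assumptions picked for illustration, not measurements, and the function names are hypothetical.

```python
def stopping_distance(speed, reaction_time=1.0, deceleration=5.0):
    """Distance in metres covered before coming to rest from `speed` (m/s).

    reaction_time (s) and deceleration (m/s^2) are illustrative assumptions.
    """
    reaction = speed * reaction_time           # linear in speed
    braking = speed ** 2 / (2 * deceleration)  # quadratic in speed
    return reaction + braking


def braking_share(speed, reaction_time=1.0, deceleration=5.0):
    """Fraction of the stopping distance contributed by the quadratic term."""
    braking = speed ** 2 / (2 * deceleration)
    return braking / stopping_distance(speed, reaction_time, deceleration)


if __name__ == "__main__":
    # Assumed speeds: about 1.4 m/s for walking, about 28 m/s (100 km/h) for driving.
    for label, v in [("walking", 1.4), ("driving", 28.0)]:
        ratio = stopping_distance(2 * v) / stopping_distance(v)
        print(f"{label}: doubling speed multiplies stopping distance by {ratio:.2f}; "
              f"quadratic term is {braking_share(v):.0%} of the total")
```

Under these assumptions the quadratic term is a small part of the total at walking speed and the dominant part at driving speed, so the problem stays close to scale-invariant only in the first case.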
(I’m not saying one shouldn’t use LiDAR, just arguing that we don’t know whether “LiDAR is necessary” is true. Yes, cameras cannot reproduce all aspects of human vision yet, but they can also surpass it in many respects, for example resolution and field of view.)