Comment by KaiserPro

2 days ago

> Tesla are producing cyber cabs now which are 10th the price of Waymo's and can drive autonomously anywhere in the world.

Wait what? when did they actually enter mass production?

> I mean humans have Lidar sensors

Real-time SLAM is actually pretty good; the hard part is reliable object detection using just vision. Tesla's forward facing cameras are effectively monocular, which means that it's much, much harder to get depth. It's not impossible, but moving objects are much more difficult to observe if you only have cameras aligned on the same plane with no real parallax.

Ultimately Musk is right: you probably don't need lidar to drive safely. But it's far simpler to do if you have lidar, and it's also safer. Musk said "lidars are a crutch" not because he is some sort of genius; it's been obvious since the mid-00s (if not earlier) that SLAM-only driving is the way forward. The reason he said it is that he thought he could save money by not having lidar. The problem for him is that he didn't do the research to see how far away proper machine perception is from covering the last 1% of accuracy needed to make vision-only driving safe and reliable.

> Tesla's forward facing cameras are effectively monocular

Notably, human perception is effectively monocular in driving situations at distances of 60 feet or farther. It's best in the area where your limbs can reach.

We don't need stereoscopic vision to drive.

  • "precise" stereo vision works out to about 30m, but the limit of depth perception is around 200m (for some people, 500m)

    Crucially, we have excellent implied depth and object detection, something that even non-realtime state-of-the-art tracking doesn't have.

    Human depth perception is much more complex than just parallax, which some people use as an argument that "pure vision" monocular depth is possible to do robustly. It will be, but not for a while, especially as depth is only part of the problem. Object categorisation is the other.
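
To put rough numbers on why parallax-based depth falls off with distance, here is a minimal sketch using the textbook pinhole stereo relation Z = f·B/d, where a fixed disparity error gives a depth error that grows with Z². The baseline (about a human eye spacing), focal length and disparity error below are illustrative assumptions, not measurements of human vision or of any car's cameras.

```python
def disparity_px(depth_m, baseline_m, focal_px):
    """Disparity (pixels) of a point at depth_m for an ideal rectified stereo pair."""
    return focal_px * baseline_m / depth_m


def depth_error_m(depth_m, baseline_m, focal_px, disparity_err_px):
    """First-order depth uncertainty: |dZ| ~= Z^2 * |dd| / (f * B)."""
    return depth_m ** 2 * disparity_err_px / (focal_px * baseline_m)


# Assumed, illustrative values (not from the thread):
BASELINE_M = 0.065    # roughly the spacing between human eyes
FOCAL_PX = 1000.0     # hypothetical focal length in pixels
DISP_ERR_PX = 0.25    # hypothetical disparity matching error

for z in (5, 30, 100, 200):
    d = disparity_px(z, BASELINE_M, FOCAL_PX)
    err = depth_error_m(z, BASELINE_M, FOCAL_PX, DISP_ERR_PX)
    print(f"depth {z:>3} m: disparity {d:6.2f} px, depth error ~{err:7.2f} m")
```

With a short baseline the disparity shrinks to a couple of pixels within a few tens of metres, which is why parallax alone stops being useful and the other depth cues have to take over.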

> Wait what? when did they actually enter mass production?

"mass" is a strong word but the first one came off their production line 5 days ago

ramp to high volume will probably be extremely slow

Not mass production yet, but the first one rolled off the completed assembly line at Giga Texas last week

Sensor fusion is not far simpler; when the sensors disagree, and they often will, you have to pick which to trust.
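
As a sketch of what "pick which to trust" can look like in the simplest case, the snippet below fuses two range estimates by inverse-variance weighting and falls back to the lower-variance sensor when they disagree by more than a gate. The function name `fuse_ranges`, the 3-sigma gate and the fallback rule are all assumptions for illustration; a real stack would have to decide this per object, per frame, and with far messier noise models.

```python
import math


def fuse_ranges(z_a, var_a, z_b, var_b, gate_sigma=3.0):
    """Fuse two estimates of the same range (metres).

    Returns (estimate, variance, consistent). If the two readings differ by
    more than gate_sigma combined standard deviations, they are treated as
    disagreeing and the lower-variance reading is returned unchanged.
    """
    gate = gate_sigma * math.sqrt(var_a + var_b)
    if abs(z_a - z_b) > gate:
        # Disagreement: hypothetical policy of trusting the tighter sensor.
        if var_a <= var_b:
            return z_a, var_a, False
        return z_b, var_b, False
    # Agreement: standard inverse-variance weighted mean.
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    estimate = (w_a * z_a + w_b * z_b) / (w_a + w_b)
    return estimate, 1.0 / (w_a + w_b), True


print(fuse_ranges(10.0, 1.0, 12.0, 1.0))    # readings agree, so they are averaged
print(fuse_ranges(10.0, 0.01, 30.0, 4.0))   # readings disagree, so one is chosen
```

The hard part the comment points at is the fallback branch: the stated variances are themselves estimates, so "trust the tighter sensor" is only as good as each sensor's own idea of how wrong it is.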

It is amazing to see how many people here are confident they know the one true way to build autonomous systems based on nothing but wanting to confirm their biases

  • > Sensor fusion is not far simpler,

    Sensor fusion is a piece of piss. How do you think any of the VR headsets work? SLAM is compute-expensive and runs at ~30 Hz, so sensor fusion is the only way to give a smooth experience.

    Online calibration is actually the hard part, and even then it's largely solved.
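
For anyone wondering what that headset-style fusion looks like, here is a toy one-axis sketch: a fast gyro is integrated for smooth output, and each slow SLAM fix nudges the estimate back to cancel drift. This is a basic complementary filter chosen for brevity, not what any particular headset ships (production trackers generally use something more elaborate, such as a Kalman filter variant). The 900 Hz IMU rate, the gain and the function name `fuse_yaw` are assumptions; only the ~30 Hz SLAM rate comes from the comment above.

```python
def fuse_yaw(gyro_rates, slam_yaws, imu_hz=900, slam_hz=30, gain=0.2):
    """Track yaw (radians) from high-rate gyro samples and low-rate SLAM fixes.

    gyro_rates: angular rate samples in rad/s, one per IMU tick.
    slam_yaws:  absolute yaw fixes in rad, one per SLAM tick.
    Returns one yaw estimate per IMU tick.
    """
    dt = 1.0 / imu_hz
    ticks_per_fix = imu_hz // slam_hz
    yaw = slam_yaws[0]
    out = []
    for i, rate in enumerate(gyro_rates):
        yaw += rate * dt  # high-rate prediction: integrate the gyro
        if i % ticks_per_fix == 0:
            k = i // ticks_per_fix
            if k < len(slam_yaws):
                # Low-rate correction: pull toward the SLAM fix to remove drift.
                yaw += gain * (slam_yaws[k] - yaw)
        out.append(yaw)
    return out


# One second of turning at 1 rad/s, with a gyro that reads 0.1 rad/s too high.
rates = [1.1] * 900
fixes = [k / 30.0 for k in range(30)]  # drift-free SLAM yaw at 30 Hz
fused = fuse_yaw(rates, fixes)
gyro_only = fuse_yaw(rates, fixes, gain=0.0)
print(f"true 1.000, fused {fused[-1]:.3f}, gyro only {gyro_only[-1]:.3f}")
```

The output updates at the IMU rate, so it stays smooth between SLAM fixes, while the fixes keep the gyro bias from accumulating.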