Comment by sebastos

1 day ago

Then you deeply underestimate how difficult the problem is, and deeply misunderstand where all the effort has been spent in developing autonomous vehicles.

If all the effort has been spent in trying to replicate the human brain, then I am comfortable saying that is a mistake.

We have a tool that can tell with great accuracy how far away an object is. The suggestion that we should ignore it and rely on cameras that have to guess it because “that’s how humans work” is absurd, frankly.

  • Before you can learn how far away an object is, you must decide: which laser return corresponds to which object? In fact, what counts as an object? Where does a tree stop and become a fallen tree branch? Is that object moving towards me? Does the apparent velocity of this point represent the fact that the object is moving, or that it's rotating, or that it's flexing, or dividing, or all 4? Is that object moving towards me but that's ok because it's a car that's going to stay in its lane? What's a lane? What's my laser return for where the lane is? Should I stop at this intersection? What's my laser return for whether the light is red? Am I in the blind spot of the car in front of me? Is he about to shift into my lane because he doesn't see me? What laser return do I get to tell me whether his indicator is on?

    The problem of understanding what is happening in front of you while driving is preposterously more complicated than just a point cloud of distances. That is .01% of the problem. To solve the remaining 99.99%, you need interpretation of photons and sound waves into a semantic understanding that gives you predictive power to guess how the physical world will evolve and avoid breaking the rules of the road. Show me a mechanized way of understanding the causes of how the physical structure of the world is about to evolve, and I'll show you something that is imitating a human brain, however poorly. The cameras give you _plenty_ of data to determine 3D structure, at a higher resolution than the laser, without being emissive, for cheaper. It's a completely reasonable approach to focus your limited computational hardware on interpreting the data you have instead of adding more modalities with their own limitations that (according to nature) are demonstrably unnecessary.

    The world is more complicated than slogans and pitchforks and Elon Bad.