Comment by __alexs
7 hours ago
> you should theoretically be able to take the output of the Lidar + Cameras model and use it as training data for a Camera only model.
Why should you be able to do that exactly? Human vision is frequently tricked by its lack of depth data.
"Exactly" is impossible: multiple Lidar samples can map to the same camera sample. What training would do is build a model that infers the most likely Lidar representation from a camera input. There would still be cases where the most likely Lidar representation for a camera input isn't a useful or accurate representation of reality, e.g. a scene with very high dynamic range.
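To make the "most likely Lidar for a camera input" idea concrete, here is a toy sketch, not how any real perception stack works. The camera observations are made-up string labels, the depths are invented numbers standing in for the Lidar + Cameras model's output, and `fit_camera_only` is a hypothetical helper that just memorizes the most frequent depth per camera input.

```python
from collections import Counter, defaultdict

# Hypothetical teacher output: (camera_observation, lidar_depth_m) pairs.
# The same camera observation appears with different depths, which is the
# many-to-one ambiguity described above.
teacher_labels = [
    ("dark_blob", 12.0),
    ("dark_blob", 12.0),
    ("dark_blob", 40.0),   # looks identical on camera, but is much farther away
    ("bright_wall", 5.0),
    ("bright_wall", 5.0),
]

def fit_camera_only(pairs):
    """Toy 'student': for each camera input, keep the most frequent teacher depth."""
    counts = defaultdict(Counter)
    for camera, depth in pairs:
        counts[camera][depth] += 1
    return {camera: c.most_common(1)[0][0] for camera, c in counts.items()}

student = fit_camera_only(teacher_labels)

# Cases where the most likely depth disagrees with what the teacher measured.
misses = [(cam, d) for cam, d in teacher_labels if student[cam] != d]
```

Here `student` maps each camera input to its most common depth, and `misses` holds the one `dark_blob` sample at 40.0 m: the camera-only model cannot recover it, because nothing in its input distinguishes that case from the 12.0 m ones.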