
Comment by cs702

3 years ago

> I'd sum up your points 1,2,3 as "more data". This would be a reason to think they can one day be ahead if they can take advantage of this, but not evidence that they are currently ahead.

I'd sum up those three points as "more data and more real-world, open-ended, large-scale testing by regular people." Big difference.

> Occupancy networks: waymo has published research on this before Tesla announced this at AI day (not clear to me who got there first though https://arxiv.org/pdf/2203.03875v1.pdf)

AFAIK, Tesla FSD Beta is the only system that has been using these DNNs for open-ended testing.

> Tesla's Dojo -> Waymo has TPUs to train on

I've trained AI models on TPUs. They're nowhere near 36x more efficient than Nvidia GPUs.

> I am pretty biased because as a Tesla owner I am pretty pissed off at this point at how the false positives on the system in detecting close following are stopping my safety score from getting high enough to even be able to access the product I purchased.

Oh, I get your frustration... but I also understand why Tesla is being so strict with safety scores at this point. It wouldn't be fair to blame them for that.

TPUs are hard to use outside of Google (I have tried both inside and outside Google). I think the situation is improving, but the efficiency you get from using a large pod is really remarkable. What topology did you train your models on? Within Google it's common to train across a whole pod, or even across multiple pods; 8x16x16 is the largest currently.
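To make the scale of that 8x16x16 multi-pod figure concrete, here is a minimal sketch of the arithmetic. The topology dimensions come from the comment above; the per-chip batch size is a made-up illustrative assumption, not a real training configuration.

```python
# Hypothetical sketch: chip counts and global batch sizes for TPU pod
# topologies. 8x16x16 is the multi-pod topology mentioned above; the
# per-chip batch size below is an assumed illustrative number.

def chips_in_topology(dims):
    """Number of chips in a torus topology given its dimensions."""
    n = 1
    for d in dims:
        n *= d
    return n

def global_batch(dims, per_chip_batch):
    """Global batch size under pure data parallelism (one shard per chip)."""
    return chips_in_topology(dims) * per_chip_batch

pod = (8, 16, 16)                 # the multi-pod topology mentioned above
print(chips_in_topology(pod))     # 2048 chips
print(global_batch(pod, 128))     # 262144 examples per step (assumed 128/chip)
```

At that chip count, even a modest assumed per-chip batch yields a global batch in the hundreds of thousands per step, which is why a single large pod can be so hard to match externally.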

Also, if Tesla actually published numbers on an MLPerf benchmark, I would be more inclined to believe claims about 36x better efficiency.

https://mlcommons.org/en/training-normal-20/

The fastest times I'm seeing there for image classification and for object detection (not the same tasks, but probably the closest proxies among the tasks benchmarked) are for TPUs.

To know who has better training technology, I don't think you should use a cost-efficiency metric; the best thing to use would be who can train networks the fastest. Cost metrics are easy to game, especially if you are the one making the chips (of course, once the capital investment is made, making their own chips is cheaper for Tesla than buying Nvidia chips). To measure who is ahead in technology, I think you have to look at who can train models the fastest, and right now, as far as I can tell, TPUs are unbeaten at this. (Practically speaking, it's hard to pull off these large topologies externally, and there are other caveats with MLPerf related to how the training setups are optimized, but it's still a better signal than what Elon says in a presentation :) )
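The distinction argued for above can be sketched in a few lines: rank systems by wall-clock time to reach a target quality (the way MLPerf closed-division results are reported) rather than by cost. The entries below are made-up illustrative numbers, not real MLPerf submissions.

```python
# Hypothetical sketch of the "who trains fastest" comparison: rank by
# time-to-target, not cost. All numbers below are invented for illustration.

entries = [
    {"system": "system_a", "minutes_to_target": 12.4, "cost_usd": 900.0},
    {"system": "system_b", "minutes_to_target": 8.1,  "cost_usd": 2400.0},
]

fastest = min(entries, key=lambda e: e["minutes_to_target"])
cheapest = min(entries, key=lambda e: e["cost_usd"])

print(fastest["system"])   # system_b: ahead on the time-to-train metric
print(cheapest["system"])  # system_a: ahead on cost, the easier metric to game
```

Note how the two metrics can pick different winners from the same data, which is exactly why a self-reported cost-efficiency number is a weaker signal than a benchmarked time-to-train.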

And as far as safety scores go, my point is that the safety score is calculated incorrectly because of obvious false positives on "close following". I'm talking about being nowhere near a car, getting an alert that says I'm following too closely, and having that drop my safety score. I understand why the bar is high, but at this point I honestly suspect there is some tomfoolery going on in how that score is calculated.

You think it's actually "safety score" and not Tesla protecting their brand by restricting who gets to demonstrate the system?

Maybe you should consider that when watching YouTube videos of people using it....