Comment by sashank_1509

2 months ago

Recently I had a chance to listen to a set of talks on the technology powering Waymo. I think the average academic roboticist would be shocked by the complete lack of end-to-end deep learning models, or even large models, powering Waymo. It’s interesting to me that the only working self-driving car on the market right now basically has painstakingly listed every possible road obstacle, has coded every possible driving logic to it, and manually addressed every edge case. Maybe Tesla’s end-to-end approach will work, and that will be the way forward, but the real world seems to provide an almost limitless number of edge cases that neural networks don’t seem great at handling. In fact, if Waymo’s approach is proven to be the right one, the winning approach to humanoids might be listing every possible item a humanoid can see in an environment, detecting them, and then planning for them.

10 comments


boulos 2 months ago

(Disclosure: I work for Waymo)

While there is plenty of classical robotics code in our planner, I wouldn't want people to assume that we don't use neural networks for planning.

The fact that we don't deploy end-to-end models (e.g., sensors to controls) and instead have separate perception and planning components doesn't mean there isn't ML in each part. Having the components separate means we can train and update each individually, test them individually, inject overrides as needed, and so on. On the flip side, it's true that because the system isn't learned end-to-end today, there might exist a vastly simpler or higher-quality system.

So we do a lot of research in this area, like EMMA (https://waymo.com/research/emma/), but don't assume that our planning isn't heavily ML-based. A lot of our progress in the last couple of years has been driven by increasing the amount of ML used for planning, especially for behavior prediction (e.g., https://waymo.com/research/wayformer/).
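
A toy sketch of the modular split described above: separate perception and planning stages, each testable alone, with overrides injected between them. This is not Waymo code. Every name here (`Obstacle`, `Plan`, `toy_perception`, `toy_planner`, `stop_for_close_pedestrian`, `drive_step`) and every threshold is made up for illustration, and either stage could just as well be a learned model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Obstacle:
    kind: str          # e.g. "pedestrian", "cone" (hypothetical labels)
    distance_m: float  # distance ahead of the vehicle, in metres


@dataclass(frozen=True)
class Plan:
    action: str  # e.g. "proceed", "slow", "stop"
    source: str  # which stage produced the decision


def toy_perception(sensor_frame: dict) -> List[Obstacle]:
    """Stand-in for a perception stage: reads pre-labelled detections."""
    return [Obstacle(kind, dist) for kind, dist in sensor_frame.get("detections", [])]


def toy_planner(obstacles: List[Obstacle]) -> Plan:
    """Stand-in for a planning stage: slow down if anything is within 30 m."""
    if any(o.distance_m < 30.0 for o in obstacles):
        return Plan("slow", "planner")
    return Plan("proceed", "planner")


def stop_for_close_pedestrian(obstacles: List[Obstacle]) -> Optional[Plan]:
    """Hand-written override: returns a plan only when its rule fires."""
    if any(o.kind == "pedestrian" and o.distance_m < 10.0 for o in obstacles):
        return Plan("stop", "override")
    return None


def drive_step(
    sensor_frame: dict,
    perception: Callable[[dict], List[Obstacle]],
    planner: Callable[[List[Obstacle]], Plan],
    overrides: List[Callable[[List[Obstacle]], Optional[Plan]]],
) -> Plan:
    """Run perception, give each override a chance, then fall back to the planner."""
    obstacles = perception(sensor_frame)
    for override in overrides:
        plan = override(obstacles)
        if plan is not None:
            return plan
    return planner(obstacles)
```

Because the stages only meet at the `List[Obstacle]` interface, `toy_planner` can be unit-tested with hand-built obstacles and no sensor data, and either stage can be swapped out without touching the other. An end-to-end model would collapse `drive_step` into a single learned function.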

marcosdumay 2 months ago

> basically has painstakingly listed every possible road obstacle, has coded every possible driving logic to it, and ... addressed every edge case

Removed that "manually" word, so now it describes exactly what you would have to do to train an end-to-end neural network.

NNs don't get information from nothing; you would have to subject them to the exact same obstacles, geometries and behaviors you coded in the manual version.

tuatoru 2 months ago

https://en.wikipedia.org/wiki/Siphonaptera_(poem)

Big edge cases have little edge cases that require their own code / and those edge cases have smaller edge cases with yet more code.

My shorthand is "the real world is a fractal of edge cases".

Zigurd 2 months ago

Not every edge case, but enough that the vehicle can correctly determine it doesn't know how to proceed and must ask a human to choose from among a menu of choices. This is how Waymo described how supervision works. Nobody actually drives the vehicle remotely. They just make a decision the on-board intelligence has decided it can't make.

One good bet based on Waymo's decision to expand is that the amount of supervision each robotaxi needs keeps going down, so supervision is not tightly coupled to fleet size.
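
A minimal sketch of the "ask a human to choose from a menu" pattern described above, purely illustrative and not how Waymo implements it. The confidence score, the `0.8` threshold, and the names `Option`, `choose`, and `ask_human` are all assumptions; the only parts taken from the description are that the vehicle decides it can't decide, and that the human picks from a menu rather than driving.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Option:
    label: str         # e.g. "wait", "go around on the left"
    confidence: float  # hypothetical on-board score in [0, 1]


def choose(
    options: Sequence[Option],
    ask_human: Callable[[List[str]], str],
    threshold: float = 0.8,
) -> str:
    """Return the option to execute, deferring to a human when unsure."""
    best = max(options, key=lambda o: o.confidence)
    if best.confidence >= threshold:
        # The on-board system is confident enough to act by itself.
        return best.label
    # Otherwise present the menu; the human selects, the vehicle still drives.
    menu = [o.label for o in options]
    picked = ask_human(menu)
    if picked not in menu:
        raise ValueError(f"human picked an option not on the menu: {picked!r}")
    return picked
```

The human's answer is constrained to the menu, so the remote operator never issues steering or throttle commands. In this sketch the supervision load scales with how often `best.confidence` falls below the threshold, not with the number of vehicles.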

bethekidyouwant 2 months ago

A menu of choices operated with the operator's WASD keys

huevosabio 2 months ago

I think it would be the other way around: academic roboticists are very well aware of how damn hard the physical world is.

AIPedant 2 months ago

IMO the most relevant point is that, even with all that data, Waymos are backed up by a large team of humans that can help guide the cars through "difficult" common-sense situations that AI is simply not capable of handling, both because artificial neural networks are very primitive and stupid compared to vertebrate brains, and because it is practically impossible to collect enough data for the dumb ANNs to learn from. Self-driving companies are very cagey about this.

sho_hn 2 months ago

I suppose the (crummy) analog is that a human's "models" are equally not entirely general; we have evolved a particular architecture that is baked into our hardware and perpetuated via our DNA.

It's fuzzy and plastic and complex, but the brain has functional areas, there is intelligence more local to specific sensors, pipelines where fusion happens, governors and supervisors, specific numeric limits to certain tasks, etc.

This is a bit akin to your "listing every possible item", in a way, in the sense that there are definitely finite structures tuned toward the application of being human.

This interplay between our supposed "AGI" and what is "cached" in our also not static but evolving hardware is really one of the most fascinating aspects of biology.

exe34 2 months ago

"Any sufficiently big bag of tricks is indistinguishable from true intelligence."

egbantan 2 months ago

Link to talks?