Comment by mattlondon
6 hours ago
Suddenly all this focus on world models by DeepMind starts to make sense. I've never really thought of Waymo as a robot in the same way as e.g. a Boston Dynamics humanoid, but of course it is a robot of sorts.
Google/Alphabet are so vertically integrated for AI when you think about it. Compare what they're doing: their own power generation, their own silicon, their own data centers, Search, Gmail, YouTube, Gemini, Workspace, Wallet, billions and billions of Android and Chromebook users, their ads everywhere, their browser everywhere, Waymo, probably buying back Boston Dynamics soon enough (they recently partnered), fusion research, drug discovery... and then look at ChatGPT's chatbot or Grok's porn. Pales in comparison.
Google has been doing more R&D and internal deployment of AI and less trying to sell it as a product. IMHO that difference in focus makes a huge difference. I used to think their early work on self-driving cars was primarily to support Street View in their maps.
There was a point in time when basically every well known AI researcher worked at Google. They have been at the forefront of AI research and investing heavily for longer than anybody.
It’s kind of crazy that they have been slow to create real products and competitive large scale models from their research.
But they are in full gear now that there is real competition, and it’ll be cool to see what they release over the next few years.
I also think the presence of Sergey Brin has been making a difference in this.
12 replies →
> It’s kind of crazy that they have been slow to create real products and competitive large scale models from their research.
I always thought they deliberately tried to contain the genie in the bottle as long as they could
1 reply →
It has always felt to me that the LLM chatbots were a surprise to Google, not LLMs, or machine learning in general.
Not true at all. I interacted with Meena[1] while I was there, and the publication was almost three years before the release of ChatGPT. It was an unsettling experience, felt very science fiction.
[1]: https://research.google/blog/towards-a-conversational-agent-...
3 replies →
Google and OpenAI are both taking very big gambles with AI, with an eye towards 2036 not 2026. As are many others, but them in particular.
It'll be interesting to see which pays off and which becomes Quibi
> Suddenly all this focus on world models by Deep mind starts to make sense
Google's been thinking about world models since at least 2018: https://arxiv.org/abs/1803.10122
FWIW I understood GP to mean that it suddenly makes sense to them, not that there's been a sudden focus shift at Google.
Tesla built something like this for FSD training; they presented it many years ago. I never understood why they didn't productize it. It would have made a brilliant Maps alternative, which could automatically update from Tesla cars on the road. It could live-update with speed cameras and road conditions. Like many things, they've fallen behind.
No lidar anymore on the 2026 Volvo models ES90 and EX90. See for example: https://www.jalopnik.com/2032555/volvo-ends-luminar-lidar-20...
I love Volvo, am considering buying one in a couple weeks actually, but they're doing nothing interesting in terms of ADAS, as far as I can tell. It seems like they're limited to adaptive cruise control and lane keeping, both of which have been solved problems for more than a decade.
It sounds like they removed Lidar due to supplier issues and availability, not because they're trying to build self-driving cars and have determined they don't need it anymore.
4 replies →
Without lidar, plus the terrible quality of Tesla's onboard cameras, a Street View alternative would look terrible. The biggest L of Elon's career is the weird commitment to no lidar. If you've ever driven a Tesla, it gives daily messages like "the left side camera is blocked" etc. Cameras and weather don't mix either.
At first I gave him the benefit of the doubt, like Steve Jobs's weird decision to ban Adobe Flash, which ran most of the fun parts of the Internet back then but ended up spreading HTML5. Now I just think he refused lidar for purely aesthetic reasons. The cost is not even that significant compared to the overall cost of a Tesla.
16 replies →
Yeah, it's absurd. As a Tesla driver, I have to say the autopilot model really does feel like what someone who's never driven a car before thinks driving is like.
Using vision only is so ignorant of what driving is all about: sound, vibration, vision, heat, cold... these are all clues about road conditions. If the car isn't sensing all these things as part of the model, you're handicapping it. Lidar brilliantly supplies the missing piece of information a car needs without relying on multiple sensors; it's probably superior to what a human can do, whereas vision only is clearly inferior.
29 replies →
I have HW3, but FSD reliably disengages at this time of year with sunrise and sunset during commute hours.
2 replies →
>>The biggest L of elon's career is the weird commitment to no-lidar.
I thought it was the Nazi salutes on stage and backing neo-nazi groups everywhere around the world, but you know, I guess the lidar thing too.
From the perspective of viewing FSD as an engineering problem that needs solving, I tend to think Elon is on to something with the camera-only approach, although I would agree the current hardware has problems with weather, etc.
The issue with lidar is that many of the difficult edge cases of FSD are visible-light vision problems. Lidar might be able to tell you there's a car up front, but it can't tell you that the car has its hazard lights on and a flat tire. Lidar might see a human-shaped thing in the road, but it cannot tell whether it's a mannequin leaning against a bin or a human about to cross the road.
Lidar gets you most of the way there when it comes to spatial awareness on the road, but you need cameras for most of the edge cases because cameras provide the color data needed to understand the world.
You could never have FSD with just lidar, but you could have FSD with just cameras if you can overcome all of the hardware and software challenges with accurate 3D perception.
Given lidar adds cost and complexity, and most edge cases in FSD are camera problems, I think camera-only probably helps force engineers to focus their efforts in the right place rather than hitting bottlenecks from over-depending on lidar data. This isn't an argument for camera-only FSD, but from Tesla's perspective it does keep down costs and allows them to continue producing appealing cars, which is obviously important if you're coming at FSD from the perspective of an automaker trying to sell cars.
Finally, adding lidar as a redundancy once you've "solved" FSD with cameras isn't impossible. I personally suspect Tesla will eventually do this with their robotaxis.
That said, I have no real experience with self-driving cars. I've only worked on vision problems and while lidar is great if you need to measure distances and not hit things, it's the wrong tool if you need to comprehend the world around you.
7 replies →
I always understood this to be why Tesla started working on humanoid robots
Pretty much. They banked on "if we can solve FSD, we can partially solve humanoid robot autonomy, because both are robots operating in poorly structured real world environments".
I don't want a humanoid robot. I want a purpose built robot.
Obviously both will exist and compete with each other on the margins. The thing to appreciate is that our physical world is already built like an API for adult humans. Swinging doors, stairs, cupboards, benchtops. If you want a robot to traverse the space and be useful for more than one task, the humanoid form makes sense.
The key question is whether general purpose robots can outcompete on sheer economies of scale alone.
They started working on humanoid robots because Musk always has to have the next moonshot, trillion-dollar idea to promise "in 3 years" to keep the stock price high.
As soon as Waymo's massive robotaxi lead became undeniable, he pivoted from robotaxis to humanoid robots.
Yeah, that and running Grok on a trillion GPUs in space lol
It's so they can stick a Tesla logo on a bunch of Chinese tech and call it innovation.
So for the record, with this realization you're 3+ years behind Tesla.
https://www.youtube.com/watch?v=ODSJsviD_SU&t=3594s
Aren't they still using safety drivers or safety follow cars and in fewer cities? Seems Tesla is pretty far behind.
What do you think I said that you're contradicting?
IMO the presence of safety chase vehicles is just a sensible "as low as reasonably achievable" measure during the early rollout. I'm not sure that can (fairly) be used as a point against them.
I'm comfortable with Tesla sparing no expense for safety, since I think we all (including Tesla) understand that this isn't the ultimate implementation. In fact, I think it would be a scandal if Tesla failed to do exactly that.
Damned if you do and damned if you don't, apparently.
3 replies →
Internal firewalls and poor management means that the vast majority of integration opportunities are missed.
Which is why it's embarrassing how much worse Gemini is at searching the web for grounding information, and how incredibly bad gemini cli is.
Not my experience in either of those areas.
The flywheel is starting to spin......
Grok/xAI is a joke at this point. A true money pit without any hopes for a serious revenue stream.
They should be bought by a rocket company. Then they would stand a chance.
The vertical integration argument should apply to Grok. They have Tesla driving data (probably much more data than Waymo), Twitter data, plus Tesla/SpaceX manufacturing data. When/if Optimus starts on the production line, they'll have that data too. You could argue they haven't figured out how to take advantage of it, but the potential is definitely there.
Agreed. Should they achieve Google level integration, we will all make sure they are featured in our commentary. Their true potential is surely just around the corner...
"Tesla has more data than Waymo" is some of the lamest cope ever. Tesla does not have more video than Google! That's crazy! People who repeat this are crazy! If there was a massive flow of video from Tesla cars to Tesla HQ that would have observable side effects.
What an upsetting comment. I'm glad you came around but what did you think was going to be effective before you came around to world models?
>or grok's porn
I know it’s gross, but I would not discount this. Remember why Blu-ray won over HDDVD? I know it won for many other technical reasons, but I think there are a few historical examples of sexual content being a big competitive advantage.
It's a 3500lb robot that can kill you.
Boston Dynamics is working on a smaller robot that can kill you.
Anduril is working on even smaller robots that can kill you.
The future sucks.
and they're all controlled by (poorly compensated) humans anyway [1] [2]
[1] https://www.wsj.com/tech/personal-tech/i-tried-the-robot-tha...
[2] https://futurism.com/advanced-transport/waymos-controlled-wo...
They couldn't even make burger flipping robots work and are paying fast food workers $20/hr in California.
If that doesn't make it obvious what they can and cannot do then I can't respect the tranche of "hackers" who blindly cheer on this unchecked corporate dystopian nightmare.
"Waymo as a robot in the same way"
Erm, a dishwasher, a washing machine, an automated vacuum can all be considered robots. I'm confused by this obsession with the term; there are many robots that already exist. Robots have been involved in the production of cars for decades.
I think the (gray) line is the degree of autonomy. My washing machine makes very small, predictable decisions, while a Waymo has to manage uncertainty most of the time.
It's irrelevant. A robot is a robot.
Dictionary def: "a machine controlled by a computer that is used to perform jobs automatically."
4 replies →
But somehow Google fails to execute. Gemini is useless for programming, and I don't even bother to use it as a chat app. Claude Code + GPT 5.2 xhigh for coding, and GPT as a chat app, are really the only ones that are worth it (price- and time-wise).
I've recently switched to Claude for chat. GPT 5.2 feels very engagement-maxxed for me, like I'm reading a bad LinkedIn post. Claude does a tiny bit of this too, but an order of magnitude less in my experience. I never thought I'd switch from ChatGPT, but there is only so much "here's the brutal truth, it's not x it's y" I can take.
GPT likes to argue, and most of its arguments are straw men, usually conflating priors. It's... exhausting; akin to arguing on the internet. (What am I even saying, here!?) Claude does a lot less of that. I don't know if it tracks discussion/conversation better; but, for damn sure, it's got way less verbal diarrhea than GPT.
1 reply →
To me ChatGPT seems smarter and knows more. That’s why I use it. Even Claude rates gpt better for knowledge answers. Not sure if that itself is any indication. Claude seems superficial unless you hammer it to generate a good answer.
Experiencing the same. It seems Anthropic’s human-focused design choices are becoming a differentiator.
Gemini works well enough in Search and in Meet. And it's baked into the products so it's dead simple to use.
I don't think Google is targeting developers with their AI, they are targeting their product's users.
Gemini is by far the best UI/UX designer model. Codex seems to be the worst: it'll build something awkward and ugly, then Gemini will take 30-60 seconds to make it look like something that would have won a design award a couple of years ago.
It is a bit mind boggling how behind they were considering they invented transformers and were also sitting on the best set of training data in the world, but they've caught up quite a bit. They still lag behind in coding, but I've found Gemini to be pretty good at more general knowledge tasks. Flash 3 in particular is much better than anything of comparable price and speed from OpenAI or Anthropic.