Uber wants to turn its drivers into a sensor grid for self-driving companies

1 day ago (techcrunch.com)

> “The bottleneck is data.”

This seems to be wishful thinking on the part of Uber, and also Tesla. Google StreetView data is probably sufficient. Waymo's expansion into new cities does not seem to be delayed much by the need for more data.

Most of the reported problems with self-driving come from transient situations. More mapping data will not help with those.

China has the Beijing High-level Autonomous Driving Demonstration Zone, where traffic cams and other sensors let vehicles see beyond their own sightlines.[1] That's been going on since 2020. That's the ultimate in sensing - full real time road info.

The Beijing test area is getting some expansion. The new direction seems to be to focus on airports and railroad stations, so that driverless cars can be aware of congestion in detail. That makes sense.

[1] https://sinocities.substack.com/p/inside-chinas-connected-ve...

  • One question I am genuinely wondering about is whether a self-driving car _is_ cheaper than a human driver, once all of the externalities are priced in. In SF right now a Waymo is typically priced a little under an Uber (actually quite a bit under if you count that no one has got around to asking for tips for AIs yet). I am sure the running costs of each Waymo vastly exceed the costs of a human driver to Uber...

    • Good question. We know that Waymo has about one remote operator/customer service rep per 40 vehicles. Waymo also operates vehicle garages for charging, cleaning, and maintenance. Those probably all add up to roughly what it costs a rental car company to operate a car. Maybe more, because there's more complex maintenance, maybe less, because they park themselves.

      There's also a huge sunk R&D cost and an ongoing R&D cost that probably dwarfs operating costs. But the per-car cost drops as more cars are deployed.

      On the other hand, robot vehicles can have higher utilization than single-owner vehicles. They can be on the road as long as there are customers. Observation of their parking lots indicates most of the cars are out on the road about 12 hours a day.
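
A rough way to frame the comparison above is cost per hour in service. The sketch below is only illustrative: the one-operator-per-40-vehicles ratio and the 12 hours a day on the road come from the comment, while every dollar figure and every name (`cost_per_service_hour`, `daily_depot_cost`, `daily_vehicle_capital_cost`) is a made-up placeholder, not a reported Waymo or Uber number. R&D is left out entirely.

```python
def cost_per_service_hour(
    operator_hourly_wage: float,
    vehicles_per_operator: int,
    daily_depot_cost: float,
    daily_vehicle_capital_cost: float,
    service_hours_per_day: float,
) -> float:
    """Operating cost per hour on the road for one robotaxi (R&D excluded)."""
    if vehicles_per_operator <= 0 or service_hours_per_day <= 0:
        raise ValueError("ratios and hours must be positive")
    # One remote operator's wage is shared across the vehicles they cover.
    operator_share = operator_hourly_wage / vehicles_per_operator
    # Fixed daily costs are spread over the hours the car is actually in service.
    fixed_share = (daily_depot_cost + daily_vehicle_capital_cost) / service_hours_per_day
    return operator_share + fixed_share


# From the comment: 1 operator per 40 vehicles, about 12 hours a day on the road.
# Hypothetical placeholders: $40/h operator, $60/day depot, $90/day vehicle capital.
robotaxi = cost_per_service_hour(40.0, 40, 60.0, 90.0, 12.0)
half_utilization = cost_per_service_hour(40.0, 40, 60.0, 90.0, 6.0)
```

With these placeholder inputs the fixed costs dominate, and halving the hours in service nearly doubles the per-hour cost, which is why utilization matters so much to the answer.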


  • Context is super important.

    Unlike the web services that giants like Google provide (e.g. search), Waymo and other AVs essentially cannot fail. Like, at all. They are susceptible to the ‘randomness’ of nature, which can be the difference between life and death.

    A lot of so called ‘smart’ people are going to find themselves getting humbled by the real world.

    Humans are able to make sense of the world around them through things like intuition. Machines do not possess this characteristic.

  • > Google StreetView data is probably sufficient.

    This is an extraordinary claim. Self driving cars just need 15 ft grid panoramic images that are months or years stale? What experience are you basing this claim on?

> The insight driving the program, Naga said, is that the limiting factor for AV development is no longer the underlying technology. “The bottleneck is data,” he said. “[Companies like Waymo] need to go around and collect the data, collect different scenarios. You may be able to say: in San Francisco, ‘At this school intersection, I want some data at this time of day so I can train my models.’ The problem for all these companies is access to that data, because they don’t have the capital to deploy the cars and go collect all this information.”

You can’t be the CTO of Uber wanting to do AVs, and get the data collection requirement shockingly wrong.

Waymo’s bottleneck has never been data. When they want data about a school intersection in SF at a certain time of day, they just... synthetically generate it and simulate: https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-f...

Waymo is able to deploy with less (but targeted and high quality) data collection by having world class simulation capabilities. Not that they haven't collected huge amounts of data as it's no doubt important (I've heard their onboard storage is transferred and emptied every few days), it's just not a bottleneck. They have the most efficient operation in the AV industry.

The best example of why data collection isn’t the bottleneck is Tesla. They boast about billions of miles of data, yet they’re struggling to put out fully autonomous vehicles.

  • > When they want data about a school intersection in SF at a certain time of day, they just... synthetically generate it and simulate

    I think it's more about detecting changes to the world. You need boots on the ground, so to speak, to see that new speed limit sign or the new lane paint. The Waymo vehicle can no doubt react to changes in the world when it encounters them, relaying them back to the mothership, but it's better to know about them in advance.

    • Most AVs, definitely Waymo vehicles, are self-mapping. They can detect environment changes and relay them to the entire fleet. That's because they map using the same vehicles as the fleet.

    • >You need boots on the ground, so to speak, to see that new speed limit sign or the new lane paint.

      It'll shock you to know that you can simply get this from governments, some even provide this in API form


  • Yeah I'm not so sure this CTO is on the mark here, but to be fair, I do think some of this IRL long tail/edge case data is important for Waymo. The simulation software is super interesting to me - the real world can be so chaotic, and even if they could generate every possible real life case, there needs to be validation on whether the Waymo driver is responding in the optimal way. They certainly haven't solved this problem, you can see some of their growing pains in all of these articles - floods in Austin, more and more interactions with emergency vehicles that first responders seem to believe are getting worse, etc.

    Tesla on the other hand has billions of miles of data, yet because there is a limit to camera-only techniques, that data isn't that useful is it? They have no ground truth data to evaluate their camera system on, which is why sometimes you see those Teslas driving around with lidar rigs mounted on them. Going camera-only is just asking for trouble.

    • I agree real world data is important for Waymo. I didn't mean to say it wasn't, so I've edited my comment to reflect that. It's just that data is not some magic bullet to achieve self driving like Tesla and others suggest.

      Of course, Waymo still has much more room for improvement. But it's much more efficient to supplement less but higher quality IRL data with large amounts of synthetic data, than to run a million data collection vehicles 24x7 because most IRL data is boring and useless.

      Waymo said 6 years ago they simulate 20 million miles every single day [1]. Clearly, it's working for them given their scale of deployment right now.

      [1] https://waymo.com/blog/2020/04/off-road-but-not-offline--sim...


  • > The best example of why data collection isn’t the bottleneck is Tesla.

    Exactly. Plus, any delivery company or dashcam company can provide a bunch of data wherever there is any sizeable population.

    About 8 years ago, that data would have been really valuable, but now it's nice to have at best.

    The only thing that is valuable is the breadth of different cars, but even then it's not that much of a differentiator.

  • The biggest difference is that Uber has vehicles around the world. So there's more data from countries with different rules from the US. Signage is definitely different between the US and Europe.

  • I.. am amused by the confidence on display, but I can't say that I am not concerned that people are confidently stating that real world data is not useful, because it can be just simulated. One would think that, by now at least, we know that simulation is at best an imperfect copy.

    And I don't like the idea of even more data being harvested and used.. I just find the dismissal.. odd.

  • > The best example of why data collection isn’t the bottleneck is Tesla. They boast about billions of miles of data, yet they’re struggling to put out fully autonomous vehicles.

    Well, TBF, the Tesla data was complete garbage with earlier vehicles. They had cheap and somewhat bad cameras in the earlier vehicles that were only somewhat recently updated. And even then, I don't think Tesla is at the end of their hardware journey. I think they don't think that either, which is why they've gone to a subscription-only model for self-driving vehicles.

    Waymo, on the other hand, has gathered less data, but more high quality data. They do the expensive mapping of a city which is a big part of why their vehicles have early on been able to do some pretty impressive feats. The drawback is getting that high quality data takes a lot of time and resources.

    • > And even then, I don't think Tesla is at the end of their hardware journey.

      I dunno about that. Tesla seems completely adrift, pretending to pivot with random forays into humanoid robotics or whatever, to the point that I wouldn't be surprised if they exited the consumer vehicle space altogether within the next decade. They have no answer for Chinese competitors.


  • Didn't they need the data from the 200 million miles or so of actual driving before they could get to the generative model, though? Data isn't everything, as you point out with Tesla (mainly because they decided to forgo lidar, it would seem), but it is pretty fundamental.

  • "You can’t be the CTO of Uber wanting to do AVs, and get the data collection requirement shockingly wrong."

    Problem 1: Cost and privacy constraints limit data collection.

    Problem 2: It makes little sense to collect and store data that you already have. Yet while collecting, you don't know whether the data will turn out to be useful.

    Problem 3: P2P in urban settings fails at edge cases, which by definition are rarely captured.

    All of these problems limit AV scaling.

  • Waymo might very well be missing specific kinds of data (e.g. more incidents/accidents, near-collisions, etc.)

    Also, Uber’s data might be useful for eval, not training (e.g. « here is how Waymo would behave vs human drivers, therefore it is safer »)

    • > Waymo might very well be missing specific kinds of data (e.g more incidents/accidents, near-collisions etc)

      Accidents and near-collisions are exactly the kind of scenarios perfect for simulation. You don't test them out in the real world and risk injuries/deaths. You need to have confidence they're handled before you deploy.


  • I find the idea of learning from simulated data so unintuitive. How can you radically improve your model with just your model? I take it people do it, so it must work, but I just don’t understand it at all.

    • Well there's a world simulation model and then the driving model.

      You can imagine improving, e.g., a specialized math model (problem in, theorem out) with a normal LLM that knows lots of problems and theorems generally.

    • I think people are skipping over the fact that Google has had cars driving around taking photos for 20 years. I imagine that was used to build the world model in the first place.

    • They're two different models - you can use the world model to train (or test like Wayve) a different car-driving model.

      The world model is basically intended as a more true-to-life simulator.

I feel like they should have done this 6 years ago. Most AV companies already have tons of their own data today. But how would it work to install expensive LIDAR sensors on privately-owned vehicles?

  • FWIW, a large fraction of Uber drivers aren't actually driving their own personal cars, at least around me nowadays. They're either rented or some sort of fleet vehicle (complete with TCP #)

  • Exactly my take as well. This would have been the right diversification move a decade ago.

    Uber did invest early in self driving back in 2015, but in 2018 there was a fatality which pretty much deleted their whole program. And looks like it's taken them way too long to try picking it back up.

  • I was working at Lyft 8 years ago and suggested this to the head of the AV program then. They didn't listen.

  • > Most AV companies already have tons of their own data today.

    Real-world data spoils faster than a gas station banana.

    If your AV company is relying on data from six years ago, you're going to kill someone.

"Our goal is not to make money out of this data" is doing a lot of heavy lifting for a company that just committed $10B to robotaxis and is taking equity in the same AV companies it would be supplying. The actually interesting part isn't the sensor grid, that's years away and has real consent, compensation, and regulatory problems nobody's talking about. It's shadow mode: letting AV companies sim-run their models against millions of real Uber trips without putting a car on the road. That works today. That's the product. The sensor grid is the press release. Shadow mode is the business. Also completely absent from this article: do the drivers whose cars become "rolling data collection platforms" get anything? A cut? A notification? A commemorative badge? Uber has a rich history of finding creative ways to extract value from its driver network, so I'm sure they've thought carefully about this.

I'm old. Was anyone else's reaction to wonder what Uber was doing for audio-video companies?

The original title says "self-driving" and that's much more clear.

How useful are these generic sensor inputs for AVs? Like, how much more valuable is a Waymo’s data for a Waymo than something Uber collects?

I remember Travis Kalanick spouting the talking points about self-driving in 2017: that after Waymo, Tesla had the advantage because they had the best data, and that they were going to crack self-driving soon. Then I remember Dara scuttling Uber’s entire self-driving division in 2020.

Self-driving is possible, but it requires a massive sustained investment in custom hardware on the car, in real and simulation testing, in painstaking software development covering tens of thousands of scenarios, in realtime remote control failsafes, and in fleet management capabilities in every city. Waymo is the only company that comes close to the right approach. All these other Elons, GM, Uber CEOs are just jangling shiny objects in front of investors. A moonshot on the financial model for what are otherwise mature stagnant businesses.

Isn’t this a pivot? I always thought Uber wanted to automate their whole fleet instead of having to pay people.

  • Uber was working on self driving ten years ago. They had cars on the road loaded with cameras and sensors specifically to collect data. Then they negligently killed a woman crossing a street.

    This isn't a pivot, this is them trying to sheepishly reenter the race they were dramatically ejected from.

    • The main reason Uber sold its self-driving R&D unit was that it couldn't afford it. So it sold the unit to Aurora, taking a 25% stake in the company, and Uber's CEO joined Aurora's board. Aurora is still operating automated trucking: https://en.wikipedia.org/wiki/Aurora_Innovation?wprov=sfti1

      They run trucks for FedEx in Texas and want to offer an "Uber Freight network".

  • This exactly; self driving was a large part of their valuation iirc.

    • The part I don't fully understand is what leverage this gives Uber over anyone else? Uber doesn't have the fleet management, mechanics, cleaners or even storage facilities. They do have the most used taxi app, but that seems like a very small edge.

      There's nothing stopping the car makers from running their own taxi service, and they already have networks of mechanics and cleaners as well as some level of storage. They'd need to scale up, but they don't need to start from zero.

      Uber's success is in large part built on not having to own AND MANAGE their cars. With self-driving cars that advantage disappears, unless they're gambling that "drivers" will buy the cars and lease them to Uber.

I'm honestly surprised that Tesla never took advantage of all the cameras in all its cars to do some kind of mapping project. I always thought that was incredibly valuable data. Sort of an automatically crowd-sourced street view.

Correct. Smart move. Had I had the option to do the same for my car, I wouldn’t mind doing it (provided that I be compensated for it).

Interesting way to encourage competition for its competitor. A single, scaled self-driving company is a massive threat to Uber.

Will they pay the drivers for hosting the sensors?

Can the drivers charge a monthly rate for hosting the sensors?

  • If Waymo is going to license the software, it can be tremendously useful, in particular given the variety of Uber vehicles.

Uber wants to turn its drivers into a sensor grid for AV companies

Seems par for the course. Niantic turned legions of Pokemon Go players into unpaid sensor grids for delivery robots.

I asked an Uber driver, formerly a taxi driver in LA, how he felt about the fact that his driving data was being used to build his replacement.

He said he “didn’t care and besides what was he going to do about it anyway, it’s going to happen no matter what”

I asked if he had ever heard of collective bargaining or knew about unions and he said no.

I think we’re only about another generation before the only purpose for human labor is to train and check the outputs of a machine.

  • I'm not too worried about it. Yeah, it's bad that people don't understand how labor organizing works. It's bad they're not willing to stand up to shitty employers and take a little risk to make life better. But in this particular case the fear is totally illusory. It's just another silicon valley conman selling some warped "dream" that probably won't actually materialize[1]. "Autonomous Vehicles" are nowhere near production ready, and they're not going to be any time soon. Wake me up when a serious truck or car manufacturer starts rolling them out en masse, then I might start to get worried about it. Until then, it's just about the same category as flying cars--sure, we have these hexacopter contraptions which can (barely) lift a single person for 20min. Not interesting.

    [1] Here's how you know:

      “Our goal is not to make money out of this data,” Naga said. “We want to democratize it.”

    • > Wake me up when a serious truck or car manufacturer starts rolling them out en masse, then I might start to get worried about it

      I think not enough people have been in a Waymo to realise that the technology is basically here, and that we’re like 10 to 20 years of doubling away from AVs doing tens of millions of trips a day in America. By the time anyone has invested in true mass production of AVs, we’ll already be so far down that path that the policy deck will be dealt.
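
The doubling claim can be checked with a logarithm. A back-of-the-envelope sketch, where the starting figure of 50,000 trips a day is an assumed placeholder rather than a reported number, the 20 million target is one reading of "tens of millions", and `years_to_reach` is an illustrative helper:

```python
import math


def years_to_reach(current_per_day: float, target_per_day: float,
                   years_per_doubling: float) -> float:
    """Years needed to grow from current to target at a fixed doubling period."""
    if current_per_day <= 0 or target_per_day <= 0 or years_per_doubling <= 0:
        raise ValueError("all inputs must be positive")
    doublings = math.log2(target_per_day / current_per_day)
    # No time is needed if the target has already been reached.
    return max(0.0, doublings) * years_per_doubling


# Assumed starting point: 50,000 trips/day. Target: 20 million trips/day.
fast = years_to_reach(50_000, 20_000_000, years_per_doubling=1.0)
slow = years_to_reach(50_000, 20_000_000, years_per_doubling=2.0)
```

Under that assumption a 400x increase takes between eight and nine doublings, so doubling every one to two years lands in roughly the same range as the 10 to 20 years the comment cites.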


  • People don’t understand the slow motion horror movie that this is becoming. Labor demand has begotten population growth for all of human history. Demand conditions set the stage for population growth. Labor surpluses set the stage for population declines. Again, this has been true for all of human history.

    So what are we walking into? Not 8-11 billion happy cows. A crisis. People deciding not to reproduce. The human population declining. The irony as we achieve a technical pinnacle while justifying our own extinction by choice. The great filter as it turns out is actually capitalism, a race to business efficiency against all else including the incentives of your very own species. This is the mind virus.

    • Preface: I am personally NOT into anti-growth ideas, and I also think it’s super alarming that the West especially seems to be intent on wiping itself out by lack of having kids.

      But that said, supposing we are looking at 60 years from now having a few billion fewer people on Earth, just by attrition (lack of replacement) that is not automatically bad. We could afford to shrink in population - if there’s a floor to that contraction. If indeed there are way too many people in a decade for the available human jobs, then it could be the equilibrium is just a lower population. Which could be temporary - who knows what the future could bring, such as possible space colonization, which may need more humans and also give people the hope that I think Gen Y and Z have lacked, which is one reason for their low repro rates.


    • "People deciding not to reproduce."

      The complete destruction of the human through exploitation and control, as seen in the article, is a major reason people are too unhappy to start families.

      The worst part? Most people don't even know why, so there's never a general public reaction to fix it.

    • This whole argument depends on the supposition that if birth rates ever drop below replacement rates, then that inexorably implies the extinction of the species. Whether or not now is a good time, at some point growth has to stop. And there are plenty of conceivable social arrangements that are perfectly workable with a constant population size.

      The only real argument for continued growth is to preserve the current structure of investment. That's your great filter, and it will result in economic collapse, which isn't the same as extinction.

    • There’s no “becoming”

      It’s here and it’s been here for decades - it’s just finally impossible to ignore or wave away

      Gig workers are self-chattelizing because there is no floor to the depravity that society will accept, and an endless supply of people who will chattelize themselves for a moment of pleasure


  • > I asked if he had ever heard of collective bargaining or knew about unions and he said no.

    Collective bargaining and unions do not prevent technological progress, but merely retard it in the hopes that their members can benefit at the cost of progress for everyone else. Look at dock workers and how they tried to prevent automation with unions.

    • Dock workers are really the best example. We should have been automating at least a decade ago. I don’t know why folks would think unions or collective bargaining should be used to prevent automation. You will just lose on the medium to long term.

      Reminds me of when dockworkers resisted the shift to cargo containers. Those ports ultimately lost business in the end.

So once again the employees should bring the data to replace them

  • Isn't Uber also replacing itself? If you use your human drivers to train other companies' robot taxis, aren't you gonna ruin both the human driver service and the data collection service?

    • Shareholders won't like the plan of limiting growth. That's a really interesting point: what's the plan for when human drivers can no longer generate new data and they've already sold what they've managed to collect to all the car makers who want to buy?

      It sounds like a terrible business plan.