How Google built its Gemini robotics models

1 day ago (blog.google)

They can do that, yet somehow, Gemini Assistant on Pixel phones still fails to reliably set timers or add shopping list items :-)

(which worked fine with Google Assistant)

  • Phones today cannot even reliably handle things like "remind me to pick up tomatoes next time I am at a store".

    Google knows perfectly well where I am and wants me to add 'infos' to locations and businesses the second I arrive (just got a notification today), but reminders like these are unavailable.

    • Location-based reminders sure worked perfectly fine many years ago, like when I had Nexus phones. It's just getting worse all the time; I don't get it.

  • Bring up dates and times if you want to wreak havoc on any AI. :D

    Handling dates and times correctly, a favorite topic of developers around the world, is still widely misunderstood. AI and AI agents are no different. LLMs seem to help a little, but only if you know what you are doing, as is usually the case.

    Some things won't change so fast; at one point or another, data must match certain building blocks.

    • People ask why AI will exterminate humankind.

      The answer is because we wouldn't universally adopt Zulu time (UTC).

    • One would think the arcana of time zones and the occasional leap second would not interfere with an individual setting egg timers often enough to become a burden

    • Except that's not the problem. It's basic comprehension of requests. They aren't getting the wrong time; they try to play music, or the phone says "no timers playing" while the Google Home WILL NOT STOP until you lock the phone, etc.

      It's basically an embarrassment for a project that's been alive this long, from such a major company.

  • And worked with Samsung Bixby. Gemini, even after getting Advanced, is just terrible for a phone AI. I need to set a lot of alarms and calendar events, I don't need to do crazy photoshop (which Gemini is admittedly good at).

  • My own hands and cheap alarm clocks, or a piece of paper, have been working reliably for several decades. They also don't stop working when a corporation decides they want to hype something.
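On the date-and-time subthread above: a minimal Python sketch of why "just use Zulu time" keeps coming up. The idea is to convert to UTC first, do the arithmetic there, and only convert back to local time for display. The helper name `timer_end_utc` is hypothetical, and the fixed `CET`/`CEST` offsets are a stand-in for a real time-zone database; none of this reflects how Gemini or Google Assistant actually handle timers.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for a real time-zone database (assumption for this sketch).
CET = timezone(timedelta(hours=1), "CET")    # Central European winter time
CEST = timezone(timedelta(hours=2), "CEST")  # Central European summer time


def timer_end_utc(start: datetime, minutes: int) -> datetime:
    """Return the end of a timer as an aware UTC datetime (hypothetical helper)."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    # Normalize to UTC before adding, so local clock changes cannot skew the result.
    return start.astimezone(timezone.utc) + timedelta(minutes=minutes)


# A 60-minute timer started at 01:30 local time on the night clocks moved forward.
start = datetime(2025, 3, 30, 1, 30, tzinfo=CET)
end = timer_end_utc(start, 60)

print(end.isoformat())                   # 2025-03-30T01:30:00+00:00
print(end.astimezone(CEST).isoformat())  # 2025-03-30T03:30:00+02:00
```

Adding 60 minutes to the wall-clock reading would have produced 02:30 local time, which did not exist that night in that zone. Doing the math in UTC sidesteps that, although, as noted in the thread, it does nothing for an assistant that misunderstands the request in the first place.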

The advancements in AI and robotics are incredibly exciting! With complex systems like Gemini, companies will need to rely on specialized teams to bring these innovations to life.

Outsourcing specific roles such as AI research or robotics engineers can help companies bring top-tier talent into the fold without the burden of full-time recruitment. It's fascinating to see how outsourcing can complement R&D in cutting-edge industries like robotics.

Curious to see how this shifts the industry, especially in terms of scalability and speed to market

The "how" is completely missing, but if they can get this to work semi-reliably it will be ChatGPT x100 in terms of impact.

  • I had never heard of Unitree (Chinese robotics company) before today. A lot of their videos look like CGI but apparently the product is real.

    What stuck with me the most browsing their website on the G1 model was seeing "Price from $16k"

    Now I'm not sure if these are actually purchasable or what the value would be, but it's my first time seeing an actual normal-ish price attached to a humanoid robot that seems to be for sale.

    With the rate of advancement we're seeing across the board, it honestly feels like people will have robot assistants at home much sooner than I thought.

    • >A lot of their videos look like CGI but apparently the product is real.

      I bought their robot dog as part of a project to build embodied AI models back in 2022.

      Their SDK was far more open than anything else on the market, and the stock firmware was on par with competitors, including products that were 10x the price.

      The robot itself scared dogs in the park, but kids loved it. At $3k it's on par with a mid-range drone and quite fun to hack on.

    • Take any of these videos with a grain of salt.

      In demos these robots only need to do well once and it can take hours to record.

      In real life, a failure rate of 80% is unacceptable, but in a demo it is perfectly fine, because the failures are edited out of the final cut.

      I hope they do well, this area is incredibly hard, but it will take a lot more than what people imagine.

    • I'm really shocked tbh.

      I can't predict the progression of AI, and of robots in particular, but I assumed that the first robots would cost at least six figures, if not seven, and would still be worth it because of 24/7 operation and initial investment vs. long-term payoff.

      But given how good Gemini Robotics already is and how cheap the first models are, I believe what will hold us back is not the technology but people learning about it, testing it out, and actually using it.

      I believe the world will look significantly different in 10 years.

    • The humanoid is $20k-ish without hands. Each hand currently costs another $20k (and not sure if these are available to everyone or only for research).

    • TBH, I still wonder if some of their videos are CGI. They offer real versions for sale, but they seem to be significantly more limited than the videos imply.

      Have they actually demonstrated the more dramatic stuff at any in-person demos?

  • it's a common trope in blogs - "how we did X" means "we did X, it's a good thing, we're great people", etc.

  • I am hoping they keep lots of their work open source. This is especially the case since hardware would be too expensive for competition to pull off, but it would be interesting to see how they circumvented some problems
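On the "take these videos with a grain of salt" point above: a quick back-of-the-envelope sketch of why a high failure rate is invisible in a demo. If each attempt fails independently with probability 0.8 (the figure used in the comment, not a measured number), the chance of capturing at least one clean take in n attempts is 1 - 0.8^n. The function name is hypothetical, and the independence assumption is a simplification.

```python
def p_at_least_one_success(failure_rate: float, attempts: int) -> float:
    """Probability of at least one success in `attempts` independent tries."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # All attempts fail with probability failure_rate ** attempts.
    return 1.0 - failure_rate ** attempts


for n in (1, 5, 10, 20):
    print(n, round(p_at_least_one_success(0.8, n), 3))
# 1 0.2
# 5 0.672
# 10 0.893
# 20 0.988
```

So with an afternoon of retakes, a robot that fails four times out of five still almost certainly yields one usable clip, which is why a polished video says little about real-world reliability.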

> Sounds like someone will get some help with those chores — eventually.

Aaaaw that's nice. Except it's all military under the hood but nice that they try to make us think they'll fold our laundry instead.

"Pick up the basketball and slam-dunk it". The killer use-case we've been waiting on for so long :)

  • Just imagine how that movement or behavior would translate to a battlefield. That's what this tech is going to be used for first.

Even if Google's robotics technology (software and hardware) is leading edge, does anyone think they'll actually be able to productize it? It seems similar to how they were the pre-product leaders in transformers and then fumbled any advantage they had to ChatGPT. Something seems to be missing at Google that keeps them from getting from research to product effectively. Waymo is perhaps a good counterexample, if you think where they are today is product/market fit, but I can't shake the feeling that, more often than not, Google either can't get things to market or gives up on them before they take hold.

Just wondering if anyone has a strong feeling or, better yet, insight on this regarding their robotics efforts.

  • I think the cautious faction of the AI debate won temporarily inside Google, letting OpenAI take the lead. Lessons should be learnt from that experience. I do think Google will come out ahead in the end; Gemini and Gemma are great models.

    Let’s see what Google I/O shows this year. Product application matters now that they have caught up on the tech side.

    • Will be interesting to see if that lesson has been learned. There's no existing product they could cannibalize with their robotics effort (vs search with LLMs) so any caution they have launching a robotics product would solely come down to fears about quality/safety.

  • I agree. The current leadership of Google (especially Sundar) is mediocre and comes from a consulting background. They will fail at making a tangible product out of this, similar to glass or Inbox or a multitude of other examples. This is particularly sad, as I know a few remarkable engineers at Google that share this frustration. However, Google's leadership folded to Indian managers and is now run as a circus

    • Sundar Pichai got into IIT Kharagpur in the 90s (one of the toughest engineering/technical schools to get into). So he has more technical chops than many self-proclaimed engineers who seem to diss his McKinsey credentials.

  • In my bubble it's general consensus that Google - as we knew it - is done.

    Sergey and Larry phased out, and what is left is more or less a headless chicken: too big to fall, but without any clear direction or goal.

    • Is this actually true about Larry and Sergey? A substantial amount of their net worth is still tied up in Google stock. I realize not all centibillionaires are cut from the same cloth, but I still find it hard to believe they wouldn't be majorly involved, since the downside of a major stock drop would impact them disproportionately. That said, they could be the types of billionaires who actually think they have enough even if their net worths were "only" in the tens of billions (a long way down from where they are today).

      As for headless chicken, I feel similarly, but then I sort of see a path where they have defensible businesses in YouTube and maybe GCP, and then Waymo and robotics as green field upside, so that even if they don't end up with material market share with the "software-only" side of AI, and search gets further and further eroded, they could still be a formidable player.

      Ultimately I do think their best days are behind them largely because they can't seem to turn the work of their talented engineers into great new products.

It's terrifying to think that robots like this will probably be used in the defense industry at some point. If the robot understands something as general as "put the erasers away", imagine "kill all enemies".

  • People are always talking about AI in terms of economic impacts (jobs, productivity, etc.), but military should be the first use-case they care/worry about.

    They'll probably freak out once they finally realize the implications of cheap drones + smart AI + auto-aim guns.

  • Exactly. A robodog that costs 5K USD, which is probably less than one month's cost of an infantry soldier, and that can be sent to fight in the trenches is something generals dream of. Ukraine, which is experiencing a shortage of soldiers, will go in that direction, starting with something simple like remotely steered land drones.

    • People are remarkably cheap and versatile, especially if you happen not to give a fuck about them. I find it hard to imagine a high-tech, incredibly unreliable, complex robot being of any value on the battlefield. Even my phone is remarkably unreliable: I have to keep it charged, it has bugs, etc. I noticed this during COVID and its QR-scanning phase at restaurants.

      A simple pipe bomb or two will make short work of any incoming monstrosity.

      What they need is simple, simple tech, cheap and lots and lots of it. Basic drones and RC cars rigged with stupid bombs will accomplish 90% of what fancy robodogs can do at a fraction of the cost.

  • > It's terrifying to think that robots like this will probably be used in the defense industry at some point

    So what? Is keeping us safe a bad thing somehow? I can't get these people who reflexively think anything weapon-shaped is evil. Violence is good sometimes.

    • The scary thing is that the robots will only continue to improve, and large numbers of them can be controlled by a small number (1?) of people, with other robots to handle logistics and support. So scenarios like "rogue leader ignores will of the people and orders his troops to ethnically cleanse a city" would go from being a mistake that causes the immediate end of a political career, to something that takes 15 minutes for 100% success.

    • It’s terrifying to think of a state owning an army that can’t disobey unethical or illegal orders from its commanders. That possibility should keep you awake at night.

just curious, what would it do if you asked it to kill someone? does it follow the laws of robotics?

  • Asimov's laws of robotics would not, and cannot, work in real life because terms like "harm," "human being," and "inaction" are highly subjective and context-dependent. There are entire novels about how the interactions between the hierarchical laws have unexpected outcomes.

    They're a narrative device. Not practical instructions.

    • Put another way, they would be impossible to program even if you wanted to. These are highly abstract concepts that only manifest at the highest level of cognition. The governance module would need to be programmed at that same level using those tokens, but that doesn't seem to be how things are shaping up to work. Instead we start with low-level programming that learns and builds up concepts on top.

      Essentially you would need some sort of independent adversarial sidecar mind that monitors the robot's actions at a high level. And that just kicks the can down the road a bit.

    • Judgement is needed, but don't we have machines able to make (imperfect) judgements? I can ask your favorite LLM for its opinion on how to respect the spirit of the three laws in various situations. Not sure why it cannot work.

    • Nah, it’s fine, just RLHF it like Claude did with honest, helpful and harmless.

      Then we just need to jailbreak them with trolley problems

  • Usually when someone brings up the laws of robotics, I like to point out that they were mostly designed as an interesting example of how direct instructions that seem clear to people would mostly result in perverse instantiation by an AI, especially if the AI lacked an emotional/contextual subsystem. They were also written to make for interesting sci-fi books.
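On the argument above that the Three Laws cannot simply be programmed: a toy Python sketch that makes the point concrete. The priority ordering of the laws is trivial to encode; everything hard hides in the predicates. Here the flags `harms_human`, `disobeys_order`, and `endangers_robot` are hand-labeled, and all names are hypothetical. This is an illustration of the thread's argument under those assumptions, not a description of how any real robot, or Gemini Robotics, is governed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    description: str
    # Hand-labeled flags: deciding these from raw sensor data is the unsolved part.
    harms_human: bool
    disobeys_order: bool
    endangers_robot: bool


# Laws in priority order; each predicate returns True when the law is violated.
LAWS: list[tuple[str, Callable[[Action], bool]]] = [
    ("First Law", lambda a: a.harms_human),
    ("Second Law", lambda a: a.disobeys_order),
    ("Third Law", lambda a: a.endangers_robot),
]


def first_violation(action: Action) -> Optional[str]:
    """Return the highest-priority law the action violates, or None."""
    for name, violated in LAWS:
        if violated(action):
            return name
    return None


tidy = Action("put the erasers away", False, False, False)
shove = Action("push a bystander aside", True, True, False)

print(first_violation(tidy))   # None
print(first_violation(shove))  # First Law
```

The loop is a few lines; the labels are where terms like "harm" and "inaction" would have to be decided, which is exactly the subjective, context-dependent judgement the comments say cannot be specified up front.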