Comment by jeroenhd
2 hours ago
Google's Gemini features differ massively per region. There's a good chance privacy laws prevent Google from providing me with the same Gemini you use.
Object detection is mediocre at best. Circling things and using their AI editing features works, but the artefacts confuse Lens and other image parsing systems. Extracting objects from images usually works, but it's not on par with what Apple had long before Google built it.
The difference remains that the Gemini app on Android requires activation. You cannot tap a button or click a link while you're on the Gemini screen.
The video isn't on the linked page anymore, but it's here: https://deepmind.google/blog/ai-pointer/ and here: https://www.youtube.com/watch?v=pZNzfQLgGsA
It's an absolute privacy nightmare for most people, but if we ever get enough RAM and compute to run this stuff locally, I think this could actually create a new paradigm for user interaction: something with Lisp machine self-customisability, but for people who don't know anything about computers.
And if it doesn't work, it'll be the most horrific, messy, useless UI humanity has ever invented, and we all get a funny new meme about Google to laugh at. Win-win!