Comment by meander_water
8 hours ago
This looks like it uses Gemini Nano under the hood. But the latest Gemma4 E2B and E4B models appear to be much better, so you'd probably be better off deploying quantized versions through an extension for now.
- Gemini Nano-1: 46% MMLU, 1.8B
- Gemini Nano-2: 56% MMLU, 3.25B
- Gemma4 E2B: 60.0% MMLU, 2.3B
- Gemma4 E4B: 69.4% MMLU, 4.5B
Sources:
- https://huggingface.co/google/gemma-4-E2B-it
- https://android-developers.googleblog.com/2024/10/gemini-nan...
I no longer have any inside knowledge, but from my time on that team, they were very quick to get the latest small Google models into Chrome. I expect that if Gemma 4 (or its Gemini Nano equivalent) isn't already in Chrome, it will be soon.
Note that the article here was last updated 2025-09-21, and as of that time it was already on Gemini Nano 3.
Thanks for the insider info! Do you know if there are any published benchmarks for Nano 3?
Google will soon release Gemini Nano 4 based on Gemma 4: a "Fast" version based on Gemma 4 E2B and a "Full" version based on E4B.
https://android-developers.googleblog.com/2026/04/AI-Core-De...
> This looks like it uses Gemini Nano under the hood.
Yes; "With the Prompt API, you can send natural language requests to Gemini Nano in the browser."
The Prompt API uses whichever model is available in your browser; for Edge, I believe it's Phi4.
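For anyone curious what "send natural language requests to Gemini Nano in the browser" looks like in practice, here's a minimal sketch of using the Prompt API. It assumes the `LanguageModel` global that current Chrome builds expose (the exact names and availability states have shifted between releases, so treat this as illustrative, not authoritative):

```javascript
// Sketch of calling the browser's on-device model via the Prompt API.
// `LanguageModel` only exists in supporting browsers; the function
// returns null anywhere the API is absent or the model is unavailable.
async function askLocalModel(prompt) {
  if (typeof LanguageModel === "undefined") {
    return null; // Prompt API not exposed in this environment
  }
  // availability() reports e.g. "unavailable", "downloadable", "available"
  const availability = await LanguageModel.availability();
  if (availability === "unavailable") {
    return null;
  }
  const session = await LanguageModel.create();
  const reply = await session.prompt(prompt);
  session.destroy(); // free the on-device session when done
  return reply;
}
```

Note that the same call works regardless of which model sits underneath (Gemini Nano in Chrome, Phi4 in Edge per the comment above); the API deliberately doesn't let you pick the model.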