Comment by scriptsmith
1 day ago
If Chrome has the #optimization-guide-on-device-model and #prompt-api-for-gemini-nano flags enabled (because it's part of an Origin Trial, an Early Stable release, or similar), then web pages have access to the new Prompt API, which lets any webpage initiate the (one-time) download of the ~2.7 GiB CPU or ~4.0 GiB GPU model via LanguageModel.create()
https://developer.chrome.com/docs/ai/prompt-api
When Chrome 148 releases tomorrow, this will be the default behaviour on desktop.
Before downloading, it should check for 22 GiB of free disk space on the volume where your Chrome data dir lives, and for at least double the model size free in your tmp dir.
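The download path described above can be sketched roughly like this (a hedged sketch, not Chrome's exact behaviour; `LanguageModel` is the global Chrome exposes for the Prompt API, and it's undefined everywhere else, so feature detection comes first):

```javascript
// Hedged sketch: how a page would check for, and possibly trigger, the
// one-time model download. Outside Chrome with the flags enabled,
// `LanguageModel` is undefined and this is a no-op.
async function maybeCreateSession() {
  if (typeof LanguageModel === "undefined") {
    return "Prompt API not available";
  }
  // availability() reports the state without starting a download:
  // "unavailable" | "downloadable" | "downloading" | "available"
  const state = await LanguageModel.availability();
  if (state === "unavailable") return "unsupported device";
  // create() is the call that kicks off the multi-GiB download if the
  // model isn't already on disk; the monitor surfaces progress (0..1).
  return LanguageModel.create({
    monitor(m) {
      m.addEventListener("downloadprogress", (e) => {
        console.log(`model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
}
```

Note that `availability()` lets a page ask whether the model is present without forcing the download, which is the polite thing to do on a metered connection.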
First the tabs came for the RAM and I did not protest, for I had plenty. Then they came for the chip and I did not protest, for it was dark silicon anyway. Then they came for the HDD.
And then they made the RAM and SSD so expensive :)
I am curious if it reuses the LLM across all tabs; it's hard to imagine most machines can spin up even 1-2 copies of any 4 GB model unless they're fairly powerful.
I think it obviously will, what would be the benefit to spinning up more than one copy?
1 reply →
Okay, but the browser is basically the computer for most people.
Told ya.
The more severe problem is that Google installs model weight files on a per-user basis, meaning Chrome occupies 4 more GB of space for every OS user on your device.
The company I work at has several environments and hundreds of VDI users in each environment. Chrome is the default browser in all of them. By my rough napkin math, this one small change by Google will eat up at least 15 terabytes of new disk space in total. (I sure hope we are using deduplication at the physical storage layer...)
It's fine. Network and disk space are free, right?
2 replies →
Shouldn't the filesystem be set to encrypt everything before it hits the physical storage layer?
Thankfully deduplication is a thing ;)
I certainly hope you don't automatically update.
1 reply →
For every profile.
Does each playwright (or similar automation system) count as a different user, and does it keep the model around ?
If yes, it's an interesting API to call when an AI crawler hits your website.
4 GB, $0.10 (whatever the HDD price): that's the equivalent of a high-school-level intelligent brain that can perform many cognitive tasks (and in the future even PhD-level intelligence), for free?
Oh, the horror!!!
Wait, let me pay my HVAC guy $500 he deserved because he came all the way from his home to replace a fuse
It doesn't make sense to apply wholesale prices for mass storage. People are running Chrome on specific devices that they already own. Storage is not fungible in this way.
If you’re pissed you had to pay your HVAC guy to drive to your house and do something you think is trivial, why didn’t you do it yourself?
1 reply →
> 4 GB, $0.10 (whatever the HDD price): that's the equivalent of a high-school-level intelligent brain that can perform many cognitive tasks for free?
This is better than my current solution, an actual human with master's-degree-level intelligence performing all my cognitive tasks for free, how? I mean, I'm the first to admit I'm extremely lazy, and even I'm over here like "really??"
> Wait, let me pay my HVAC guy $500 he deserved because he came all the way from his home to replace a fuse
Right, because it's totally something an LLM can do, right?
Here is your google brain on your device, whether you want it or not.
I don’t think you understand what “free” means
Tell that to Apple, I'm sure they will allow me to pay $0.025/GB for additional storage on my Macbook /s
2 replies →
You can already trigger a 2 GB model download with the Summarizer API[0], which is already shipped in Chrome.
[0]: https://developer.chrome.com/docs/ai/summarizer-api#model-do...
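For comparison, the Summarizer API follows the same feature-detect-then-create pattern; a minimal sketch (the `Summarizer` global only exists in Chrome, and the option values shown are the ones documented there):

```javascript
// Hedged sketch: the Summarizer API's first create() call is what
// triggers its ~2 GB model download. Feature-detect the `Summarizer`
// global first; outside Chrome it is undefined.
async function summarizeIfAvailable(text) {
  if (typeof Summarizer === "undefined") {
    return "Summarizer API not available";
  }
  // Options are documented as type: "key-points" | "tldr" | "teaser" |
  // "headline", plus format and length knobs.
  const summarizer = await Summarizer.create({
    type: "tldr",
    length: "short",
  });
  return summarizer.summarize(text);
}
```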
I think this is a distinct model from the Prompt API's, since the other shipped AI APIs use fine-tuned models.
Both of them say they use Gemini Nano.
So now we're up to 6 GB
Per user
The problem is that some of us are still on connections that charge per GB in rural areas. Here in Montana it's very common to pay about $0.25 per GB regardless of how much you use, so this is a $1 additional cost per desktop device. Places like public school districts have hundreds of computers and this will be somewhat significant for them.
I was thinking a similar thing. Many of our customers have single-purpose computers that rarely see wired internet but need a modern browser (many chose Chrome on their own; we never recommended it).
They're going to get blasted with cellular data charges when they fire up their computers in the field.
Google's updater service also currently ignores the Windows 11 metered-connection hint. It will gladly download that model over your cell connection even if you have a data cap.
This is infuriating behavior.
Silicon Valley must wake up and understand the entire world does not live like them.
It is a small model, so what utility can I / Google expect from it? What is the on-board model used for?
It's not a very good small model to be honest.
That said, you might be surprised to learn that some of the 3B-9B models could probably replace 80% of what non-vibe coders use ChatGPT for.
It's a good idea to run small models locally, for privacy and cost-saving reasons, if your computer can host them. But how can you trust Google to auto-install one on your machine in 2026? I just couldn't do it.
Sure, local models good and yes, there's no way we can trust Google.
We can be positive the entire motivation of Chrome is user-behavior surveillance. There's not a nano-chance in all the multiverses that the Chrome model is doing anything privately. They've gone to extraordinary lengths to accomplish this. It's not for free.
11 replies →
> But how can you trust Google to auto-install one on your machine
Why are AI models something I'd be uniquely unable to trust Google to install, compared to all the other code included in Chrome updates? Is your point just that you shouldn't trust Chrome in general?
7 replies →
All that matters is some MBA product manager at Google was celebrated for shipping this. Hooray!
4 replies →
> That said, you might be surprised to learn that some of the 3B-9B models could probably replace 80% of what non-vibe coders use ChatGPT for.
Really? I'm a total amateur at doing anything with local models, but I've tried a few in this range using Ollama, and they didn't seem to know much about anything, and I couldn't figure out how to get them to search the web or run other tools, so that was where the experiment ended.
A small local model that can use bash would be a bit of a game-changer for me.
3 replies →
Which is why I uninstalled Chrome a (short...) while ago and my life went on unbothered.
2 replies →
Half of the reason to use local AI is to circumvent the censorship that Google, OpenAI and so on have. I don't want this Google crap on my computer.
It's based on Gemma 3n, and it's not the best.
I find it works fine for simple classification, translation, interpretation of images & audio. It can write longer prose, but it's pretty bad.
It can also produce output constrained to a JSON schema or a regexp, for anything you might want to do with structured data.
I wonder why they’re using Gemma 3 and not Gemma 4?
3 replies →
I find models of this size (I haven't tested this one specifically) to be very good at simple data extraction from user input: things like parsing the date and time of an event from a description, or parsing a human-typed description of a repeating-event rule.
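That extraction use case combines nicely with the schema-constrained output mentioned a couple of comments up. A hedged sketch, using the Prompt API's documented `responseConstraint` option; the schema, prompt wording, and `extractEvent` helper are all illustrative:

```javascript
// Hedged sketch: extract an event's date and time from free text, with
// the model's output constrained to a JSON schema. Only works in Chrome
// with the Prompt API; returns null elsewhere.
const eventSchema = {
  type: "object",
  properties: {
    date: { type: "string", description: "ISO 8601 date, e.g. 2026-03-14" },
    time: { type: "string", description: "24h time, e.g. 18:30" },
  },
  required: ["date", "time"],
};

async function extractEvent(text) {
  if (typeof LanguageModel === "undefined") {
    return null; // not running in Chrome with the Prompt API
  }
  const session = await LanguageModel.create();
  const raw = await session.prompt(
    `Extract the event date and time from: "${text}"`,
    { responseConstraint: eventSchema },
  );
  // The constraint forces output that parses against the schema.
  return JSON.parse(raw);
}
```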
This is considered a large model. I think you might be surprised how many "small" models Chrome has already pulled down onto your disk.
But to answer your question, one of the services that uses a small model: PermissionsAIv4
""" Use the Permission Predictions Service and the AIv4 model to surface permission notification requests using a quieter UI when the likelihood of the user granting the permission is predicted to be low. Requires `Make Searches and Browsing Better` to be enabled. – Mac, Windows, Linux, ChromeOS, Android """
[dead]
I ran a fairly large production test of this and on _every_ measure except for privacy it was worse than a free tier server hosted LLM.
Not happy about that as I would like to see more local models but that's the current state of things.
https://sendcheckit.com/blog/ai-powered-subject-line-alterna...
> on _every_ measure except for privacy it was worse than a free tier server hosted LLM
Would you be able to compare this to other local models in its class and above that would fit on consumer-grade hardware?
> It is a small model, so what utility can I / Google expect from it?
Precedent for shipping models alongside consumer software.
Potentially without consent if it truly is a silent install.
Something to do with serving more ads. My guess is they will use this to “better target” or to drain more information from you for their ads.
Those two (and more) exist in chrome://flags in Chrome 147. I'm disabling them now, with the expectation that will prevent the new default.
One option I'm leaving as default is "Use LiteRT-LM runtime for on-device model service inference." Any comment on that?
I'm on Chrome 147 too and disabled:
"optimization-guide-on-device-model"
- Enables optimization guide on device
"prompt-api-for-gemini-nano"
- Prompt API for Gemini Nano
- Prompt API for Gemini Nano with Multimodal Input
and deleted weights.bin and the 2025.x folder in "OptGuideOnDeviceModel"
Will report if Chrome 148 downloads the model again.
If you touch those files into existence and chown to root and chmod to 0, it shouldn’t be able to ever overwrite them right?
3 replies →
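The trick suggested above might look like this; a hedged sketch with example Linux paths (the real profile location varies by OS and Chrome channel), and the root chown left commented out since it needs sudo. Caveat: Chrome owns the parent directory, so a determined updater could still delete and recreate the file.

```shell
# Hedged sketch: pre-create the model file and strip all permissions so
# Chrome, running as your user, can't open it for writing.
MODEL_DIR="${MODEL_DIR:-$HOME/.config/google-chrome/OptGuideOnDeviceModel}"
mkdir -p "$MODEL_DIR"
touch "$MODEL_DIR/weights.bin"
chmod 000 "$MODEL_DIR/weights.bin"
# As the parent comment suggests, chowning to root makes it harder for
# anything running as your user to chmod it back:
#   sudo chown root:root "$MODEL_DIR/weights.bin"
ls -l "$MODEL_DIR/weights.bin"
```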
Maybe I was on the wrong side of the early release, but I've deleted this model many times in the last year. I've had it for at least 12 months.
Thanks. Went to flags in Vivaldi and, just in case, disabled all flags containing "gemini" and the first five results for "model".
Those flags will exist already, but will default to enabled in 148.
That other flag is for using a different open-source inference engine to the (from what I can tell) closed-source one that's used by default.
[dead]
Searching about:flags for "model" comes up with a whole bunch:
#omnibox-ml-url-scoring-model
#omnibox-on-device-tail-suggestions
#optimization-guide-on-device-model
#text-safety-classifier
#prompt-api-for-gemini-nano
#writer-api-for-gemini-nano
#rewriter-api-for-gemini-nano
#proofreader-api-for-gemini-nano
#summarizer-api-for-gemini-nano
#on-device-model-litert-lm-backend
Then around Gemini, but not caught by the search for "model": #skills (maybe? I think this is implied by "Gemini in Chrome"?)
edit: I don't see a carte blanche AI-disabling option. As much as I dislike Mozilla's growing obsession with AI, at least they give me a top-level option to disable all AI stuff. I only keep Chrome around for occasional testing reasons.
So my understanding is that the download happens only when sites call the Prompt API, right?
Because my Chrome stable has been updated to v148 now, and I don't see any AI models in my user profile folder. My profile size is only 328 MB, with the Code Cache subfolder occupying the most space (135 MB).
In my understanding, yes. I wrote a blog post about some of the internals here: https://news.ycombinator.com/item?id=48028662
I wrote a more detailed blog post here:
https://news.ycombinator.com/item?id=48028662
Next step: Invoke the prompt API from within online ads and run a "p2p" AI inference provider which forwards incoming LLM queries to website visitors. :-)
This sounds perfectly reasonable. No objection from me.
Do you know if Chromium also has these flags enabled?
Depends on where you get it. By default the flags will be enabled, but some packagers may choose to disable them. I haven't seen a major distro release chromium 148 yet.
Weirdly though, chromium won't be able to actually use the model even though it can download it, because the inference engine is a closed-source blob.
https://adsm.dev/posts/prompt-api/#which-browsers-support-th...
I believe webpages that use the API must request access to the Prompt API from the user via a system permissions dialogue, according to the docs from a few months ago.
It can only be called after the user has interacted with the page, but there's no dialogue from the browser
https://developer.chrome.com/docs/ai/get-started#user-activa...
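In practice the user-activation requirement just means the first `create()` has to happen inside a gesture handler; a hedged sketch (the `#ask` button id and `wireUpPromptButton` helper are illustrative, and it's guarded so it's a no-op outside a browser):

```javascript
// Hedged sketch: there's no permission dialogue, but LanguageModel.create()
// must run with transient user activation (e.g. inside a click handler),
// not on page load.
function wireUpPromptButton() {
  if (typeof document === "undefined" || typeof LanguageModel === "undefined") {
    return false; // not a Chrome page context
  }
  document.querySelector("#ask").addEventListener("click", async () => {
    // A click grants transient user activation, so create() is allowed
    // to run here (and to download the model if it isn't present).
    const session = await LanguageModel.create();
    console.log(await session.prompt("Hello"));
  });
  return true;
}
```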
[dead]