Comment by scriptsmith
1 day ago
If Chrome has the #optimization-guide-on-device-model and #prompt-api-for-gemini-nano flags enabled (because it's part of an Origin Trial, an Early Stable release, or similar), then web pages have access to the new Prompt API, which lets any webpage initiate the (one-time) download of the ~2.7 GiB CPU or ~4.0 GiB GPU model via LanguageModel.create()
https://developer.chrome.com/docs/ai/prompt-api
When Chrome 148 releases tomorrow, this will be the default behaviour on desktop.
Before downloading, it should check for 22 GiB of free disk space on the volume where your Chrome data dir lives, and for at least double the model size free in your tmp dir.
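The download path described above can be sketched roughly like this (a hedged sketch, not Chrome's exact behaviour; `LanguageModel` is the global Chrome exposes for the Prompt API, and it's undefined everywhere else, so feature detection comes first):

```javascript
// Hedged sketch: how a page would check for, and possibly trigger, the
// one-time model download. Outside Chrome with the flags enabled,
// `LanguageModel` is undefined and this is a no-op.
async function maybeCreateSession() {
  if (typeof LanguageModel === "undefined") {
    return "Prompt API not available";
  }
  // availability() reports the state without starting a download:
  // "unavailable" | "downloadable" | "downloading" | "available"
  const state = await LanguageModel.availability();
  if (state === "unavailable") return "unsupported device";
  // create() is the call that kicks off the multi-GiB download if the
  // model isn't already on disk; the monitor surfaces progress (0..1).
  return LanguageModel.create({
    monitor(m) {
      m.addEventListener("downloadprogress", (e) => {
        console.log(`model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
}
```

Note that `availability()` lets a page ask whether the model is present without forcing the download, which is the polite thing to do on a metered connection.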
First the tabs came for the RAM and I did not protest, for I had plenty. Then they came for the chip and I did not protest, for it was dark silicon anyway. Then they came for the HDD.
And then they made the RAM and SSD so expensive :)
I am curious if it reuses the LLM across all tabs; it's hard to imagine most machines can spin up even 1-2 copies of any 4 GB model unless they're fairly powerful.
I think it obviously will, what would be the benefit to spinning up more than one copy?
1 reply →
Okay, but the browser is basically the computer for most people.
Told ya.
The more severe problem is that Google installs model weight files on a per-user basis, meaning Chrome occupies 4 more GB of space for every OS user on your device.
The company I work at has several environments and hundreds of VDI users in each environment. Chrome is the default browser in all of them. By my rough napkin math, this one small change by Google will eat up at least 15 terabytes of new disk space in total. (I sure hope we are using deduplication at the physical storage layer...)
It's fine. Network and disk space are free, right?
2 replies →
Shouldn't the filesystem be set to encrypt everything before it hits the physical storage layer?
Thankfully deduplication is a thing ;)
I certainly hope you don't automatically update.
1 reply →
For every profile.
Does each playwright (or similar automation system) count as a different user, and does it keep the model around ?
If yes, it's an interesting API to call when an AI crawler hits your website.
4 GB, $0.10 (whatever the HDD price): that's the equivalent of a high-school-level intelligent brain that can perform many cognitive tasks (and in the future even PhD-level intelligence), for free?
Oh, the horror!!!
Wait, let me pay my HVAC guy $500 he deserved because he came all the way from his home to replace a fuse
It doesn't make sense to apply wholesale prices for mass storage. People are running Chrome on specific devices that they already own. Storage is not fungible in this way.
If you’re pissed you had to pay your HVAC guy to drive to your house and do something you think is trivial, why didn’t you do it yourself?
1 reply →
> 4 GB, $0.10 (whatever the HDD price): that's the equivalent of a high-school-level intelligent brain that can perform many cognitive tasks for free?
This is better than my current solution, an actual human with master's-degree-level intelligence performing all my cognitive tasks for free, how? I mean, I'm the first to admit I'm extremely lazy, and even I'm over here like "really??"
> Wait, let me pay my HVAC guy $500 he deserved because he came all the way from his home to replace a fuse
Right, because it's totally something an LLM can do, right?
Here is your google brain on your device, whether you want it or not.
I don’t think you understand what “free” means
Tell that to Apple, I'm sure they will allow me to pay $0.025/GB for additional storage on my Macbook /s
2 replies →
You can already trigger a 2 GB model download with the Summarizer API[0], which is already shipped in Chrome.
[0]: https://developer.chrome.com/docs/ai/summarizer-api#model-do...
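For comparison, the Summarizer API follows the same feature-detect-then-create pattern; a minimal sketch (the `Summarizer` global only exists in Chrome, and the option values shown are the ones documented there):

```javascript
// Hedged sketch: the Summarizer API's first create() call is what
// triggers its ~2 GB model download. Feature-detect the `Summarizer`
// global first; outside Chrome it is undefined.
async function summarizeIfAvailable(text) {
  if (typeof Summarizer === "undefined") {
    return "Summarizer API not available";
  }
  // Options are documented as type: "key-points" | "tldr" | "teaser" |
  // "headline", plus format and length knobs.
  const summarizer = await Summarizer.create({
    type: "tldr",
    length: "short",
  });
  return summarizer.summarize(text);
}
```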
I think this is a distinct model from the Prompt API's, since the other shipped AI APIs use fine-tuned models.
Both of them say they use Gemini Nano.
So now we're up to 6 GB
Per user
The problem is that some of us are still on connections that charge per GB in rural areas. Here in Montana it's very common to pay about $0.25 per GB regardless of how much you use, so this is a $1 additional cost per desktop device. Places like public school districts have hundreds of computers and this will be somewhat significant for them.
I was thinking a similar thing. Many of our customers have single-purpose computers that rarely see wired internet but need a modern browser (many chose Chrome on their own; we never recommended it).
They're going to get blasted with cellular data charges when they fire up their computers in the field.
Google's updater service also currently ignores the Windows 11 metered-connection hint. It will gladly download that model over your cell connection even if you have a data cap.
This is infuriating behavior.
Silicon Valley must wake up and understand the entire world does not live like them.
It is a small model, so what utility can I / Google expect from it? What is the on-board model used for?
It's not a very good small model to be honest.
That said, you might be surprised to learn that some of the 3B-9B models could probably replace 80% of what non-vibe coders use ChatGPT for.
It's a good idea to run small models locally, for privacy and cost-saving reasons, if your computer can host them. But how can you trust Google to auto-install one on your machine in 2026? I just couldn't do it.
Sure, local models good and yes, there's no way we can trust Google.
We can be positive the entire motivation of Chrome is user-behavior surveillance. There's not a nano-chance in all the multiverses that the Chrome model is doing anything privately. They've gone to extraordinary lengths to accomplish this. It's not for free.
11 replies →
> But how can you trust Google to auto-install one on your machine
Why are AI models something I'd be uniquely unable to trust Google to install, compared to all the other code included in Chrome updates? Is your point just that you shouldn't trust Chrome in general?
7 replies →
All that matters is some MBA product manager at Google was celebrated for shipping this. Hooray!
4 replies →
> That said, you might be surprised to learn that some of the 3B-9B models could probably replace 80% of what non-vibe coders use ChatGPT for.
Really? I'm a total amateur at doing anything with local models, but I've tried a few in this range using Ollama, and they didn't seem to know much about anything, and I couldn't figure out how to get them to search the web or run other tools, so that was where the experiment ended.
A small local model that can use bash would be a bit of a game-changer for me.
3 replies →
Which is why I uninstalled Chrome a (short...) while ago and my life went on unbothered.
2 replies →
Half of the reason to use local AI is to circumvent the censorship that Google, OpenAI and so on have. I don't want this Google crap on my computer.
It's based on Gemma 3n, and it's not the best.
I find it works fine for simple classification, translation, interpretation of images & audio. It can write longer prose, but it's pretty bad.
It can also produce output constrained to a JSON schema or a regexp, for anything you might want to do with structured data.
I wonder why they’re using Gemma 3 and not Gemma 4?
3 replies →
I find models of this size (I haven't tested this one specifically) to be very good at simple data extraction from user input: things like parsing the date and time of an event from a description, or parsing a human-typed description of a repeating-event rule.
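That extraction use case combines nicely with the schema-constrained output mentioned a couple of comments up. A hedged sketch, using the Prompt API's documented `responseConstraint` option; the schema, prompt wording, and `extractEvent` helper are all illustrative:

```javascript
// Hedged sketch: extract an event's date and time from free text, with
// the model's output constrained to a JSON schema. Only works in Chrome
// with the Prompt API; returns null elsewhere.
const eventSchema = {
  type: "object",
  properties: {
    date: { type: "string", description: "ISO 8601 date, e.g. 2026-03-14" },
    time: { type: "string", description: "24h time, e.g. 18:30" },
  },
  required: ["date", "time"],
};

async function extractEvent(text) {
  if (typeof LanguageModel === "undefined") {
    return null; // not running in Chrome with the Prompt API
  }
  const session = await LanguageModel.create();
  const raw = await session.prompt(
    `Extract the event date and time from: "${text}"`,
    { responseConstraint: eventSchema },
  );
  // The constraint forces output that parses against the schema.
  return JSON.parse(raw);
}
```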
This is considered a large model. I think you might be surprised how many "small" models Chrome has already pulled down onto your disk.
But to answer your question, one of the services that uses a small model: PermissionsAIv4
""" Use the Permission Predictions Service and the AIv4 model to surface permission notification requests using a quieter UI when the likelihood of the user granting the permission is predicted to be low. Requires `Make Searches and Browsing Better` to be enabled. – Mac, Windows, Linux, ChromeOS, Android """
[dead]
I ran a fairly large production test of this and on _every_ measure except for privacy it was worse than a free tier server hosted LLM.
Not happy about that as I would like to see more local models but that's the current state of things.
https://sendcheckit.com/blog/ai-powered-subject-line-alterna...
> on _every_ measure except for privacy it was worse than a free tier server hosted LLM
Would you be able to compare this to other local models in its class and above that would fit on consumer-grade hardware?
> It is a small model, so what utility can I / Google expect from it?
Precedent for shipping models alongside consumer software.
Potentially without consent if it truly is a silent install.
Something to do with serving more ads. My guess is they will use this to “better target” or to drain more information from you for their ads.
Those two (and more) exist in chrome://flags in Chrome 147. I'm disabling them now, with the expectation that will prevent the new default.
One option I'm leaving as default is "Use LiteRT-LM runtime for on-device model service inference." Any comment on that?
I'm on Chrome 147 too and disabled:
"optimization-guide-on-device-model"
- Enables optimization guide on device
"prompt-api-for-gemini-nano"
- Prompt API for Gemini Nano
- Prompt API for Gemini Nano with Multimodal Input
and deleted weights.bin and the 2025.x folder in "OptGuideOnDeviceModel"
Will report if Chrome 148 downloads the model again.
If you touch those files into existence and chown to root and chmod to 0, it shouldn’t be able to ever overwrite them right?
3 replies →
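The trick suggested above might look like this; a hedged sketch with example Linux paths (the real profile location varies by OS and Chrome channel), and the root chown left commented out since it needs sudo. Caveat: Chrome owns the parent directory, so a determined updater could still delete and recreate the file.

```shell
# Hedged sketch: pre-create the model file and strip all permissions so
# Chrome, running as your user, can't open it for writing.
MODEL_DIR="${MODEL_DIR:-$HOME/.config/google-chrome/OptGuideOnDeviceModel}"
mkdir -p "$MODEL_DIR"
touch "$MODEL_DIR/weights.bin"
chmod 000 "$MODEL_DIR/weights.bin"
# As the parent comment suggests, chowning to root makes it harder for
# anything running as your user to chmod it back:
#   sudo chown root:root "$MODEL_DIR/weights.bin"
ls -l "$MODEL_DIR/weights.bin"
```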
Maybe I was on the wrong side of the early release, but I've deleted this model many times in the last year. I've had it for at least 12 months.
Thanks. Went to flags in Vivaldi and, just in case, disabled all flags containing "gemini" and the first five results for "model".
Those flags will exist already, but will default to enabled in 148.
That other flag is for using a different open-source inference engine to the (from what I can tell) closed-source one that's used by default.
[dead]
Searching about:flags for "model" comes up with a whole bunch:
#omnibox-ml-url-scoring-model
#omnibox-on-device-tail-suggestions
#optimization-guide-on-device-model
#text-safety-classifier
#prompt-api-for-gemini-nano
#writer-api-for-gemini-nano
#rewriter-api-for-gemini-nano
#proofreader-api-for-gemini-nano
#summarizer-api-for-gemini-nano
#on-device-model-litert-lm-backend
Then around Gemini, but not caught by the search for "model": #skills (maybe? I think this is implied by "Gemini in Chrome"?)
edit: I don't see a carte blanche AI-disabling option. As much as I dislike Mozilla's growing obsession with AI, at least they give me a top-level option to disable all AI stuff. I only keep Chrome around for occasional testing reasons.
So my understanding is that the download happens only when sites call the Prompt API, right?
Because my Chrome stable has been updated to v148 now, and I don't see any AI models in my user profile folder. My profile size is only 328 MB, with the Code Cache subfolder occupying the most space (135 MB).
In my understanding, yes. I wrote a blog post about some of the internals here: https://news.ycombinator.com/item?id=48028662
I wrote a more detailed blog post here:
https://news.ycombinator.com/item?id=48028662
Next step: Invoke the prompt API from within online ads and run a "p2p" AI inference provider which forwards incoming LLM queries to website visitors. :-)
This sounds perfectly reasonable. No objection from me.
Do you know if Chromium also has these flags enabled?
Depends on where you get it. By default the flags will be enabled, but some packagers may choose to disable them. I haven't seen a major distro release chromium 148 yet.
Weirdly though, chromium won't be able to actually use the model even though it can download it, because the inference engine is a closed-source blob.
https://adsm.dev/posts/prompt-api/#which-browsers-support-th...
I believe webpages that use the API must request access to the Prompt API from the user via a system permissions dialogue, according to the docs from a few months ago.
It can only be called after the user has interacted with the page, but there's no dialogue from the browser
https://developer.chrome.com/docs/ai/get-started#user-activa...
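In practice the user-activation requirement just means the first `create()` has to happen inside a gesture handler; a hedged sketch (the `#ask` button id and `wireUpPromptButton` helper are illustrative, and it's guarded so it's a no-op outside a browser):

```javascript
// Hedged sketch: there's no permission dialogue, but LanguageModel.create()
// must run with transient user activation (e.g. inside a click handler),
// not on page load.
function wireUpPromptButton() {
  if (typeof document === "undefined" || typeof LanguageModel === "undefined") {
    return false; // not a Chrome page context
  }
  document.querySelector("#ask").addEventListener("click", async () => {
    // A click grants transient user activation, so create() is allowed
    // to run here (and to download the model if it isn't present).
    const session = await LanguageModel.create();
    console.log(await session.prompt("Hello"));
  });
  return true;
}
```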
[dead]