Comment by maxloh
3 days ago
In my experience with Gemini, most of its capabilities stem from web searching instead of something it has already "learned." Even if you could obtain the model weights and run them locally, the quality of the output would likely drop significantly without that live data.
To really have local LLMs become "good enough for 99% of use cases," we are essentially dependent on Google's blessing to provide APIs for our local models. I don't think they have any interest in doing so.
I agree 100%. Often when I use increasingly powerful local models (qwen3.5:32b I love you) I mix in web search using search APIs from Brave, Perplexity, and DuckDuckGo summaries. Of course this requires that I use local models via small Python or Lisp scripts I write. I pay for the Lumo+ private chat service and it has excellent integrated search, like Gemini or ChatGPT.
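A minimal sketch of the glue such a script needs: once a search API (Brave, Perplexity, DuckDuckGo, whatever) returns snippets, you have to pack them into the local model's context without blowing the window. The function below is a hypothetical helper, not any particular API's client code; the prompt wording and the 4000-character budget are assumptions.

```python
def build_prompt(question, snippets, max_chars=4000):
    """Assemble a grounded prompt for a local model.

    `snippets` is a list of (title, text) pairs as returned by
    whatever search API you call upstream (placeholder shape,
    not a real client library).
    """
    context_parts = []
    used = 0
    for title, text in snippets:
        entry = f"## {title}\n{text.strip()}\n"
        if used + len(entry) > max_chars:
            break  # stay within the model's context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer using only the context below. "
        "Say 'not found' if the context is insufficient.\n\n"
        f"{context}\nQuestion: {question}\nAnswer:"
    )
```

The resulting string goes straight to the local model (via llama.cpp, Ollama, or similar); the "answer only from context" instruction is what keeps a small model from hallucinating past its search results.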
EDIT: I have also experimented with creating a local search index for the common tech web sites I get information from - this is a pain in the ass to maintain, but offers very low latency to add search context for local model use. This is most useful with very small and fast local models so the whole experience is low latency.
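For anyone curious what a low-latency local index can look like: SQLite's built-in FTS5 extension gets you full-text search with zero extra infrastructure. This is a sketch under my own assumptions, not the commenter's actual setup; swap `:memory:` for a file path to persist the index between sessions.

```python
import sqlite3

# Minimal local full-text index using SQLite's FTS5 extension,
# which ships with most Python builds of the sqlite3 module.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE pages USING fts5(url, title, body)")

def add_page(url, title, body):
    """Index one scraped page."""
    db.execute("INSERT INTO pages VALUES (?, ?, ?)", (url, title, body))

def search(query, limit=5):
    """Return (url, title) hits, best match first (bm25: lower is better)."""
    rows = db.execute(
        "SELECT url, title FROM pages WHERE pages MATCH ? "
        "ORDER BY bm25(pages) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

Query latency on an index like this is sub-millisecond for a few thousand documents, which is exactly what makes it pleasant to pair with small, fast local models.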
Interesting idea on the local search index! It occurs to me that running something that passively saves down content that I browse and things that AI turns up while it does its own searches, plus a little agent to curate/expand/enrich/update the index could be super handy. I imagine once it had docs on the stuff I use most frequently that even a small model would feel quite smart.
Yeah, I really like this idea too. I don't need the entire internet indexed, only the stuff I'm interested in. I can imagine a small agent I can task with "find out as much as you can about <subject>", and what it does is search the web, download the content, and index it for later retrieval. Then I can add a skill for the main agent to search the knowledge base if needed. Kind of like a RAG pipeline, but using agents to build a curated data source of stuff I'm interested in.
Nice idea, caching what you are already browsing.
That's totally not my experience. The AI component (as opposed to the knowledge component) is really what makes these models useful, and you could add search as a tool. Of course for that you'll be dependent on a search provider, that's true.
You don't get the AI component without the knowledge component. The AI needs approximate knowledge of lots of things to conceptualize what you're talking about and use search tools effectively.
The set of things it needs approximate knowledge over grows slowly but noticeably over time.
But the point is that past a certain number of neurons your AI will not get appreciably smarter, just more knowledgeable (and more costly). At least for the majority of users this will be true. The knowledge part can then be outsourced to search engines, to make it cheaper.
This is actually so ironic. Corporations spent fortunes designing cool websites, but what people really want is structured, easy-to-read information in the context they want.
So the flow is: you type a search query to Gemini, Gemini uses Google Search, scans a few results, goes to the selected websites, checks whether there is anything relevant, and then composes it into something structured, readable, and easy to ingest.
It's almost like going back to 90s browsing through forums, but this time Gemini is generating the equivalent of forum posts on the fly.
A long time ago (in AI time), Karpathy used the analogy that LLMs are like compression algorithms. I can see that now: when I ask an LLM a question, it's basically giving me back the whole internet compressed to the scope of my question.
Unless you can provide a (community) curated list of sources to search through (e.g. using MCP). Then I think local models may become really competitive.