Comment by K0balt

17 days ago

It absolutely can be pointed to any standard endpoint, either cloud or local.

It’s far better for most users to be able to specify an inference server (even on localhost in some cases) because the ecosystem of specialized inference servers and models is a constantly evolving target.

If you write this kind of software, you will not only be reinventing the wheel but also probably disadvantaging your users if you try to integrate your own inference engine instead of focusing on your agentic tooling. Ollama, vllm, hugging face, and others are devoting their focus to the servers, there is no reason to sacrifice the front end tooling effort to duplicate their work.

Besides that, most users will not be able to run the better models on their daily driver, and will have a separate machine for inference or be running inference in private or rented cloud, or even over public API.

5 comments

K0balt

backscratches 17 days ago

It is not local first. Local is not the primary use case. The name is misleading to the point I almost didn't click because I do not run local models.

K0balt 17 days ago
I think the author is using local-first as in “your files stay local, and the framework is compatible with on-prem infra”. Aside from not storing your docs and data with a cloud service though, it’s very usable with cloud inference providers, so I can see your point.
Maybe the author should have specified that capability, even though it seems redundant, since local-first implies local capability but also cloud compatibility, or it would be local or local-only.
- backscratches 17 days ago
  
  It's called "LocalGPT". It's a bad name.
  
  2 replies →