Comment by refulgentis

3 days ago

[flagged]

We are following Ollama's design, but not verbatim due to apps being sandboxed.

Phones are resource-constrained. We saw significant battery overhead with in-process HTTP listeners, so we stuck with simple stateful isolates in Flutter and are exploring a standalone server app that others can talk to for React.

For model sharing with the current setup:

iOS - We are working towards writing the model into an App Group container. It is tricky, but we are working around it.

Android - We are working towards prompting the user once for a SAF directory (e.g., /Download/llm_models), saving the model there, then publishing a ContentProvider URI for zero-copy reads.

We are already writing more mobile-friendly kernels and Tensors. GGML/GGUF is widely supported, so porting it is an easy way to get started and collect feedback, but we will completely move away from it in < 2 months.

Anything else you would like to know?

  • How does writing a model into an App Group container let your framework give an app a local LLM server that 3rd party apps can make calls to on iOS?[^1]

    How does writing a model into a shared directory on Android enable a local LLM server that 3rd party apps can make calls to?[^2]

    How does writing your own kernels get you off GGUF in 2 months? GGUF is a storage format. You use kernels to do things with the numbers you get from it.

    I thought GGUF was an advantage? Now it's something you're basically done using?

    I don't think you should continue this conversation. As easy as it is to get your work out there, it's just as easy to build a record of stretching the truth over and over again.

    Best of luck, and I mean it. Just, memento mori: be honest and humble along the way. This is something you will look back on in a year and grimace.

    [^1] App Group containers only work between apps signed under the same Apple developer account. Additionally, an App Group container is shared storage, not a way to provide APIs to other apps.

    [^2] SAF = Storage Access Framework. That is shared storage, not a way to provide APIs to other apps.
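
To make the "GGUF is a storage format" point concrete, here is a minimal Python sketch that reads only the fixed-size header of a GGUF file. It assumes the GGUF v2/v3 layout (little-endian: 4-byte magic, uint32 version, uint64 tensor count, uint64 metadata key/value count). The function name `read_gguf_header` and the synthetic header bytes are hypothetical, made up for illustration; nothing here comes from the framework under discussion. Note that no kernels are involved: this is just parsing bytes, and the compute code that consumes the tensors is a separate concern.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(stream):
    """Read the fixed-size GGUF header from a binary stream.

    Assumption: GGUF v2 or v3, little-endian, where both counts are uint64.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # uint32 version, uint64 tensor count, uint64 metadata key/value count
    version, tensor_count, metadata_kv_count = struct.unpack(
        "<IQQ", stream.read(20)
    )
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }


# Synthetic header built in memory so the sketch runs without a model file.
fake_header = struct.pack("<4sIQQ", GGUF_MAGIC, 3, 2, 5)
print(read_gguf_header(io.BytesIO(fake_header)))
# prints {'version': 3, 'tensor_count': 2, 'metadata_kv_count': 5}
```

With a real model, the same function could be called on a file opened with `open(path, "rb")`. Everything after the header (metadata pairs, tensor descriptors, tensor data) is likewise just stored data; what runs on those numbers is up to whichever kernels load them.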