Comment by rshemet

3 days ago

Thanks for the feedback. You're right to point out that Google AI Edge is cross-platform and more flexible than our phrasing suggested.

The core distinction is in the ecosystem: Google AI Edge runs TFLite models, whereas Cactus is built for GGUF. This is a critical difference for developers who want to use the latest open-source models.

One major outcome of this is model availability. New open-source models are released in GGUF format almost immediately. Finding them in TFLite, or reliably converting them to it, is often a pain. With Cactus, you can run new GGUF models on the day they drop on Hugging Face.

Quantization level also plays a role. GGUF has mature support for quantization far below 8-bit. This is effectively essential for mobile. Sub-8-bit support in TFLite is still highly experimental and not broadly applicable.
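
To put rough numbers on why bit width matters on a phone, here is a back-of-the-envelope sketch. The 3-billion-parameter model size is a hypothetical chosen for illustration, and the helper `weight_memory_gib` is not part of any library. It counts raw weight storage only, so it ignores the per-block scale overhead that real quantized formats carry, as well as the KV cache and activations.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate raw weight storage in GiB.

    Assumption: every weight takes exactly `bits_per_weight` bits.
    Real quantized formats add per-block scales, so actual files are larger.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical 3-billion-parameter model at several bit widths.
PARAMS = 3e9

if __name__ == "__main__":
    for bits in (16, 8, 4, 2):
        print(f"{bits:>2}-bit weights: {weight_memory_gib(PARAMS, bits):.2f} GiB")
```

Each halving of the bit width halves the weight footprint, which is the difference between a model fitting comfortably in a mid-range phone's RAM and not fitting at all.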

Last, Cactus excels at CPU inference. While TFLite is great, its peak performance often relies on specific hardware accelerators (GPUs, DSPs). GGUF is designed for exceptional performance on standard CPUs, offering a more consistent baseline across the wide variety of devices that app developers have to support.

3 comments

deepdarkforest 3 days ago

No worries.

GGUF is more suitable for the latest open-source models, I agree there. Quant2/Q4 will probably be critical as well, if we don't see a jump in RAM. But then again, I wonder when/if MediaPipe will support GGUF as well.

PS, I see you are in the latest YC batch? (below you mentioned BF). Good luck and have fun!

blks 2 days ago

First paragraph reads like a ChatGPT response.

poly2it 2 days ago

Not just the first paragraph, the whole response reads like LLM output.