Comment by mirekrusin

3 months ago

From the LLM benchmarks, it looks like it's better to use the open-source uzu [0] than RunAnywhere's proprietary inference engine.

[0] https://github.com/trymirai/uzu

uzu is a strong engine: it beat us on Llama-3.2-3B (222 vs 184 tok/s), and we reported that honestly in our benchmarks.

But looking at the full picture across all four models tested (tok/s):

Qwen3-0.6B: MetalRT 658, uzu 627

Qwen3-4B: MetalRT 186, uzu 165

Llama-3.2-3B: uzu 222, MetalRT 184

LFM2.5-1.2B: MetalRT 570, uzu 550

MetalRT wins 3 of 4. The bigger difference is that MetalRT also handles STT and TTS natively, whereas uzu is LLM-only. For a voice pipeline where you need all three modalities running on one engine with shared memory management, that matters.
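For anyone who wants to check the tally and see the relative margins, here is a minimal sketch in Python. The numbers are copied from the list above; the `results` dict and the `compare` helper are illustrative names only, not part of MetalRT or uzu.

```python
# Throughput in tok/s, copied from the figures quoted above.
results = {
    "Qwen3-0.6B": {"MetalRT": 658, "uzu": 627},
    "Qwen3-4B": {"MetalRT": 186, "uzu": 165},
    "Llama-3.2-3B": {"MetalRT": 184, "uzu": 222},
    "LFM2.5-1.2B": {"MetalRT": 570, "uzu": 550},
}


def compare(results):
    """Return (model, faster engine, lead over the slower engine in %) per model."""
    rows = []
    for model, tps in results.items():
        winner = max(tps, key=tps.get)
        loser = min(tps, key=tps.get)
        margin_pct = 100.0 * (tps[winner] - tps[loser]) / tps[loser]
        rows.append((model, winner, margin_pct))
    return rows


rows = compare(results)
for model, winner, margin_pct in rows:
    print(f"{model}: {winner} ahead by {margin_pct:.1f}%")

# Count how many of the four models each engine leads on.
metalrt_wins = sum(1 for _, winner, _ in rows if winner == "MetalRT")
print(f"MetalRT leads on {metalrt_wins} of {len(rows)} models")
```

The margin is computed relative to the slower engine, so it reads as "how much faster the leader is". That is one convention among several; measuring against the faster engine would give slightly smaller percentages.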

That said, uzu is great open-source software and worth checking out if you're looking for an OSS LLM-only engine on Apple Silicon.