Comment by mirekrusin
3 months ago
From LLM benchmarks it looks like it's better to use open source uzu than RunAnywhere's proprietary inference engine.
uzu is a strong engine. It beat us on Llama-3.2-3B (222 vs 184 tok/s), and we reported that honestly in our benchmarks.
But looking at the full picture across all four models tested (tok/s):
Qwen3-0.6B: MetalRT 658, uzu 627
Qwen3-4B: MetalRT 186, uzu 165
Llama-3.2-3B: uzu 222, MetalRT 184
LFM2.5-1.2B: MetalRT 570, uzu 550
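If you want to check the tally yourself, here is a minimal Python sketch that counts wins and computes the relative margin per model. The numbers are copied from the list above; the `results` dict and the `summarize` helper are just illustrative names, not part of either engine's tooling.

```python
# Throughput in tok/s, copied from the four results listed above.
results = {
    "Qwen3-0.6B": {"MetalRT": 658, "uzu": 627},
    "Qwen3-4B": {"MetalRT": 186, "uzu": 165},
    "Llama-3.2-3B": {"MetalRT": 184, "uzu": 222},
    "LFM2.5-1.2B": {"MetalRT": 570, "uzu": 550},
}


def summarize(results):
    """Return (wins per engine, {model: (winner, margin in percent)})."""
    wins = {}
    margins = {}
    for model, scores in results.items():
        winner = max(scores, key=scores.get)
        slower = min(scores.values())
        wins[winner] = wins.get(winner, 0) + 1
        # Margin is how much faster the winner is, relative to the slower engine.
        margins[model] = (winner, 100 * (scores[winner] / slower - 1))
    return wins, margins


wins, margins = summarize(results)
for model, (winner, pct) in margins.items():
    print(f"{model}: {winner} ahead by {pct:.1f}%")
print(wins)
```

Margins are relative to the slower engine on each model, so they say nothing about variance between runs or hardware differences.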
MetalRT wins 3 of 4. The bigger difference is that MetalRT also handles STT and TTS natively, whereas uzu is LLM-only. For a voice pipeline where you need all three modalities running on one engine with shared memory management, that matters.
That said, uzu is great open-source software and worth checking out if you're looking for an OSS LLM-only engine on Apple Silicon.