Comment by nateb2022
3 days ago
> but people should use llama.cpp instead
MLX is significantly more performant than Ollama and llama.cpp on Apple Silicon, in both peak memory usage and tok/s output.
edit: LM Studio benefits from MLX optimizations when running MLX-compatible models.
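
For reference, a quick way to check MLX's tok/s and peak memory yourself is the `mlx-lm` package; a minimal sketch, assuming the model repo name below is just an example of an MLX-compatible model:

```python
# Minimal sketch using mlx-lm (pip install mlx-lm) on Apple Silicon.
# The model name is only an example; any MLX-compatible repo should work.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")

# verbose=True prints generation speed (tok/s) and peak memory,
# the two metrics being compared against Ollama / llama.cpp.
generate(
    model,
    tokenizer,
    prompt="Explain the difference between MLX and llama.cpp in one sentence.",
    max_tokens=128,
    verbose=True,
)
```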