Comment by mswphd
3 hours ago
An "obvious" point to make is that it is not particularly usable on a unified memory machine. I'm only getting 9 tok/s (for Q6 quants) on a MacBook M4 Pro with 48GB of memory (though with GGUFs, not MLX).
The quality seems fine, but the 9 tok/s means I only tried it out briefly.