Comment by tarruda
7 hours ago
I only tried a very early version of that, back when it was just a llama.cpp fork, and Qwen was certainly better in my tests.
But I was not super impressed with deepseek 4 flash when using it from the official API either, so it doesn't seem to be the quantization's fault. It is a good model, but nothing out of the ordinary in the few benchmarks I ran on it (with full awareness that benchmarks are biased).