Comment by gnulinux
8 months ago
Yes, it would matter. If you only have the budget to run an 8B model and it's sufficient for the easy problem you have, a better 8B model with the same spec requirements is necessarily better, regardless of how it compares to some other model. I have tons of problems I throw a specific-sized model at.
> a better 8B model with the same spec requirements
The spec requirements aren't exactly the same, though. When using an alloy, you need double the disk space (not a huge deal on desktop, but it is on mobile), you get significantly higher latency (since you have to swap the models in and out between every turn), and you can only apply it to multi-turn conversations or sufficiently decomposable problems.
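
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes only one model fits in memory at a time and that the alloy rotates models every turn; `ModelSpec`, `alloy_cost`, and all the sizes and load times are made up for illustration, not measurements.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    name: str
    disk_gb: float       # hypothetical on-disk size
    load_seconds: float  # hypothetical time to load the weights into memory


def alloy_cost(models, turns):
    """Return (disk_gb, load_seconds) for a conversation of `turns` turns.

    Assumes only one model is resident at a time and the active model
    rotates round-robin every turn, so an alloy reloads on each turn.
    """
    disk_gb = sum(m.disk_gb for m in models)
    if len(models) == 1:
        # A single model is loaded once and stays resident.
        load_seconds = models[0].load_seconds
    else:
        # An alloy pays one load per turn for whichever model is up next.
        load_seconds = sum(
            models[t % len(models)].load_seconds for t in range(turns)
        )
    return disk_gb, load_seconds


# Two hypothetical 8B models with identical footprints.
a = ModelSpec("model-a-8b", disk_gb=5.0, load_seconds=4.0)
b = ModelSpec("model-b-8b", disk_gb=5.0, load_seconds=4.0)

single = alloy_cost([a], turns=10)
alloy = alloy_cost([a, b], turns=10)
print(single, alloy)
```

If both models fit in memory at once, the load term mostly goes away, but then the memory requirement doubles instead of the latency.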