Comment by hobofan
8 months ago
> a better 8B model with the same spec requirements
It's not the completely same spec requirements though. When using an alloy, you would need to have double the disk space (not a huge deal on desktop, but for mobile), significantly higher latency (as you need to swap the models in/out between every turn), and you can only apply it to multi-turn conversations/sufficiently decomposable problems.
No comments yet
Contribute on Hacker News ↗