Comment by paoliniluis

6 days ago

just finding the perfect spot between accuracy of the answers/available VRAM/tokens per second

OK, say I have 14 GB of VRAM. What is the tradeoff between running a 9B model with 8-bit parameters and a 27B model with 3-bit parameters?
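
For a rough sense of scale, here is a back-of-the-envelope sketch of the memory side of that question. It assumes the weights dominate memory use and ignores the KV cache, activations, runtime overhead, and the extra per-block scale data that real quantized formats carry, so actual usage will be higher. The function name and the 1 GB = 1e9 bytes convention are illustrative assumptions, not taken from any particular tool.

```python
def weight_footprint_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_param` bits,
    with no quantization metadata. Real formats add some overhead on top.
    """
    # billions of params * bits per param / 8 bits per byte = billions of bytes
    return params_billion * bits_per_param / 8


VRAM_GB = 14.0  # the budget from the question

for label, params_billion, bits in [("9B @ 8-bit", 9, 8), ("27B @ 3-bit", 27, 3)]:
    size = weight_footprint_gb(params_billion, bits)
    headroom = VRAM_GB - size
    print(f"{label}: ~{size:.1f} GB of weights, ~{headroom:.1f} GB left for everything else")
```

Under these assumptions the weights come to about 9 GB for the 9B model and about 10.1 GB for the 27B model, so both fit in 14 GB on weights alone, with the 27B option leaving less headroom for context. This only covers the VRAM part; it says nothing about answer accuracy or tokens per second, which would have to be measured.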