Comment by giancarlostoro

19 hours ago

> the dense 9B fits on a single 80GB GPU

Us mere mortals cannot use this.

3 comments


regularfry 2 hours ago

Seems weird. A 9B model would normally fit unquantised on a 24GB GPU.
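Rough arithmetic behind that, as a sketch only: it counts the weights and nothing else (no KV cache, activations, or framework overhead), and it assumes "unquantised" means 16-bit weights. The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Memory needed for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 9e9  # "9B" taken at face value

# Assumption: these are the usual storage widths; real quantized formats
# add a little overhead for scales and zero points.
for label, bits in [("fp32", 32), ("fp16/bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>9}: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

At 16 bits that is about 18 GB of weights, which leaves only a few GB on a 24GB card for context, so "fits" is true but tight.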

armarr 13 hours ago

There are already quantizations available.

giancarlostoro 5 hours ago

It would be nice to run a model that isn't quantized to death just to fit in 12GB of VRAM with room left for a reasonable context window. But also, this is ONE model in a set of models; the rest apparently need to run on a GPU cluster.