Comment by dzogchen

1 year ago

That one can handle up to 200B parameters according to NVIDIA.

That's a shame. I suppose you'd need 4 of them linked with RDMA to run a 671B model, but somehow that still seems better to me than trying to run it on DDR4 RAM like the OP is suggesting. I have a system with 230 GB of DDR4 RAM, and running even small models on it is atrociously slow.
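
For what it's worth, the "4 of them" figure is just back-of-the-envelope arithmetic on the 200B-per-unit number quoted above. A rough sketch, assuming capacity scales linearly across linked units and ignoring KV cache, interconnect overhead, and any difference in quantization (the function name `units_needed` is made up for illustration):

```python
import math


def units_needed(model_params_b: float, per_unit_params_b: float) -> int:
    """Estimate how many units are needed to hold a model.

    Both arguments are parameter counts in billions. This assumes
    capacity adds up linearly across units, which is an optimistic
    simplification rather than a vendor-stated figure.
    """
    if per_unit_params_b <= 0:
        raise ValueError("per_unit_params_b must be positive")
    # Round up: a partially filled unit still counts as a whole unit.
    return math.ceil(model_params_b / per_unit_params_b)


# 671B model, 200B per unit (the figure attributed to NVIDIA above)
print(units_needed(671, 200))
```

In practice the real number could be higher once you account for context memory and whatever the interconnect costs you.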