Comment by bel8

9 days ago

DeepSeek Flash on high (not max) is a freak of nature indeed.

Very disproportionate intelligence-to-cost ratio.

I'm leveraging this temporary anomaly and using it as my coding workhorse.

The weights are open, and once prices settle down again it will be runnable on less than 10k of hardware.

I can easily run it in an 8-bit quant on the 4 x 48GB Radeon Pro W7900 GPUs I snagged for 2k each before the memory squeeze.

A 158B-parameter model, especially in an architecture as efficient as DS4, is not that hard to drive currently if you got in before the craze, and will be relatively easy to drive with future hardware generations.
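
For anyone who wants to sanity-check the numbers above, here is a rough back-of-the-envelope sketch. It assumes an 8-bit quant costs about 1 byte per parameter, counts GB as 10^9 bytes, and looks at the weights only. KV cache, activations, and runtime overhead are ignored, so real headroom will be smaller. The function and variable names are just illustrative.

```python
def weight_footprint_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at exactly bits_per_param bits,
    with no extra space for quantization scales or metadata.
    """
    return params_billion * bits_per_param / 8


# Figures taken from the comment above.
PARAMS_BILLION = 158   # 158B parameters
BITS_PER_PARAM = 8     # 8-bit quant
GPU_COUNT = 4          # 4 x Radeon Pro W7900
GPU_VRAM_GB = 48       # 48GB each
GPU_PRICE_K = 2        # 2k each

weights_gb = weight_footprint_gb(PARAMS_BILLION, BITS_PER_PARAM)
total_vram_gb = GPU_COUNT * GPU_VRAM_GB
headroom_gb = total_vram_gb - weights_gb   # left for KV cache, activations, etc.
gpu_cost_k = GPU_COUNT * GPU_PRICE_K       # GPUs only, not the rest of the box

print(f"weights:  ~{weights_gb:.0f} GB")
print(f"VRAM:      {total_vram_gb} GB")
print(f"headroom: ~{headroom_gb:.0f} GB")
print(f"GPU cost:  {gpu_cost_k}k")
```

Under those assumptions the weights take roughly 158 GB of the 192 GB available, which leaves about 34 GB for context and overhead, and the GPUs alone come to 8k.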