Comment by karel-3d

6 hours ago

Can I... somehow run this locally? DeepSeek is open source, right? Do I even need their API key?

(I have no experience with running anything locally, maybe it's a stupid question)

Waiting for official support in llama.cpp. There is a fork that can run a lightly quantized (Q2 expert layers) DeepSeek V4 Flash in 128 GB of RAM, without having to stream weights from disk.
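For reference, once support lands (or via that fork), running a quantized GGUF locally follows the standard llama.cpp pattern. This is a sketch: the model filename is hypothetical, and the fork's build may differ from upstream.

```shell
# Sketch of a typical llama.cpp CPU-only run; model filename is hypothetical.
./llama-cli \
  -m deepseek-v4-flash-Q2_K.gguf \  # quantized model file (hypothetical name)
  -c 8192 \                         # context window size
  -t 16 \                           # CPU threads
  -ngl 0 \                          # no GPU offload: weights stay in system RAM
  -p "Hello"
```

With the whole model resident in RAM (no `mmap` page faults hitting disk mid-generation), token throughput is bounded by memory bandwidth rather than disk speed.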