Comment by karel-3d

6 hours ago

Can I... somehow run this locally? DeepSeek is open source, right? Do I even need their API key?

(I have no experience with running anything locally, maybe it's a stupid question)

Waiting for official support in llama.cpp. There is a fork that can run a lightly quantized (Q2 expert layers) DeepSeek V4 Flash in 128 GB of RAM, without having to stream weights from disk.
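For reference, once support lands (or via that fork), running a quantized GGUF locally follows the standard llama.cpp pattern. This is a sketch: the model filename is hypothetical, and the fork's build may differ from upstream.

```shell
# Sketch of a typical llama.cpp CPU-only run; model filename is hypothetical.
./llama-cli \
  -m deepseek-v4-flash-Q2_K.gguf \  # quantized model file (hypothetical name)
  -c 8192 \                         # context window size
  -t 16 \                           # CPU threads
  -ngl 0 \                          # no GPU offload: weights stay in system RAM
  -p "Hello"
```

With the whole model resident in RAM (no `mmap` page faults hitting disk mid-generation), token throughput is bounded by memory bandwidth rather than disk speed.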