Comment by frozenseven

1 year ago

What was the budget for DeepSeek's V3/R1 again?

anonymousDan 1 year ago

Who actually knows? Far beyond what a UK university can afford.

frozenseven 1 year ago
I see. Let's just assume that DeepSeek's V3/R1 budget of ~$5.5M was a lie and the Alan Turing Institute is just too poor to compete with their nine-figure sums. I guess I have no further questions.
- bubbler 1 year ago
  
  Yeah, the DeepSeek budget wasn't 6M by any means.
  > the $5-6M cost of training is misleading. It comes from the claim that 2048 H800 cards were used for one training, which at market prices is upwards of $5-6M. Developing such a model, however, requires running this training, or some variation of it, many times, and also many other experiments (item 3 below). That makes the cost to be many times above that, not to mention data collection and other things, a process which can be very expensive (why? item 4 below). Also, 2048 H800 cost between $50-100M. The company that deals with DC is owned by a large Chinese investment fund, where there are many times more GPUs than 2048 H800.
  https://therecursive.com/martin-vechev-of-insait-deepseek-6m...
  