Comment by squidbeak

11 hours ago

Deepseek had another moment a few weeks ago. V4 isn't far behind the US frontier, and so far its flash variant seems a very reliable coder and costs a pittance.

15 comments


ai_fry_ur_brain 11 hours ago

Deepseek V4 (not flash) tripled in price too, by the way (from Deepseek). Get used to this pattern.

This is what you get for relying on the generosity of billionaires. Keep offshoring your thinking ability to a machine and let me know how competitive you are. Hint: you won't be. There's nothing special about being able to use an LLM.

npn 11 hours ago
Unlike other providers, Deepseek does promise that they will lower the price when their Huawei cards arrive in a few more months.
- flakiness 10 hours ago
  
  Give me a link. Cannot wait. One PSA is that they have a 75% discount right now, so it is already cheaper than the full price.
  
barrell 2 hours ago

Actually, Deepseek V4 was at a promotional price, 1/3 of the full price, for the first month or so. This was pretty clearly communicated. The promotional window just ended, is all.
ls612 11 hours ago
Anyone can host Deepseek V4 on rented GPUs and sell inference on it. Price will very quickly converge to the marginal cost of inference. This is as close to a pure commodity as it gets in the AI space so competitive market economics will put in work. Same is true for any open-weights model.
- ai_fry_ur_brain 11 hours ago
  
  You don't understand the costs involved in running inference at scale.
  Please go run some numbers. The hardware needed to run Deepseek V4 Flash at 20 tps for a single session is nowhere close to what is required to run it at 50 tps for 5,000 concurrent sessions.
  Imagine what it takes to be profitable when running at 150 tps for 30 cents per 1M tokens. You make less than $1k per month, and the hardware required to run that costs $10k a month to rent, with hardly any concurrent-session capability.
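
For anyone who wants to run the numbers on the single-session case above, here is a rough sketch in Python. It uses only the figures quoted in the comment (150 tps, 30 cents per 1M tokens, $10k a month for hardware). The 30-day month, 100% utilization, and the assumption that every session sustains the same speed are mine, and the function names are made up for illustration.

```python
import math

# Assumption: a 30-day month, with the hardware busy every second of it.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def monthly_revenue_usd(tokens_per_second, usd_per_million_tokens, sessions=1):
    """Revenue from selling every generated token at a flat per-million price."""
    tokens = tokens_per_second * SECONDS_PER_MONTH * sessions
    return tokens / 1_000_000 * usd_per_million_tokens


def break_even_sessions(hardware_usd_per_month, tokens_per_second, usd_per_million_tokens):
    """Concurrent sessions needed for revenue to cover the hardware rental."""
    per_session = monthly_revenue_usd(tokens_per_second, usd_per_million_tokens)
    return math.ceil(hardware_usd_per_month / per_session)


# Figures quoted in the comment: 150 tps, $0.30 per 1M tokens, $10k/month rental.
single = monthly_revenue_usd(150, 0.30)
needed = break_even_sessions(10_000, 150, 0.30)

print(f"One session at 150 tps: ${single:,.2f} per month")
print(f"Sessions needed to cover $10k/month: {needed}")
```

Under those assumptions a single fully loaded session brings in roughly $117 a month, which fits the "less than 1k" claim, and the rental only pays for itself once the same box sustains dozens of sessions at that speed. Real deployments batch requests, so per-session speed and total throughput do not scale this simply.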
  
dpoloncsak 11 hours ago

Mate, why are you so mad at people upset that the price tripled? It's a fair complaint that people built services using the cheaper ones with the expectation that future models would be similarly priced. You can avoid 'offloading thinking' while still building on top of these models.
zaptrem 10 hours ago

V4-Pro is about 2.4× total params and 1.3× active params of V3.2.
creationcomplex 9 hours ago

You're typing as your handwriting and letter-sending abilities deteriorate to dust. Writing down information as your memory capacity decays. Remembering instead of living at the pure leading edge of perception, dulling your reactions.
Smh, it's all downhill from the first unadulterated neuron.
aurareturn 11 hours ago
I think demand is too great and compute is not enough. Nothing to do with billionaires colluding to increase prices by 3x.
- boutell 9 hours ago
  
  Actually, why should Google collude on pricing? They have deep pockets and could starve out the competition while keeping prices low, if they really wanted.
  I think it is priced high because it's basically their smartest model as well as their fastest, so why shouldn't they?
  You can still use earlier generations of Flash at a lower cost if you want "fast and cheap and just OK," which often makes sense. (Just checked)
  I would predict they will lower this price when 3.5 High appears, but perhaps not all the way.