Comment by Yoric

1 day ago

Directly at DeepSeek? It was my understanding (but I didn't check) that some other AI operators were providing (some of?) DeepSeek's model for cheaper prices.

Still, that's interesting. What do you get for that price? Only coding, or also e.g. image generation?

Footprint's comment is correct. I go directly to Deepseek's platform API which they linked. There's no image generation but you get access to Deepseek V4 Flash and Deepseek V4 Pro, both of which are very capable for general text based tasks and programming. Flash is insanely cheap for how good it is ($0.14 per 1M input tokens vs $15 with Claude 4.7). V4 Pro I would put somewhere in the range of 80 to 90% as good as Opus 4.6 (based just on anecdotal usage - I use Opus 4.6 heavily at work as my company pays for it) while again being significantly cheaper. According to a benchmark[1] I read, processing 1million tokens would cost you $250 for Opus 4.7, $300 on GPT5.5... and just $35 on V4 Pro.

I just use it for my side-project coding and brainstorming tasks. At work I use AWS's Kiro CLI + Opus 4.6. At home I use Opencode + V4 Flash for the majority of "general" usage. I swap to V4 Pro for complex tasks if I feel like V4 Flash is struggling.

One other thing I highly like about the platform.deepseek API usage is it's a metered setup - not subscription based. Which means you only pay for what you use (the money that you put in doesn't expire) and can't spend more than you've deposited. This works well for me for my non-work coding because it generally happens in bursts. I may not code for a whole month (and therefore if I had a subscription it would have been wasted) and then spend a whole weekend coding nonstop.

It's entirely possible that there are middle-man providers that give a discount on Deepseek's own pricing, but I'm quite happy with the amount I'm paying so I haven't really looked into it.

[1]: https://lushbinary.com/blog/deepseek-v4-vs-claude-opus-4-7-v...

  • This is awesome, thank you for posting this. Are you using pi, Hermes, or another harness? I’d love to hear more about your workflow.

    • I've been using [OpenCode](https://opencode.ai/) - I find it works quite well and has things like web search and build/plan modes built in. I had to modify the settings though (On Linux at `~/.config/opencode/opencode.json`) to stop it from just modifying files without first asking for permission, which I didn't like. I like being able to read the changes my AI agent suggests before the files are modified.

I’ve been doing this too, it’s a cheat code! 1/100th of the price of Claude/openai prices for 95% of the quality. Site is platform.deepseek.com for that. No image generation, just text, but if you use it right it works great

DeepSeek API gave 6x to 8x better caching rate for inputs over OpenRouter (even chosing DeepSeek as provider). And some of the cheaper providers are using FP4 quantizations.

https://openrouter.ai/deepseek/deepseek-v4-flash-20260423#pr...

After complaints the cached read is not listed anymore in that page, you have to click one by one. All providers for DeepSeek V4 Flash charge ~$0.02 while DeepSeek provider is $0.0028. For coding this is huge as caching often gets in the range of 90 to 99%. But OpenRouter messes your caching so don't use it. And it seems to be a VC-backed closed middle-man company, not open source or open anything.

  • Openrouter's pricing via the deepseek provider is the same as the official deepseek api for both flash and pro and for cached and uncached tokens. It's literally the same api.

    And no, cache rates are not different if you're going through the official deepseek provider. The only way caching rates can drop is if you let openrouter fully control routing by preferring uptime or something, and then it might bounce you between providers. But you can control which providers for a given model are in its routing pool and stop that.

    • Last month I had this issue. Others confirmed. On X people say OpenRouter messes with headers or something (this I can't confirm).