← Back to context

Comment by atreids

1 day ago

I find just going via Deepseek's platform API directly, using their V4 flash model, and hooking into a harness like Opencode more than acceptable. Think I've spent maybe $10 over a couple of weeks.

I did explore self-hosting models but hardware right now is just too expensive.

I believe opencode go but only using deepseek flash would last you longer. (Equivalent to $65 in tokens but it's a monthly payment so you have to be using it up or deepseek direct will be cheaper)

First month is $5, later $10. Cancel any time. You can keep getting the deal with a new email.

  • Interesting. Thanks for letting me know. I will investigate that if I end up finding the API too expensive.

Directly at DeepSeek? It was my understanding (but I didn't check) that some other AI operators were providing (some of?) DeepSeek's model for cheaper prices.

Still, that's interesting. What do you get for that price? Only coding, or also e.g. image generation?

  • Footprint's comment is correct. I go directly to Deepseek's platform API which they linked. There's no image generation but you get access to Deepseek V4 Flash and Deepseek V4 Pro, both of which are very capable for general text based tasks and programming. Flash is insanely cheap for how good it is ($0.14 per 1M input tokens vs $15 with Claude 4.7). V4 Pro I would put somewhere in the range of 80 to 90% as good as Opus 4.6 (based just on anecdotal usage - I use Opus 4.6 heavily at work as my company pays for it) while again being significantly cheaper. According to a benchmark[1] I read, processing 1million tokens would cost you $250 for Opus 4.7, $300 on GPT5.5... and just $35 on V4 Pro.

    I just use it for my side-project coding and brainstorming tasks. At work I use AWS's Kiro CLI + Opus 4.6. At home I use Opencode + V4 Flash for the majority of "general" usage. I swap to V4 Pro for complex tasks if I feel like V4 Flash is struggling.

    One other thing I highly like about the platform.deepseek API usage is it's a metered setup - not subscription based. Which means you only pay for what you use (the money that you put in doesn't expire) and can't spend more than you've deposited. This works well for me for my non-work coding because it generally happens in bursts. I may not code for a whole month (and therefore if I had a subscription it would have been wasted) and then spend a whole weekend coding nonstop.

    It's entirely possible that there are middle-man providers that give a discount on Deepseek's own pricing, but I'm quite happy with the amount I'm paying so I haven't really looked into it.

    [1]: https://lushbinary.com/blog/deepseek-v4-vs-claude-opus-4-7-v...

  • I’ve been doing this too, it’s a cheat code! 1/100th of the price of Claude/openai prices for 95% of the quality. Site is platform.deepseek.com for that. No image generation, just text, but if you use it right it works great

  • DeepSeek API gave 6x to 8x better caching rate for inputs over OpenRouter (even chosing DeepSeek as provider). And some of the cheaper providers are using FP4 quantizations.

    https://openrouter.ai/deepseek/deepseek-v4-flash-20260423#pr...

    After complaints the cached read is not listed anymore in that page, you have to click one by one. All providers for DeepSeek V4 Flash charge ~$0.02 while DeepSeek provider is $0.0028. For coding this is huge as caching often gets in the range of 90 to 99%. But OpenRouter messes your caching so don't use it. And it seems to be a VC-backed closed middle-man company, not open source or open anything.

    • Openrouter's pricing via the deepseek provider is the same as the official deepseek api for both flash and pro and for cached and uncached tokens. It's literally the same api.

      And no, cache rates are not different if you're going through the official deepseek provider. The only way caching rates can drop is if you let openrouter fully control routing by preferring uptime or something, and then it might bounce you between providers. But you can control which providers for a given model are in its routing pool and stop that.

      1 reply →