Comment by d3Xt3r

13 hours ago

Got a link to that API inference provider?

4 comments

d3Xt3r

Just look up OpenRouter, OpenCode Go/Zen, Together, Fireworks, Cerebras, etc.

DeepSeek Platform API is worth checking out too, due to their insanely good caching and token costs.

andai 3 hours ago

I use DeepSeek via OpenRouter, the caching seems to work there too, you just need to force it to use DeepSeek as a provider otherwise it picks a random one every time. (You can pass a provider option in the call, or better, create a preset in your account.)

I'm Ollama Cloud which has a coding plan style model but without restrictions on the harness or direct API calls from your code.

I use novita ai