Comment by ls612
3 days ago
The estimates I've seen put inference at scale on a DeepSeek V3-sized model (roughly 700B parameters) at about $0.70/mtok, given current H100 rental costs. Sonnet charges $15/mtok on the API, so the gap between the true serving cost and the API price is quite large, to the point where even many subscription users are likely profitable.
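A quick back-of-the-envelope sketch of that gap, in Python. The two per-token figures are the rough numbers quoted above, not measured values. The $20/month subscription price and the function names are assumptions made up for illustration only.

```python
# Rough figures quoted in the comment above (USD per million tokens).
SERVING_COST_PER_MTOK = 0.70   # estimated cost to serve, given H100 rental prices
API_PRICE_PER_MTOK = 15.00     # quoted API price

# ASSUMPTION: a hypothetical flat subscription price, not from the comment.
SUBSCRIPTION_USD_PER_MONTH = 20.00


def markup(price: float, cost: float) -> float:
    """How many times larger the price is than the cost."""
    return price / cost


def gross_margin(price: float, cost: float) -> float:
    """Fraction of the price left over after paying the serving cost."""
    return (price - cost) / price


def breakeven_mtok(subscription_usd: float, cost: float) -> float:
    """Millions of tokens a subscriber can use per month before the
    serving cost exceeds what they pay."""
    return subscription_usd / cost


if __name__ == "__main__":
    m = markup(API_PRICE_PER_MTOK, SERVING_COST_PER_MTOK)
    g = gross_margin(API_PRICE_PER_MTOK, SERVING_COST_PER_MTOK)
    b = breakeven_mtok(SUBSCRIPTION_USD_PER_MONTH, SERVING_COST_PER_MTOK)
    print(f"markup: {m:.1f}x")
    print(f"gross margin on API tokens: {g:.1%}")
    print(f"break-even usage on a ${SUBSCRIPTION_USD_PER_MONTH:.0f} plan: {b:.1f} Mtok/month")
```

Under these numbers the API price is about 21 times the estimated serving cost, and a subscriber on the hypothetical $20 plan would need to consume more than about 28 million tokens a month before costing more than they pay. The serving-cost estimate carries all of the uncertainty here, so treat the result as an order-of-magnitude check.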