Comment by ls612

3 days ago

The estimates I've seen put inference at scale on a DeepSeek V3 sized model (so roughly 700B parameters) at about $0.70/mtok given current H100 rental costs. Sonnet charges $15/mtok for output on the API, so the gap between the true cost and the API price is quite large (around 20x, if Sonnet's serving cost is in the same ballpark), to the point where even many subscription users are likely profitable.
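To make the arithmetic concrete, here is a rough back-of-envelope sketch in Python. The two per-token figures are the ones quoted above. The $20/month subscription fee is purely a hypothetical number I picked for illustration, and the sketch assumes every token is served at the single $0.70/mtok cost figure, which is a simplification.

```python
# Back-of-envelope margin math for the figures quoted above (USD per million tokens).
COST_PER_MTOK = 0.70    # estimated serving cost, DeepSeek V3 sized model on rented H100s
PRICE_PER_MTOK = 15.00  # Sonnet API price cited above

# ASSUMPTION: hypothetical flat monthly subscription fee, not a figure from the comment.
SUBSCRIPTION_USD = 20.00


def markup(price: float, cost: float) -> float:
    """How many times the price exceeds the cost."""
    return price / cost


def gross_margin(price: float, cost: float) -> float:
    """Fraction of the price that is left after paying the serving cost."""
    return (price - cost) / price


def break_even_mtok(fee: float, cost: float) -> float:
    """Millions of tokens a flat-fee user can consume before costing more than they pay."""
    return fee / cost


if __name__ == "__main__":
    print(f"markup:       {markup(PRICE_PER_MTOK, COST_PER_MTOK):.1f}x")
    print(f"gross margin: {gross_margin(PRICE_PER_MTOK, COST_PER_MTOK):.1%}")
    print(f"break-even:   {break_even_mtok(SUBSCRIPTION_USD, COST_PER_MTOK):.1f} mtok/month")
```

Under those assumptions the markup comes out a bit over 21x, and a subscriber on the hypothetical $20 plan would need to burn through roughly 28 to 29 million tokens a month before becoming unprofitable.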
