Comment by tgrowazay
1 month ago
We can observe how much generic inference providers like deepinfra or together-ai charge for large SOTA models. Since they are not subsidized and they don’t charge 7x what OpenAI does, that means OAI also doesn’t have outrageously high per-token costs.
Actually, that doesn’t mean anything.
OAI is running boundary-pushing large models. I don’t think those “second-tier” providers can even get GPUs with the required HBM at any reasonable scale for customer use.
Not to mention the training costs of foundation models.