Comment by nmca

6 months ago

Isn’t the answer to the question just classic economies of scale?

You can’t run GPT4 for yourself because the fixed costs are high. But the variable costs are low, so OAI can serve a shit ton.

Or equivalently the smallest available unit of “serving a gpt4” is more gpt4 than one person needs.

I think all the inference optimisation answers are plain wrong for the actual question asked?

1 comment

nmca

It’s the same principle as: