Comment by sandworm101
5 hours ago
Running the model isnt the cost. Watts per token is the math they show investors. You also have to be constantly training new models, which currently needs more compute than servicing the customer base. You have to biuld datacenters, and possibly powerplants to feed them. You have to carry debts. And you will need to buy new GPUs/ram every few years to remain competative. The total business is vastly different than simple gpu math.
You are in violent agreement.
> inference is indeed profitable