Comment by wslh
1 day ago
I'm not really asking this from the perspective of whether I should buy hardware. I'm trying to understand the economics.
The AI space is moving so fast that it is hard to know which conclusions are stable. After all the discussion around local models, is the practical conclusion still that API/frontier providers have a huge structural advantage because of datacenter hardware, high utilization, batching, optimized inference stacks, and perhaps strategic pricing?
For a comparison like this one (a $25k local setup versus buying tokens), what cost multiple are we really talking about? 10x? 100x? Or is it too workload-dependent to reduce to a single number?
Has someone written a good breakdown that separates true infrastructure efficiency from temporary underpricing/subsidy? The part I'm trying to understand is less ideological (local vs. cloud) and more basic economics.
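One way to make the "what multiple?" question concrete is a back-of-the-envelope cost per million tokens for the local box. The sketch below is only that: apart from the $25k figure, every input (3-year lifetime, 1000 W draw, $0.15/kWh, 30 tokens/s, 10% utilization, a $10 per million token API price) is a made-up placeholder, and the model ignores batching, resale value, cooling, and labor. It also assumes the machine draws full power around the clock.

```python
def local_cost_per_million_tokens(
    hardware_usd: float,
    lifetime_years: float,
    power_watts: float,
    usd_per_kwh: float,
    tokens_per_second: float,
    utilization: float,
) -> float:
    """Amortized hardware plus electricity cost per one million generated tokens."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    hours = lifetime_years * 365 * 24
    # Hardware is paid for every wall-clock hour, busy or idle.
    hardware_per_hour = hardware_usd / hours
    # Simplifying assumption: constant power draw, even when idle.
    energy_per_hour = power_watts / 1000 * usd_per_kwh
    # Tokens are only produced during the utilized fraction of each hour.
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return (hardware_per_hour + energy_per_hour) / tokens_per_hour * 1_000_000


# All values except the $25k are hypothetical placeholders.
API_USD_PER_MILLION = 10.0

for utilization in (0.10, 0.50, 1.00):
    local = local_cost_per_million_tokens(
        hardware_usd=25_000,
        lifetime_years=3,
        power_watts=1000,
        usd_per_kwh=0.15,
        tokens_per_second=30,
        utilization=utilization,
    )
    print(f"utilization {utilization:.0%}: ${local:.2f}/M tokens, "
          f"{local / API_USD_PER_MILLION:.1f}x the assumed API price")
```

With these particular placeholders the multiple swings by 10x purely on utilization, which suggests the honest answer is "workload-dependent" unless utilization and throughput are pinned down first.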
An API call to ChatGPT returns results 10-100x faster than my local LLM. I haven't measured it carefully, but I was getting responses in a few seconds from the API versus 10+ minutes locally. The gap was staggering. I'm going to do a deep dive this weekend on optimizing my setup and see if I can get it to perform much faster.
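To put a number on that gap, a small timing harness would help. This is a minimal sketch, not tied to any particular runtime: `generate` is a hypothetical callable that you would wrap around your own local model or API client, and it is assumed to return the number of tokens it produced for the prompt.

```python
import time
from typing import Callable


def measure_tokens_per_second(
    generate: Callable[[str], int],  # hypothetical: returns tokens produced
    prompt: str,
    runs: int = 3,
) -> float:
    """Average end-to-end throughput over several runs of the same prompt."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        total_tokens += generate(prompt)
        total_seconds += time.perf_counter() - start
    if total_seconds == 0:
        raise ValueError("elapsed time was zero; cannot compute a rate")
    return total_tokens / total_seconds


# Stand-in backend for illustration only: pretends to emit 100 tokens in ~10 ms.
def fake_generate(prompt: str) -> int:
    time.sleep(0.01)
    return 100


print(f"{measure_tokens_per_second(fake_generate, 'hello'):.0f} tokens/s")
```

Running the same prompt through both backends gives tokens per second for each, which is easier to compare than wall-clock impressions, and it includes network and prompt-processing time in the total.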