Comment by wongarsu
7 hours ago
Economies of scale are enough to explain the entire price difference. Running 8 concurrent requests at 100 tokens/s on $100k hardware is a lot cheaper per token than running one concurrent request at 20 tokens/s on $20k hardware.
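
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the 100 tokens/s figure is per request (so 800 tokens/s aggregate across the 8 requests), which the comment does not state explicitly, and it only counts hardware cost, ignoring power, hosting, and utilization. The function name is made up for illustration.

```python
def hardware_cost_per_throughput(hardware_cost_usd, concurrent_requests, tokens_per_s_per_request):
    """Hardware dollars per token/s of aggregate throughput (capex only)."""
    # Assumption: the quoted speed is per request, so total throughput scales
    # with the number of concurrent requests.
    aggregate_tokens_per_s = concurrent_requests * tokens_per_s_per_request
    return hardware_cost_usd / aggregate_tokens_per_s

big = hardware_cost_per_throughput(100_000, 8, 100)   # batched server
small = hardware_cost_per_throughput(20_000, 1, 20)   # single-user box

print(big)          # 125.0  dollars per token/s
print(small)        # 1000.0 dollars per token/s
print(small / big)  # 8.0
```

Under those assumptions the bigger box costs 5x as much but delivers 40x the throughput, so each unit of throughput is 8x cheaper in hardware terms. If the 100 tokens/s were the total across all 8 requests instead, the two setups would come out even, so the per-request reading is the one that makes the argument work.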