Comment by NeutralCrane
1 year ago
If you have so little throughput that you don’t need more than 3 tokens a second, you are processing so little data that your costs from the LLM providers won’t even sniff the $2000 you will spend on the hardware to self host.
No comments yet
Contribute on Hacker News ↗