
Comment by randomNumber7

7 days ago

Once you have enough GPUs to hold the entire model in GPU RAM, you can run inference pretty fast.
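A back-of-envelope sketch of what "enough GPUs" means, assuming FP16 weights (2 bytes per parameter) and a rough 20% overhead for KV cache and activations; these figures are illustrative assumptions, not the commenter's:

    import math

    def min_gpus(params_billions: float, gpu_vram_gb: float = 80.0) -> int:
        """Smallest GPU count whose combined VRAM holds the model."""
        weights_gb = params_billions * 2      # 2 bytes per FP16 parameter
        total_gb = weights_gb * 1.2           # +20% for KV cache / activations
        return math.ceil(total_gb / gpu_vram_gb)

    print(min_gpus(70))   # 70B model  -> 3 GPUs with 80 GB each
    print(min_gpus(405))  # 405B model -> 13 GPUs with 80 GB each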

And as soon as you have enough users, you can keep those GPUs under high load around the clock, whereas a home setup would sit idle most of the time and therefore be far too expensive for the value it delivers.
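To put rough numbers on the utilization argument, here is a sketch; every price, lifetime, and duty cycle below is an assumed placeholder, not real pricing:

    HOURS_PER_YEAR = 8760

    def cost_per_busy_hour(capex: float, years: float,
                           power_cost_per_hour: float,
                           utilization: float) -> float:
        """Amortized cost per hour the GPU actually spends serving load."""
        hourly_capex = capex / (years * HOURS_PER_YEAR)
        return (hourly_capex + power_cost_per_hour) / utilization

    # Shared datacenter GPU, kept busy ~90% of the time by many users:
    print(cost_per_busy_hour(25_000, 3, 0.40, 0.90))  # ~$1.50 per busy hour

    # Home GPU serving one person, busy ~5% of the time:
    print(cost_per_busy_hour(2_000, 3, 0.05, 0.05))   # ~$2.52 per busy hour

Despite the much cheaper hardware, the mostly idle home GPU ends up costing more per hour of useful work.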

