Slacker News

Comment by jojobas

18 hours ago

China has cheap electricity.

2 comments

ericd  18 hours ago

Well, also, LLM servers get much more efficient with request queue depth >1: tokens per second per GPU are massively higher with 100 concurrent requests than with 1 on e.g. vLLM.
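
A rough way to see why batching helps is a toy cost model in Python. This is only a sketch under stated assumptions: each decode step pays a fixed cost to stream the model weights once, plus a small per-request cost. The function name `tokens_per_second` and both millisecond figures are hypothetical placeholders, not measurements from vLLM or any real GPU.

```python
def tokens_per_second(concurrent_requests, weight_load_ms=20.0, per_request_ms=0.2):
    """Toy throughput model for one GPU (all numbers are illustrative assumptions).

    Each decode step produces one token per in-flight request. The step pays a
    fixed cost for reading the weights once, shared by the whole batch, plus a
    small cost that grows with the number of requests.
    """
    step_ms = weight_load_ms + per_request_ms * concurrent_requests
    return concurrent_requests * 1000.0 / step_ms


single = tokens_per_second(1)
batched = tokens_per_second(100)
print(f"1 concurrent:   {single:.1f} tokens/s")
print(f"100 concurrent: {batched:.1f} tokens/s")
print(f"speedup:        {batched / single:.1f}x")
```

Under these assumed costs the fixed weight-read time dominates a single-request step, so adding requests raises total throughput almost linearly until the per-request term catches up. Real servers also hit memory and compute limits that this sketch ignores.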

DeathArrow  11 hours ago

Yes, but the hardware they use for inference, like the Huawei Ascend 910C, is less efficient than the Nvidia H100 used in the US, due to the difference in process node.
