Comment by yorwba
8 hours ago
Your source cites https://www.washingtonpost.com/technology/2024/09/18/energy-... which in turn claims to be based on https://arxiv.org/abs/2304.03271 but uses 0.14 kWh as the energy consumption of a 100-token request to GPT-4, an order of magnitude larger than any figure in that paper.

Based on a generation speed of 18 tokens/s (https://openrouter.ai/openai/gpt-4/performance), the implied power draw is ≈91 kW, about two thirds of a 72-GPU rack (https://www.supermicro.com/datasheet/datasheet_SuperCluster_...).

I somewhat doubt the model is large enough to require an entire rack's worth of GPU memory, but even if it were, a single request would be batched with hundreds or thousands of others at the same time, so the true energy consumption per request should be much smaller than that.
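The implied-power arithmetic above can be checked in a few lines. All figures (0.14 kWh per 100-token request, 18 tokens/s) come from the sources cited in the comment:

```python
# Back-of-envelope check of the claimed energy figure.
ENERGY_KWH = 0.14      # claimed energy per 100-token GPT-4 request
TOKENS = 100
TOKENS_PER_S = 18.0    # observed GPT-4 generation speed (OpenRouter)

gen_time_h = (TOKENS / TOKENS_PER_S) / 3600  # ~5.6 s, expressed in hours
implied_power_kw = ENERGY_KWH / gen_time_h   # energy / time = sustained power

print(f"Implied sustained power draw: {implied_power_kw:.0f} kW")  # ≈ 91 kW
```

That is, consuming 0.14 kWh during the ~5.6 seconds it takes to generate 100 tokens would require drawing roughly 91 kW continuously for the duration of the request.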