Comment by mseepgood
14 hours ago
The question is how many decades each user of your software would have to use it in order to offset, through the optimisation it provides, the energy consumption you burned through with LLMs.
14 hours ago
The question is how many decades each user of your software would have to use it in order to offset, through the optimisation it provides, the energy consumption you burned through with LLMs.
When global supply chains are disrupted again, energy and/or compute costs will skyrocket, meaning your org may be forced to defer hardware upgrades and LLMs may no longer be cost effective (as over-leveraged AI companies attempt to recover their investment with less hardware than they'd planned.) Once this happens, it may be too late to draw on LLMs to quickly refactor your code.
If your business requirements are stable and you have a good test suite, you're living in a golden age for leveraging your current access to LLMs to reduce your future operational costs.
Would it be that many? Asked AI to do some rough calculation, and it spit that:
Making 50 SOTA AI requests per day ≈ running a 10W LED bulb for about 2.5 hours per day
Given I usually have 2-3 lights on all day in the house, that's like 1500 LLM requests per day (which sounds quite more than I do).
So even a month worth of requests for building some software doesn't sound that much. Having a local beefy traditional build server compling or running tests for 4 hours a day would be like ~7,600 requests/day
> Making 50 SOTA AI requests per day ≈ running a 10W LED bulb for about 2.5 hours per day
This seems remarkably far from what we know. I mean, just to run the data centre aircon will be an order of magnitude greater than that.
Is that true? Because that's indeed FAR less than I thought. That would definitely make me worry a lot less about energy consumption (not that I would go and consume more but not feeling guilty I guess).
A H100 uses about 1000W including networking gear and can generate 80-150 t/s for a 70B model like llama.
So back of the napkin, for a decently sized 1000 token response you’re talking about 8s/3600s*1000 = 2wh which even in California is about $0.001 of electricity.
1 reply →
In the past week I made 4 different tasks that were going to make my m4brun at full tilt for a week optimized down to 20 minutes with just a few prompts. So more like an hour to pay off not decades. average claude invocation is .3 wh. m4 usez 40-60 watts, so 24x7x40 >> .3 * 10
Especially considering that suddenly everyone and their mother create their own software with LLMs instead of using almost-perfect-but-slighty-non-ideal software others written before.
I’m not really worried about energy consumption. We have more energy falling out of the sky than we could ever need. I’m much more interested in saving human time so we can focus on bigger problems, like using that free energy instead of killing ourselves extracting and burning limited resources.