
Comment by bee_rider

16 hours ago

Something I’ve been sort of wondering about—LLM training seems like it ought to be the most dispatchable possible workload (easy to pause the thing when you don’t have enough wind power, say). But, when I’ve brought this up before people have pointed out that, basically, top-tier GPU time is just so valuable that they always want to be training full speed ahead.

But, hypothetically if they had a ton of previous gen GPUs (so, less efficient) and a ton of intermittent energy (from solar or wind) maybe it could be a good tradeoff to run them intermittently?

Ultimately a workload that can profitably consume "free" watts (and therefore flops) from renewable overprovisioning would be good for society, I guess.

This is a problem with basically all "spare power" schemes: the grid hookup, the land you put your thing on, and the interest cost of the capital aren't free; so the lower the duty cycle, the less economic it is.
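
A back-of-envelope sketch of that effect, with entirely made-up numbers (the function and every figure here are illustrative assumptions, not real quotes):

    def cost_per_compute_hour(capex, fixed_annual, energy_price, power_kw,
                              duty_cycle, years=5):
        """Effective cost of one hour of actual compute at a given duty cycle."""
        running_hours = 8760 * duty_cycle              # hours/year the hardware actually runs
        fixed_per_year = capex / years + fixed_annual  # straight-line depreciation + hookup, land
        return fixed_per_year / running_hours + energy_price * power_kw

    # Same hypothetical rig: paid grid power at high utilization vs. "free" power at 25% duty cycle.
    full = cost_per_compute_hour(capex=100_000, fixed_annual=10_000,
                                 energy_price=0.10, power_kw=10, duty_cycle=0.95)
    spare = cost_per_compute_hour(capex=100_000, fixed_annual=10_000,
                                  energy_price=0.00, power_kw=10, duty_cycle=0.25)
    print(f"full utilization: ${full:.2f}/hr of compute, spare power only: ${spare:.2f}/hr of compute")

With those made-up numbers, free electricity at a 25% duty cycle still costs roughly three times as much per useful hour as paid electricity at full utilization, because the fixed costs dominate.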

First: Almost anything can be profitable if you have free inputs.

Second: Even solar and wind are not really "free", as the capital costs still depreciate over the lifetime of the plant. You might be getting the power for near-zero or even negative cost for a short while, but that cost advantage will very quickly be competed away, since it's so easy to find ways to spend a lot of energy: remelting recycled metals, for example, needs far less capital investment than even a previous-gen datacentre.

That leaves the GPUs. Even previous gen GPUs will still cost money if you want to buy them at scale, and those too depreciate over time even if you don't use them. So to get the maximum value out of them, you'd want to run them as much as possible, but that contradicts the business idea of utilizing low cost energy from intermittent sources.

Long story short: it might work in very specific circumstances if you can make the numbers work. But the odds are heavily stacked against you, because energy costs are typically relatively minor compared to capital costs, especially if you intend to run only the small fraction of the time when electricity is cheap. Do your own math for your own situation, of course. If you live in Iceland things might be completely different.
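
To make the "energy is minor compared to capital" point concrete, here's a rough lifetime comparison for a single accelerator; every figure is an assumption to replace with your own:

    # Lifetime energy spend vs. hardware capex for one accelerator (illustrative numbers only).
    gpu_price      = 25_000   # $ per card, assumed
    power_kw       = 0.7      # average draw in kW, assumed
    energy_price   = 0.10     # $/kWh, assumed
    lifetime_years = 5
    utilization    = 0.95

    lifetime_energy = power_kw * 8760 * lifetime_years * utilization * energy_price
    print(f"lifetime energy: ${lifetime_energy:,.0f} vs capex: ${gpu_price:,.0f} "
          f"({lifetime_energy / gpu_price:.0%} of capex)")

Even if the electricity were completely free, with these assumptions you'd only shave something like a tenth off the total cost of ownership, while a low duty cycle multiplies the capital cost per useful hour.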

  • They are amazing at making batteries as well. How does adding batteries to the mix change the calculation?

    • Exactly as you'd expect: they make it possible to run the GPUs more hours in exchange for additional capital. The batteries have an upfront cost and depreciate over time, and you'll also need more solar panels than before, which further increases the upfront investment. Note that we're already straying from the initial idea of "consuming free electricity from renewable overprovisioning": if you have solar panels and a battery, you can also just sell energy to the grid instead of trying to make last-gen GPUs profitable by reducing energy costs.

      Again: it might work, if the math checks out for your specific source of secondhand GPUs and/or solar panels and/or batteries; a rough sketch of how the battery changes the numbers follows below.
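
      A minimal sketch of that tradeoff, again with made-up numbers (both capex figures and both duty cycles are assumptions):

          # Adding a battery raises the duty cycle, but the battery is capital that also depreciates.
          def cost_per_compute_hour(capex, fixed_annual, duty_cycle, years=5):
              running_hours = 8760 * duty_cycle
              return (capex / years + fixed_annual) / running_hours  # energy itself assumed free

          solar_only = cost_per_compute_hour(capex=150_000, fixed_annual=10_000, duty_cycle=0.25)
          with_battery = cost_per_compute_hour(capex=150_000 + 80_000,
                                               fixed_annual=10_000, duty_cycle=0.70)
          print(f"solar only: ${solar_only:.2f}/hr, solar + battery: ${with_battery:.2f}/hr")

      Whether the extra capital pays off depends entirely on those ratios, and on the opportunity cost of simply selling the stored energy to the grid instead.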

> top-tier GPU time is just so valuable that they always want to be training full speed ahead.

I don't think this makes much sense, because the "waste" of hardware infrastructure from going from a 99.999% duty cycle to 99% is still only ~1%. That loss is linear in the fraction of forgone capacity, while the fraction of power costs you save by simply shaving off the costliest peaks and shifting that demand to the lows is superlinear.
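
As a quick illustration with assumed prices, where a handful of scarcity hours trade at a large multiple of the base rate:

    # Skip the priciest 1% of hours: the compute lost is linear, the bill savings are not.
    peak_fraction = 0.01                 # fraction of hours skipped, assumed
    peak_price, base_price = 1.00, 0.08  # $/kWh during spikes vs. normal hours, assumed

    bill_always_on = peak_fraction * peak_price + (1 - peak_fraction) * base_price
    bill_skip_peaks = (1 - peak_fraction) * base_price
    savings = 1 - bill_skip_peaks / bill_always_on
    print(f"compute lost: {peak_fraction:.1%}, power bill saved: {savings:.1%}")

With those assumed prices you give up 1% of the GPU-hours and cut the power bill by roughly 11%, which is the asymmetry described above.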

I think that as such intermittent power comes onto the grid in the coming decades, people will find creative uses for it.