Comment by keeda

4 months ago

> These things are so hideously inefficient.

Quite the opposite, really. I did some napkin math for energy and water consumption, and compared to humans these things are very resource efficient.

If LLMs improve productivity by even 5% (studies actually peg productivity gains across various professions at 15 - 30%, and these are from 2024!) the resource savings by accelerating all knowledge workers are significant.

Simplistically, during 8 hours of work a human would consume 10 kWh of electricity + 27 gallons of water. Sped up by 5%, that drops by 0.5 kWh and 1.35 gallons. Even assuming the higher end of resources used by LLMs, 100 large prompts (~1 every 5 minutes) would only consume 0.25 kWh + 0.3 gallons. So we're still saving ~0.25 kWh + 1 gallon overall per day!
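To make the arithmetic above easy to check, here is a minimal sketch that just replays the same figures. The constants are the rough estimates quoted in this comment, not measurements, and the function name `net_daily_savings` is made up for illustration.

```python
# Rough estimates taken from the comment above (napkin math, not measurements).
HUMAN_KWH_PER_DAY = 10.0           # electricity per 8-hour workday
HUMAN_GALLONS_PER_DAY = 27.0       # water per 8-hour workday
LLM_KWH_PER_100_PROMPTS = 0.25     # high-end estimate for 100 large prompts
LLM_GALLONS_PER_100_PROMPTS = 0.3  # high-end estimate for 100 large prompts


def net_daily_savings(speedup, prompts=100):
    """Return (kWh, gallons) saved per worker per day, net of LLM usage.

    speedup is the fractional productivity gain, e.g. 0.05 for 5%.
    """
    scale = prompts / 100
    kwh = HUMAN_KWH_PER_DAY * speedup - LLM_KWH_PER_100_PROMPTS * scale
    gallons = HUMAN_GALLONS_PER_DAY * speedup - LLM_GALLONS_PER_100_PROMPTS * scale
    return kwh, gallons


kwh, gallons = net_daily_savings(0.05)
print(f"{kwh:.2f} kWh, {gallons:.2f} gallons")  # prints "0.25 kWh, 1.05 gallons"
```

Changing `speedup` or `prompts` shows how sensitive the result is to either assumption.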

That is, humans + LLMs are way more efficient than humans alone. As such, the more knowledge workers adopt LLMs, the more efficiently they can achieve the same work output!

If we assume a conservative 10% productivity speedup, adoption across all ~100M knowledge workers in the US will recoup the resource cost of a full training run in a few business days, even after accounting for the inference costs!

Additional reading with more useful numbers (independent of my napkin math):

https://www.nature.com/articles/s41598-024-76682-6

https://cacm.acm.org/blogcacm/the-energy-footprint-of-humans...

So with the AI doing more of the work and you needing fewer humans, what are you doing with the extra humans to eliminate their no-longer-productive resource consumption?

Saying “we can do the same work with less resource use” doesn’t mean resource consumption is reduced. You’ve just gone from humans using resources to humans using the same resources and doing less work, plus AI using more resources.

  • Resource consumption often goes up. It's a time vs energy tradeoff and it's not free.

    Your question is a variant of "what do we do with all those humans now that they don't have to walk miles to the well every day because we invented aqueducts?" The point is that they never wanted to walk to the well; they had to (and in some places they still have to). Likewise, very few people want to work, even now and even among us, but they have to.

    We will see what happens this time when we won't have to walk to that well.

  • > So with the AI doing more of the work and you needing fewer humans, what are you doing with the extra humans to eliminate their no-longer-productive resource consumption?

    Soon enough, we won't be able to avoid this question.

  • The thing is, there are many interplaying dynamics here that are impossible to unravel. This is why I called it "napkin math", because figuring out the full ramifications of this change is a pretty large economic problem that nobody has figured out!

    For instance, I think operating at this level of productivity is unsustainable (https://news.ycombinator.com/item?id=46896066)

    There are many more dynamics at play of course, but I think an equilibrium will be found purely because everyone is incentivized to find a solution (UBI?) that keeps both the elites and the plebes living long and prospering. I expect some turmoil, but luckily, the severe resource crunch of GPUs gives us time to figure things out.

Do keep in mind that 1 large prompt every 5 minutes is not how e.g. coding agents are used. There it's 1 large prompt every couple of seconds.

  • True, but I think in these scenarios they rely on prompt caching, which is much cheaper: https://ngrok.com/blog/prompt-caching/

    I have no expertise here, but a couple of years ago I had a prototype using locally deployed Llama 2 that cached the context (now deprecated https://github.com/ollama/ollama/issues/10576) from previous inference calls and reused it for subsequent calls. The subsequent calls were much faster. I suspect prompt caching works similarly, especially given that the changed code is very small compared to the rest of the codebase.
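For a sense of scale on the prompt-rate objection, here is a rough sketch using only the figures already quoted in this thread. It assumes "every couple of seconds" means one prompt every 2 seconds and that every prompt costs the full uncached 0.25 kWh per 100 prompts, which is the worst case; prompt caching would lower it by an amount this thread does not quantify. The function names are hypothetical.

```python
WORKDAY_SECONDS = 8 * 60 * 60   # 8-hour workday
KWH_PER_PROMPT = 0.25 / 100     # from the "100 large prompts" estimate
HUMAN_KWH_SAVED = 10.0 * 0.05   # 5% speedup on 10 kWh per day


def prompts_per_day(seconds_between_prompts):
    """Number of prompts issued in one workday at a fixed interval."""
    return WORKDAY_SECONDS // seconds_between_prompts


def uncached_kwh(seconds_between_prompts):
    """Daily energy if every prompt costs the full uncached estimate."""
    return prompts_per_day(seconds_between_prompts) * KWH_PER_PROMPT


# Prompts per day at which LLM energy equals the human energy saved.
break_even_prompts = HUMAN_KWH_SAVED / KWH_PER_PROMPT

print(prompts_per_day(300))       # one prompt every 5 minutes
print(prompts_per_day(2))         # assumed agent rate: one prompt every 2 seconds
print(uncached_kwh(2))            # worst-case daily kWh at the agent rate
print(round(break_even_prompts))  # prompts per day at energy break-even
```

Under these assumptions the energy savings at a 5% speedup disappear once usage passes a couple of hundred full-cost prompts per day, so the conclusion depends heavily on how much cheaper cached prompts really are.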

Are you excluding the cost of training the AI from the calculation?

  • In the initial analysis of a single worker, yes, but when scaling up per-human savings to use by the wider population, the aggregate resource savings compensate for training resource usage within a few days, weeks at most.

How is a human consuming 27 gallons of water in an 8 hour work shift?