Comment by 2001zhaozhao
8 days ago
It really is cursed to be spending hundreds of watts of power in a datacenter somewhere to make a laptop run slightly faster.
oh absolutely. burning a coal plant to decide if i should close discord is peak 2025 energy. strictly speaking, using the local model (Ollama) is 'free' in terms of watts since my laptop is on anyway, but yeah, if the inefficiency is the art, I'm the artist.
Running inference with Ollama uses energy that wouldn't have been drawn if you weren't running it. There's no free lunch here.
An interesting thought experiment: a fully local, off-grid, off-network LLM device. Solar or wind or what have you. I suppose the Mac Studio route is a good option here; I think Apple makes the most energy-efficient high-memory options. Back-of-the-napkin math says it's possible, just a high up-front cost. Interesting to imagine a somewhat catastrophe-resilient LLM device…
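For anyone who wants to check that napkin, here's a minimal sizing sketch. Every number in it (idle draw, load draw, duty cycle, sun hours) is an assumption rather than a measurement, so swap in your own:

    # Back-of-the-napkin solar sizing for an off-grid LLM box.
    # All numbers below are assumptions; substitute real measurements.
    IDLE_W = 30        # assumed idle draw of a Mac Studio
    LOAD_W = 150       # assumed draw during inference
    DUTY = 0.25        # assumed fraction of the day spent inferencing
    SUN_HOURS = 4.0    # assumed usable full-sun hours per day

    avg_w = LOAD_W * DUTY + IDLE_W * (1 - DUTY)
    wh_per_day = avg_w * 24

    panel_w = wh_per_day / SUN_HOURS   # panel wattage to break even daily
    battery_wh = wh_per_day * 1.5      # ~1.5 days of autonomy for cloudy days

    print(f"average draw: {avg_w:.0f} W")
    print(f"daily energy: {wh_per_day:.0f} Wh")
    print(f"panels: ~{panel_w:.0f} W, battery: ~{battery_wh:.0f} Wh")

With those assumptions it pencils out to roughly a 60W average draw, a ~360W panel array, and a ~2kWh battery, which is very buildable.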
Macs would be the most power-efficient, with faster memory, but a system based on the AI Max 395+ would probably be the most cost-efficient right now. A Framework Desktop with 128GB of shared RAM only pulls 400W (and could be underclocked) and is cheaper by enough that you could buy it plus 400W of solar panels and a decently large battery for less than a Mac Studio with 128GB of RAM. Unfortunately the power-efficiency win costs more than just buying extra generation and storage capacity.
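Running the same arithmetic on cost, the sketch below redoes the comparison. All the prices are illustrative assumptions, not quotes, so plug in current retail numbers:

    # Rough cost comparison between the two off-grid builds.
    # Every price here is an assumption; check current retail pricing.
    FRAMEWORK_128GB = 2500    # assumed price of a Framework Desktop, 128GB
    MAC_STUDIO_128GB = 4800   # assumed price of a Mac Studio, 128GB
    SOLAR_PER_W = 1.00        # assumed $/W of panel capacity
    BATTERY_PER_WH = 0.40     # assumed $/Wh of storage

    framework_build = (FRAMEWORK_128GB
                       + 400 * SOLAR_PER_W        # the extra panels it needs
                       + 2000 * BATTERY_PER_WH)   # a decently large battery

    print(f"Framework + panels + battery: ${framework_build:,.0f}")
    print(f"Mac Studio alone:             ${MAC_STUDIO_128GB:,.0f}")

Under those assumptions the Framework build plus its whole power plant still comes in around a grand cheaper than the Mac by itself.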
That is the endgame.
I think we are moving toward a bilayered compute model:

- The Cloud: for massive reasoning.
- The Local Edge: a small, resilient model that lives on-device and handles the OS loop, privacy, and immediate context.
BrainKernel is my attempt to prototype that Local Edge layer. It's messy right now, but I think the OS of 2030 will definitely have a local LLM baked into the kernel.
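To make the split concrete, here's a minimal sketch of the routing idea: a small model behind Ollama's local HTTP API handles the fast OS-loop decisions, and anything heavier escalates to a hosted model. This is just an illustration of the layering, not how BrainKernel actually works; the model name and the escalation policy are placeholder assumptions:

    import requests

    OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

    def local_llm(prompt, model="llama3.2:3b"):  # placeholder small model
        # The Local Edge: fast, private, on-device.
        r = requests.post(OLLAMA_URL, json={
            "model": model, "prompt": prompt, "stream": False,
        })
        return r.json()["response"]

    def cloud_llm(prompt):
        # The Cloud: wire up whichever hosted model you escalate to.
        raise NotImplementedError("plug in your provider here")

    def route(prompt, needs_deep_reasoning=False):
        # Crude policy: immediate context stays local, big jobs go up.
        return cloud_llm(prompt) if needs_deep_reasoning else local_llm(prompt)

    print(route("Discord has been idle for 2 hours. Close it? yes/no"))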
> using the local model (Ollama) is 'free' in terms of watts since my laptop is on anyway
Now that’s a cursed take on power efficiency
efficiency is just a mindset. if i save 3 seconds of my own attention by burning 300 watts of gpu, the math works out in my favor!
An entire datacenter, on the other hand, might be appealing for spotting things you wouldn't otherwise see in a sea of logs and graphs.