Comment by hebejebelus
3 days ago
An interesting thought experiment - a fully local, off-grid, off-network LLM device. Solar or wind or what have you. I suppose the Mac Studio route is a good option here, I think Apple make the most energy efficient high-memory options. Back of the napkin indicates it’s possible, just a high up front cost. Interesting to imagine a somewhat catastrophe-resilient LLM device…
Macs would be the most power efficient, with faster memory, but a Ryzen AI Max+ 395 based system would probably be the most cost efficient right now. A Framework Desktop with 128GB of shared RAM only pulls 400W (and could be underclocked) and is cheaper by enough that you could buy it plus 400W of solar panels and a decently large battery for less than a Mac Studio with 128GB of RAM. Unfortunately the power efficiency win is more expensive than just buying more power generation and storage capacity.
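For anyone who wants to play with the napkin math, here is a rough Python sketch of the sizing. The peak sun hours, system efficiency, and usable battery fraction are placeholder assumptions of mine, not measured figures, and the function names are made up for illustration. Only the 400W load and the 400W of panels come from the comment above.

```python
import math


def size_off_grid(load_w, hours_per_day, peak_sun_hours=4.0,
                  system_efficiency=0.75, autonomy_days=1.0,
                  usable_battery_fraction=0.8):
    """Estimate panel wattage and battery capacity for a constant load.

    All defaults are illustrative assumptions; real values depend on
    location, season, and hardware.
    """
    # Energy the machine consumes per day, in watt-hours.
    daily_wh = load_w * hours_per_day
    # Panels only produce near rated power during "peak sun hours",
    # and some energy is lost in the charge controller and inverter.
    panel_w = daily_wh / (peak_sun_hours * system_efficiency)
    # Battery must cover the chosen number of sunless days without
    # being discharged past its usable fraction.
    battery_wh = daily_wh * autonomy_days / usable_battery_fraction
    return {"daily_wh": daily_wh, "panel_w": panel_w, "battery_wh": battery_wh}


def hours_supported(panel_w, load_w, peak_sun_hours=4.0,
                    system_efficiency=0.75):
    """Hours per day a given panel array can run the load at full draw."""
    harvested_wh = panel_w * peak_sun_hours * system_efficiency
    return harvested_wh / load_w


if __name__ == "__main__":
    # Hypothetical scenario: the box runs flat out around the clock.
    print(size_off_grid(load_w=400, hours_per_day=24))
    # Hypothetical scenario: 400W of panels feeding a 400W load.
    print(hours_supported(panel_w=400, load_w=400))
```

Under these assumed defaults, how many hours a day the machine actually runs at full draw matters far more than the sticker wattage, so the duty cycle is the number to pin down first.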
I suppose in terms of catastrophe resilience, repairability would be important, although how do you repair a broken GPU in any case? Cold backup machines are probably the more feasible way to extend lifetimes.
And yeah - I was thinking that power efficiency actually isn’t a massive deal if you have some kind of thin client setup. The LLM nodes can be at millraces or some other power-dense locations, and then the clients are basically 5W displays with an RF transceiver and a keyboard…
An entertaining thought experiment :)
That is the endgame.
I think we are moving toward a bilayered compute model:
The Cloud: For massive reasoning.
The Local Edge: A small, resilient model that lives on-device and handles the OS loop, privacy, and immediate context.
BrainKernel is my attempt to prototype that Local Edge layer. It's messy right now, but I think the OS of 2030 will definitely have a local LLM baked into the kernel.
Well, on my MacBook, some of that already exists. In the Shortcuts app you can use the "Use Model" action, which offers to run an LLM on Apple's cloud, on-device, or via an external service (e.g. ChatGPT). I already use this myself for several actions, like reading emails from my tennis club to put events in my calendar automatically.
Whether or not we'll see it lower down in the system I'm not sure. Honestly I'm not certain of the utility of an autonomous LLM loop in many or most parts of an OS, where (in general) systems have more value the more deterministic they are, but in the user space, who can say.
In any case, I certainly went down a fun rabbit hole thinking about a mesh network of LLM nodes and thin clients in a post-collapse world. In that scenario, I wonder if the utility of LLMs is really worth the complexity versus a Kindle-like device with a copy of Wikipedia...