Comment by linolevan

13 hours ago

I'm not convinced that LLM training uses enough energy to really matter in the big picture. You can train a (terrible) LLM on a laptop[1], and frankly that's less energy-efficient than just training it on a rented cloud GPU.

Most of the innovation happening today is in post-training rather than pre-training, which is good news for people concerned about energy use, because post-training is relatively cheap (I was able to post-train a ~2B-parameter model in under 6 hours on a rented cluster[2]).
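
For a rough sense of what a run like that looks like, here's a minimal sketch of supervised fine-tuning (the most common form of post-training) with Hugging Face TRL. The model and dataset names are inferred from the linked repo[2]; the hyperparameters are illustrative assumptions, not necessarily the exact settings I used.

```python
# Minimal post-training (SFT) sketch using Hugging Face TRL.
# Model/dataset match the linked run[2]; hyperparameters are
# illustrative assumptions, not the exact settings used.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# smoltalk is a chat-style SFT dataset; "all" is its combined config.
dataset = load_dataset("HuggingFaceTB/smoltalk", "all", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen3-1.7B-Base",        # small base model to post-train
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="qwen3-1.7b-smoltalk",
        per_device_train_batch_size=8,   # assumption: sized to fit GPU memory
        num_train_epochs=1,              # a single pass is common for SFT
        learning_rate=2e-5,              # assumption: a typical SFT range
        bf16=True,                       # mixed precision on modern GPUs
    ),
)
trainer.train()
```

The whole run is a handful of GPU-hours, which is exactly why post-training is the cheap half of the pipeline.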

[1]: https://github.com/lino-levan/wubus-1
[2]: https://huggingface.co/lino-levan/qwen3-1.7b-smoltalk