Comment by holoduke

17 hours ago

I don't think that's true. Nothing points to specialized LLMs being better. General purpose LLMs are just much more useful in daily work.

To be more specific, I think the future is local and specialized. IBM, among others, thought the same way with their giant centralized mainframes, which is how people originally used software in the 70s. It's an interesting parallel to today's cloud if you think about it. It's just not scalable from a resource (hardware), energy, and cost perspective. I think we're living in a unique time, but it's going to change. Without continued massive funding and a pivot to something sustainable, things will (and should) change.

Don't get me wrong, general intelligence will always be important and should be a part of specialist models to a degree for understanding, but it doesn't make sense to use an 800B+ parameter model to help write an email or do research on company trends. Hell, look at what China has been able to do. Qwen 3.5 9B exceeds Claude 3.5 Haiku and nears Sonnet 3.5 levels. The 27B variant of Qwen 3.5 is superior to both in many ways and even rivals newer models. There is obviously an inherent lag, but we will gradually see a shift as these models become more capable.

Right now we are chasing 1-2% improvements at the cost of billions. Local models are already absurdly capable (more and more by the day - same with cloud, of course) and smarter than most people in specific areas. Can we honestly say that most jobs require a PhD-level understanding or higher? We're chasing something that is less and less necessary from a general day-to-day perspective. AGI is outstanding, but not practical (at least today). I think we'll get there anyway on our current trajectory (though dangerous), but I suspect things will shift.