Comment by Nevermark
20 days ago
I agree. I also think we have only scratched the surface of model efficiencies.
Apple's M3 Ultra with RAM up to 512GB shared directly across CPU/GPU/NPUs is a great example of an architecture already optimized for local models. I expect Apple will start offering larger RAM sizes for other form factors too.
And RAM prices will drop eventually, as the extreme demand for higher-density RAM pulls in new manufacturing capacity.
It reminds me of the huge infra investments in Sun and Cisco during the first .com boom, and then 5-10 years later those fancy Sun boxes were outperformed by Grandma's Windows XP box.