
Comment by Flemlo

5 days ago

I think 384GB of RAM is surprisingly reasonable tbh.

$200-300/month already adds up to roughly $7k-11k over 3 years.
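A rough back-of-the-envelope version of that math (the subscription range is from the sentence above; the hardware price is just an assumed figure for a 384GB machine):

```python
# Back-of-the-envelope: hosted-LLM subscription vs. one-time hardware purchase.
# The hardware price below is an assumption, not a real quote.
MONTHLY_SUB_LOW = 200   # $/month, low end of the range above
MONTHLY_SUB_HIGH = 300  # $/month, high end of the range above
MONTHS = 36             # 3 years

sub_low = MONTHLY_SUB_LOW * MONTHS    # $7,200
sub_high = MONTHLY_SUB_HIGH * MONTHS  # $10,800

HARDWARE_COST = 7_000   # hypothetical price for a 384GB-RAM workstation

print(f"Subscription over 3 years: ${sub_low:,} - ${sub_high:,}")
print(f"One-time hardware:         ${HARDWARE_COST:,}")
```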

And I do expect models baked into dedicated hardware chips in a few years, like a GPU.

An "AIPU" where you can swap out the AI chip itself.

> I think 384GB of RAM is surprisingly reasonable tbh.

> $200-300/month already adds up to roughly $7k-11k over 3 years.

Except at current crazy rates of improvement, cloud-based models will likely be ~50x better by then, and you'll still have the same system.

  • I've had the same system (M2 64GB MacBook Pro) for three years.

    2.5 years ago it could just about run LLaMA 1, and that model sucked.

    Today it can run Mistral Small 3.1, Gemma 3 27B, Llama 3.3 70B - same exact hardware, but those models are competitive with the best available cloud-hosted model from two years ago (GPT-4).

    The best hosted models (o3, Claude 4, Gemini 2.5 etc) are still way better than the best models I can run on my 3-year-old laptop, but the rate of improvements for those local models (on the same system) has been truly incredible.

  • I'm surprised that it's even possible to run big models locally.

    I agree, we'll see how this plays out, but I hope models become more efficient, and then for certain tasks it might not matter that much whether some parts run locally.

    I could imagine an LLM trained on far fewer natural languages and optimized for a single programming language. Something like "generate your own model".

  • Yes, LLMs are a funny workload. They require a lot of processing but are very bursty.

    Therefore using your own bare metal means a lot of expensive redundancy.

    A cloud provider, on the other hand, can keep the GPU busy enough to make it pay (see the toy numbers below). They can also subsidise it with VC money :)
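    A toy sketch of that utilization argument. Every number here is an assumption for illustration (GPU cost, usage fraction, multiplexing factor), not real provider pricing:

    ```python
    # Toy model: dedicated bare-metal GPU vs. a cloud GPU shared across bursty users.
    # All figures below are made-up assumptions, not actual provider pricing.
    GPU_COST_PER_HOUR = 2.0   # hypothetical amortized cost of one GPU-hour
    HOURS_PER_MONTH = 730

    busy_fraction = 0.05      # one developer actually generating tokens ~5% of the time
    users_per_gpu = 15        # hypothetical multiplexing factor for a shared GPU

    # Dedicated bare metal: you pay for every hour, busy or idle.
    dedicated_cost = GPU_COST_PER_HOUR * HOURS_PER_MONTH
    dedicated_utilization = busy_fraction

    # Shared cloud GPU: the same hourly cost is split across many bursty users.
    shared_cost_per_user = dedicated_cost / users_per_gpu
    shared_utilization = min(1.0, busy_fraction * users_per_gpu)

    print(f"Dedicated: ${dedicated_cost:,.0f}/month at {dedicated_utilization:.0%} utilization")
    print(f"Shared:    ${shared_cost_per_user:,.0f} per user/month at {shared_utilization:.0%} utilization")
    ```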