Comment by drob518

7 hours ago

Have you used local AI models on a 32 GB MBP? I ask because I'm looking to finally upgrade my M1 Air, which I love but which only has 16 GB of RAM. I'm trying to figure out whether to just bump to 32 GB with the M5 MB Air or make the jump all the way to 64 GB with the low-end M5 MBP. I love my M1 Air and I don't typically tax the CPU much, but I'm starting to look at running local models, and for that I'd like faster and bigger. That said, I don't want to overpay; memory is my main issue right now. Anyway, if you have experience, I'd love to hear it: which MBP, system specs, which AI model, how fast it ran, etc.?

For local models, which of these are you wanting to do?

A) Embeddings.

B) Things like classification, structured outputs, image labelling, etc.

C) Image generation.

D) An LLM chatbot for answering questions, improving email drafts, etc.

E) Agentic coding.

I have an MBP with an M1 Max and 32 GB of RAM. I can run a ~20 GB mlx_vlm model like mlx-community/Qwen3.5-35B-A3B-4bit (sketch of how I run it after this list). But:

- it's not very fast

- the context window is small

- it's not useful for agentic coding
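
Here's roughly how I invoke it. This is a minimal sketch of a text-only run following the pattern in the mlx-vlm README; exact function signatures and defaults may differ between versions, so treat it as a starting point:

  # Text-only run with mlx-vlm; API pattern from the mlx-vlm README,
  # signatures may vary by version.
  from mlx_vlm import load, generate
  from mlx_vlm.prompt_utils import apply_chat_template
  from mlx_vlm.utils import load_config

  model_path = "mlx-community/Qwen3.5-35B-A3B-4bit"
  model, processor = load(model_path)  # ~20 GB download on first use
  config = load_config(model_path)

  prompt = apply_chat_template(processor, config,
                               "What was mary j blige's first album?",
                               num_images=0)  # no image for a plain question
  # verbose=True prints t/s and peak-memory stats like the ones quoted below
  print(generate(model, processor, prompt, max_tokens=512, verbose=True))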

I asked "What was mary j blige's first album?" and it output 332 tokens (mostly reasoning) and the correct answer.

mlx_vlm reported:

  Prompt: 20 tokens @ 28.5 t/s | Generation: 332 tokens @ 56.0 t/s | Peak memory: 21.67 GB
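
Those numbers square with a quick back-of-envelope estimate, which is worth doing before choosing a RAM size. The 4.5 bits/weight figure below is my own rough allowance for 4-bit weights plus quantization scales, so treat it as illustrative:

  # Back-of-envelope memory for a 4-bit quantized ~35B model (illustrative)
  params = 35e9          # ~35B parameters
  bits_per_weight = 4.5  # assumed: 4-bit weights plus scales/zero-points
  weights_gb = params * bits_per_weight / 8 / 1e9
  print(f"weights alone: ~{weights_gb:.1f} GB")  # ~19.7 GB

With a 21.67 GB peak, that leaves only a couple of GB for KV cache and activations on a 32 GB machine, which lines up with the small usable context window I mentioned. If you want models in this size class with real context, that's the strongest argument I know of for going to 64 GB.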