Comment by omneity
10 months ago
128GB but it's not using much.
I'm running Q4 and it's taking 17.94 GB of VRAM with a 4k context window, 20 GB with 32k tokens.
I am not a mac person, but I am debating buying one for the unified ram now that the prices seem to be inching down. Is it painful to set up? The general responses I seem to get range from "it takes zero effort" to "it was a major hassle to set everything up."
LM Studio and Ollama are both very low-complexity ways to get local LLMs running on a Mac.
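(For example, a minimal Ollama quickstart, assuming Homebrew; the model name is just an example:)

    # Install the Ollama CLI and background service
    brew install ollama
    # Start the server (or just launch the Ollama app instead)
    ollama serve &
    # Pull a model and drop into an interactive chat
    ollama run llama3.2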
As a Python person I've found uv + MLX to be pretty painless on a Mac too.
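(Roughly what that looks like, assuming uv is installed; the mlx-community repo below is just an example model:)

    # Run mlx-lm's generate entry point in an ephemeral environment
    uvx --from mlx-lm mlx_lm.generate \
      --model mlx-community/Mistral-7B-Instruct-v0.3-4bit \
      --prompt "Write a haiku about unified memory" \
      --max-tokens 100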
Read the article you are commenting on. It is a how-to that answers your exact question. It takes 4 commands in the terminal.
> I am not a mac person, but I am debating buying one for the unified ram
Soon some AMD Ryzen AI Max PCs will be available, with unified memory as well. For example the Framework Desktop with up to 128 GB, shared with the iGPU:
- Product: https://frame.work/us/en/desktop?tab=overview
- Video, discussing 70B LLMs at around 3m50s: https://youtu.be/zI6ZQls54Ms
Oh boy... I was genuinely hoping someone less pricey would enter this market.
Edit: ok... I am excited.
You can use the method in this tutorial, or you can download LM Studio and run it.
The latter is super easy: just download the model (through the GUI) and go.
The article should answer your question. Or do you mean setting up a Mac for use as a Linux or Windows user?
> Or do you mean setting up a Mac for use as a Linux or Windows user?
This part, yes. I assume setting up a complete environment is a little more involved than the 4 commands the sibling comment refers to.
Honestly, it is quite a hassle; it took me 2 hours. BUT: if you just take the whole article text, paste it to gemini-2.5-pro, and describe your circumstances, I think it will give you specific steps for your case, and it should be trivial from that moment on.
Using llama.cpp or PyTorch could hardly be easier.
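(A minimal llama.cpp sketch via Homebrew, assuming you already have a GGUF model file downloaded; flags may differ slightly by version:)

    # Install llama.cpp's prebuilt CLI tools
    brew install llama.cpp
    # Run a one-off completion against a local GGUF model
    llama-cli -m ./model.gguf -p "Hello, world" -n 128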