Comment by Tepix
8 days ago
Have you thought about getting a second 128GB device? Open weights models are rapidly increasing in size, unfortunately.
8 days ago
Considered getting a 512GB Mac Studio, but I don't like Apple devices due to the closed software stack. I would never have gotten this Mac Studio if Strix Halo had existed in mid-2024.
For now I will just wait for AMD or Intel to release an x86 platform with 256GB of unified memory, which would allow me to run larger models and stick with Linux as the inference platform.
I aspire to casually ponder whether I need a $9,500 computer to run the latest Qwen model.
You'll need more since RAM prices are up thanks to AI.
Given the shortage of wafers, the wait might be long. I am, however, working on a bridging solution. Sime already showed Strix Halo clustering; I am working on something similar, but with a prompt-processing (pp) boost.
Unfortunately, AMD dumped a great device with an unfinished software stack, and the community is rolling with it. Compare that to the DGX Spark, which I think is more cluster-friendly.