Comment by rafram
13 hours ago
But where are you going to find an Nvidia GPU with 128+ GB of memory at an enthusiast-compatible price?
You don’t need it if you use llama.cpp on Windows, or if you compile it on Linux with CUDA 13 and the right kernel HMM support, and you’re only using MoE models (which, tbh, you should be doing anyway).
What does MoE have to do with it? Aside from Flash-MoE, which supports exactly one model and only on macOS, you still need to load the entire model into memory. You also don't know which experts are going to be activated, so it's not like you can predict which ones need to be loaded.
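To illustrate why you can't preload a fixed subset of experts: in a typical MoE layer a learned router scores every expert against the current token and activates only the top-k, so the active set changes token by token at inference time. This is a toy sketch with a dot-product router standing in for the learned one; the dimensions, expert count, and `router_topk` helper are all illustrative, not from any real model.

```python
import random

def router_topk(token_embedding, expert_weights, k=2):
    """Score each expert with a dot product (a stand-in for the learned
    router) and return the indices of the top-k experts for this token.
    Which experts fire depends entirely on the input token, which is
    why a fixed subset of experts can't be preloaded ahead of time."""
    scores = [
        sum(x * w for x, w in zip(token_embedding, weights))
        for weights in expert_weights
    ]
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

random.seed(0)
dim, n_experts = 8, 16
experts = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_experts)]

# Two different tokens will usually route to different experts.
tok_a = [random.gauss(0, 1) for _ in range(dim)]
tok_b = [random.gauss(0, 1) for _ in range(dim)]
print(router_topk(tok_a, experts))
print(router_topk(tok_b, experts))
```

Offloading schemes can keep inactive expert weights in slower memory, but they still pay the transfer cost whenever the router picks an expert that isn't resident.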
That might even be true, but how large is the TAM for such machines?
Some Chinese sources sell modded Nvidia GPUs with extra VRAM. They're quite affordable in comparison to even a Mac Pro.
Any links to them? Never heard of this.
I've seen a guy who sells modded 2080 Ti with 22gb for $500
https://www.tomshardware.com/pc-components/gpus/chinese-work...
There's also unreleased Nvidia engineering samples of cards with doubled VRAM like this - https://www.reddit.com/r/nvidia/comments/1rczghu/update_unre...
It’s been going on for a while. Search YouTube or the web for 48gb 4090 (this is one of the most popular modded Nvidia cards), Nvidia of course never officially made a 4090 with this much memory.
There are some on sale via eBay right now. The memory controllers on some Nvidia gpus support well beyond the 16-24gb they shipped with as standard, and enterprising folks in China desolder the original memory chips and fit higher capacity ones.
Go on eBay and search for RTX 4090 48GB. There are plenty of them at prices around $3.5k.
And how much do you trust Chinese hardware?
Given that most of mine, and probably yours, and probably most of the world's computers are in fact made in China one way or another, some to a higher degree than others, I'm guessing most of us trust our hardware enough to continue using it.
When there's no one left to trust, maybe you need to re-evaluate your criteria.
The Mac is also Chinese hardware.
It would be hilarious if you are using a Lenovo device right now.
And that's to say nothing of competing on energy consumption!
The Nvidia DGX Spark is exactly this and in the same price and performance bracket.
Sadly, memory bandwidth is abysmal compared to Apple chips - 273 GB/s vs 614 GB/s on M5 Max for similar price. Even though fp4 compute is faster, it doesn't help for all the decode heavy agentic workflows.
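A rough back-of-envelope check on why bandwidth dominates decode: each generated token has to stream roughly the model's active weights through memory once, so tokens/sec is bounded by bandwidth divided by bytes moved per token. The bandwidth figures below are the ones from the comment; the 35 GB of active weights is an assumed illustrative number, not a measurement of any specific model.

```python
def decode_tok_s(bandwidth_gb_s, active_weights_gb):
    """Upper-bound estimate for decode speed on a memory-bandwidth-bound
    workload: tokens/sec ~= bandwidth / bytes streamed per token."""
    return bandwidth_gb_s / active_weights_gb

# 273 GB/s and 614 GB/s are the figures quoted above; 35 GB of active
# weights is an assumption for illustration.
spark = decode_tok_s(273, 35)
m5max = decode_tok_s(614, 35)
print(f"{spark:.1f} tok/s vs {m5max:.1f} tok/s")
```

Under this simple model the 614/273 bandwidth ratio translates directly into a ~2.2x decode-speed gap, regardless of how fast the fp4 compute is.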
You can still buy used 3090 cards on ebay. 5 of them will give you 120GB of memory and will blow away any mac in terms of performance on LLM workloads. They have gone up in price lately and are now about $1100 each, but at one point they were $700-800 each.
I don't see how 5x 3090's is a better option than an M3 Ultra Mac studio.
The Mac will just work for models as large as 100B, and can go higher with quantized models. And power draw will be 1/5th as much as the 3090 setup.
You can certainly daisy chain several 3090's together but it doesn't work seamlessly.
> You can certainly daisy chain several 3090's together
It's not "daisy chaining"; the 3090 has NVLink.
> The mac will just work for models as large as 100B, can go higher with quantized models. And power draw will be 1/5th as much as the 3090 setup.
This setup will work for 100B models as well. And yes, the Mac will draw less power, but the Nvidia machine will be many times faster. So depending on your specific Mac and your specific Nvidia setup, the performance per watt will be in the same ballpark. And higher absolute performance is certainly a nice perk.
> You can certainly daisy chain several 3090's together but it doesn't work seamlessly.
Citation needed; there's no "daisy chaining" in the setup I describe, and low-level libraries like PyTorch, as well as higher-level tools like Ollama, all seamlessly support multiple GPUs.
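The usual way these tools use multiple GPUs for a single model is a pipeline-style split: consecutive layers are packed onto each card until its memory budget runs out, then placement moves to the next card. This is a simplified pure-Python sketch of that placement logic; the `assign_layers` helper, the 80-layer count, and the ~1.4 GB/layer figure are assumptions for illustration, and real splits also budget for the KV cache and activations.

```python
def assign_layers(n_layers, gpu_mem_gb, layer_gb):
    """Greedy pipeline split: walk through the layers in order and pack
    them onto each GPU until its memory budget is exhausted, then move
    on to the next GPU. Returns a {layer_index: gpu_index} placement."""
    placement, gpu, used = {}, 0, 0.0
    for layer in range(n_layers):
        if used + layer_gb > gpu_mem_gb[gpu]:
            gpu, used = gpu + 1, 0.0
            if gpu >= len(gpu_mem_gb):
                raise MemoryError("model does not fit on these GPUs")
        placement[layer] = gpu
        used += layer_gb
    return placement

# Five 24 GB cards (the 3090 setup above) and a hypothetical
# 80-layer model at ~1.4 GB per layer.
plan = assign_layers(80, [24.0] * 5, 1.4)
print({g: sum(1 for v in plan.values() if v == g) for g in sorted(set(plan.values()))})
```

During decode only one pipeline stage is active per token, so this split trades some parallel efficiency for fitting a model no single card could hold.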
Where are you gonna find Apple hardware with 128GB of memory at an enthusiast-compatible price?
The cheapest Apple desktop with 128GB of memory shows up as costing $3499 for me, which isn't very "enthusiast-compatible", it's about 3x the minimum salary in my country!
Apple is not catering to minimum salaries in poor countries. Does this really need to be explained?
$3499 is definitely enthusiast compatible. That's beefy gaming PC tier, which is possibly the canonical example of an enthusiast market.
This isn't tens of thousands of dollars for top tier Nvidia chips we're talking about.
Seems I misunderstood what an "enthusiast" is. I thought it was about someone "excited about something", but it seems the typical definition includes having a lot of money too, my bad.
$1,200 as the minimum salary probably covers 70% of Europe by population?
Did you need to add "poor"? Unless Apple isn't catering to the US.
I spent around that on my current personal desktop: 9950X, 2x48GB DDR5-6000, RX 9070 XT, 4TB Gen 5 NVMe + 4TB Gen 4 NVMe. I could have cut the CPU to a 9800X3D and the RAM to 32GB with a different GPU if my needs/usage were different. I'm running Linux and don't game much.
That said, a higher-end gaming setup is going to cost that much and is absolutely in the enthusiast realm. "Enthusiast" doesn't mean compatible with "minimum wage".
The original Mac with 128KB of memory cost $2,495 when Apple released it in 1984. It would be about 3x that in today's money.
I came here to say the same. Even at my student-discount price of $1,000, that's over $3K in today's dollars.
We are so freaking spoiled by the cheap cost of compute now.
> it's about 3x the minimum salary in my country!
Enthusiast compute hardware doesn't cater to the people on the minimum salary in any country, let alone developing nations. When Ferrari makes a car they don't ask themselves if people on minimum salary will be able to afford them.
I'm in one of the two poorest EU member states, and Apple and Microsoft Xbox don't even bother to have a direct-to-customer store presence here; you buy them from third-party retailers.
Why? Probably because their metrics show people here are too poor to afford their products en masse, so it's not worth operating a dedicated sales entity. Even though plenty of people do own top-of-the-line MacBooks here, it's just the wealthy enthusiast niche, and it's still a niche for the volumes they (wish to) operate at. Why do you think Apple launched the Mac Neo?
Right, I think maybe we're then talking about "upper-class enthusiasts" or something in reality? I understood it to just be about the person, not what economic class they're in; maybe I misunderstood.