Comment by ACCount37
1 day ago
Nah, that's just plain wrong.
First, GPGPU is powerful and flexible. You can make an "AI-specific accelerator", but it wouldn't be much simpler or much more power-efficient, and it would be a lot less flexible. And since consumer hardware needs to run both traditional graphics and AI workloads? It makes sense to run both on the same hardware.
And bandwidth? GPUs are not exactly notorious for being bandwidth starved. 4K@60FPS sounds like a lot of data to push in or out, but it's nothing compared to what modern PCIe 5.0 x16 can move. AI accelerators are more of the same.
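Quick back-of-envelope check on that claim, with my own assumed numbers (uncompressed 4K RGBA frames, ~64 GB/s nominal per-direction PCIe 5.0 x16), not figures from the comment:

```python
# One uncompressed 4K frame in RGBA8 (4 bytes per pixel)
frame_bytes = 3840 * 2160 * 4

# Data rate for pushing 60 such frames per second, in GB/s
display_gbps = frame_bytes * 60 / 1e9

# PCIe 5.0 x16 is ~63-64 GB/s per direction (32 GT/s/lane, 128b/130b)
pcie5_x16_gbps = 64

print(f"4K@60 RGBA: ~{display_gbps:.1f} GB/s")
print(f"PCIe 5.0 x16: ~{pcie5_x16_gbps} GB/s "
      f"(~{pcie5_x16_gbps / display_gbps:.0f}x headroom)")
```

So even raw, uncompressed 4K@60 is roughly 2 GB/s, around 1/30th of what the link can carry.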
GPUs might not be bandwidth starved most of the time, but they absolutely are when generating text from an LLM. It's the whole reason Nvidia is pushing low-precision floating-point formats.
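A rough sketch of why: during decode, each generated token has to read (roughly) every weight once, so throughput is capped by memory bandwidth over weight size. The numbers below (70B parameters, ~1 TB/s HBM-class bandwidth) are illustrative assumptions, not from the thread:

```python
# Decode is memory-bound: tokens/s <= memory_bandwidth / weight_bytes.
# Illustrative model/hardware numbers (assumed):
params = 70e9        # 70B-parameter model
mem_bw = 1.0e12      # ~1 TB/s memory bandwidth

for fmt, bytes_per_param in [("FP16", 2), ("FP8", 1), ("FP4", 0.5)]:
    weight_bytes = params * bytes_per_param
    print(f"{fmt}: ~{mem_bw / weight_bytes:.0f} tokens/s upper bound")
```

Halving the bytes per weight roughly doubles the decode ceiling, which is exactly the appeal of lower precision.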
That's memory bandwidth, not I/O. Unless your LLM doesn't fit into VRAM.
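The gap between the two is easy to put numbers on. With assumed illustrative figures (70B model at FP16, ~1 TB/s VRAM bandwidth, ~64 GB/s PCIe 5.0 x16), the bound collapses once weights have to stream over the bus each token:

```python
# If the model fits in VRAM, decode is bound by VRAM bandwidth;
# if weights stream over PCIe every token, PCIe becomes the bound.
# All numbers below are assumed for illustration.
weight_bytes = 70e9 * 2   # 70B params at FP16
vram_bw = 1.0e12          # ~1 TB/s VRAM bandwidth
pcie_bw = 64e9            # ~64 GB/s PCIe 5.0 x16

print(f"weights in VRAM:   ~{vram_bw / weight_bytes:.1f} tokens/s bound")
print(f"weights over PCIe: ~{pcie_bw / weight_bytes:.2f} tokens/s bound")
```

Roughly a 15x drop, which is why "doesn't fit into VRAM" is the case where I/O actually starts to matter.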