
Comment by arjie

15 hours ago

I was looking into this for LLMs, but it's clearly a graphics-focused card. The memory bandwidth is too low for that much RAM to be useful in an LLM context. The 5090 I have has the same amount of RAM but far more bandwidth, which makes it much more useful.

Oh wow, I really would've expected higher memory bandwidth. That's only ~2-3x the little DGX Spark-alike I have to play with.
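The bandwidth point above can be made concrete: single-batch LLM decoding is memory-bound, since every generated token has to stream the full set of weights from VRAM. A rough ceiling is bandwidth divided by model size. The sketch below uses the RTX 5090's published ~1.8 TB/s figure; the other bandwidth and the model size are illustrative assumptions, not specs of the card being discussed.

```python
def decode_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper-bound tokens/s for memory-bound, batch-1 decoding:
    each token requires reading all weights once from VRAM."""
    return bandwidth_gb_s / model_size_gb

# Assumed workload: a ~32B-parameter model quantized to ~4 bits (~17 GB).
model_gb = 17.0

cards = [
    ("hypothetical ~270 GB/s card", 270.0),     # assumed, DGX Spark-class
    ("RTX 5090 (~1792 GB/s, spec)", 1792.0),    # published spec
]
for name, bw in cards:
    print(f"{name}: ~{decode_tokens_per_sec(bw, model_gb):.0f} tok/s ceiling")
```

Real throughput lands below this ceiling (KV-cache reads, kernel overhead), but the ratio explains why the same 32 GB is worth much more on a high-bandwidth card.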

> it's clearly a graphics-processing focused card.

Yes, that's what the G in GPU stands for. It's great to see that there are still manufacturers that understand this.

It’s 32 GB for people who can’t go for scalped 5090s but have a 3090 budget.

I have a pair of them with a 9480 and the only thing I have to do is keep the cache happy.

  • Eh. Trading CUDA for 8 more gigs seems like a bad deal, unless you know for certain that what you want to run will run on it.

    • Until NVidia prices get better, I’ll build out with the Intel stack and keep the cache (and prompt processing speeds) happy.

      As for software, anything that has a SYCL or Vulkan backend, and/or can be Intel-optimized (especially to the same degree as llama.cpp) can run well.
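      For reference, llama.cpp's SYCL backend is enabled at build time with a CMake flag. The sketch below follows the steps in llama.cpp's SYCL backend documentation, assuming the Intel oneAPI toolkit is installed at its default path; verify flags against the current repo docs before relying on them.

      ```shell
      # Load the oneAPI environment (icx/icpx compilers, SYCL runtime) -
      # path assumes the default oneAPI install location.
      source /opt/intel/oneapi/setvars.sh

      # Configure llama.cpp with the SYCL backend and Intel compilers,
      # then build in Release mode.
      cmake -B build -DGGML_SYCL=ON \
            -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
      cmake --build build --config Release
      ```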