Comment by cyanydeez

14 hours ago

A competitor is already on the market, and it's x86: the AMD Ryzen AI Max+ 395.

Benchmarks with the DGX aren't spectacular, given NVIDIA's software and CUDA lead.

I wouldn't count on this being a price/compute challenger, especially with overpriced VRAM.

Strix Halo's 8060S GPU is very weak, roughly equivalent to a 4060 laptop GPU, whereas the GB10's GPU is equivalent to a desktop 5070. For LLM throughput, tok/s is similar because both are bottlenecked by memory bandwidth, but the GB10 has 3x faster prefill. People have also been able to squeeze much better performance out of the GB10 using NVFP4 and other improvements in the months after the DGX Spark launch, so don't be misled by early lackluster benchmarks. For the RTX Spark, which also targets gaming and creative applications, the 3x faster GPU is quite nice.
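To make the "similar tok/s, 3x faster prefill" point concrete, here is a rough roofline-style sketch. Every number in it is hypothetical (model size, quantization, prompt length, bandwidth, peak FLOP/s), and it ignores KV cache traffic and attention cost, so treat it as an illustration of the reasoning rather than a benchmark of either chip.

```python
# Rough roofline estimate: a step takes as long as the slower of
# "move the bytes" and "do the math". All numbers below are hypothetical.

def step_seconds(flops, bytes_moved, peak_flops, bandwidth):
    return max(flops / peak_flops, bytes_moved / bandwidth)

PARAMS = 30e9                  # hypothetical dense model, 30B parameters
WEIGHT_BYTES = PARAMS * 0.5    # assumed 4-bit weights, 0.5 byte per parameter
PROMPT_TOKENS = 4096           # hypothetical prompt length

BANDWIDTH = 270e9              # bytes/s, assumed equal on both machines
SLOW_GPU = 30e12               # hypothetical peak FLOP/s
FAST_GPU = 3 * SLOW_GPU        # the "3x faster GPU" case

def decode_tok_per_s(peak_flops):
    # One new token: about 2 FLOPs per parameter, and every weight is read once.
    return 1.0 / step_seconds(2 * PARAMS, WEIGHT_BYTES, peak_flops, BANDWIDTH)

def prefill_seconds(peak_flops):
    # Whole prompt in one pass: weights are read once, FLOPs scale with tokens.
    return step_seconds(2 * PARAMS * PROMPT_TOKENS, WEIGHT_BYTES,
                        peak_flops, BANDWIDTH)

for name, gpu in (("slow GPU", SLOW_GPU), ("fast GPU", FAST_GPU)):
    print(f"{name}: decode {decode_tok_per_s(gpu):.1f} tok/s, "
          f"prefill {prefill_seconds(gpu):.2f} s")
```

Under these assumptions decode is limited by the byte-moving term on both machines, so the extra compute changes nothing there, while prefill is limited by the math term and scales with the GPU.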

Or like an M4 Max? This thing has <300GB/s vs. the Max with 550GB/s.

All those CUDA cores in the Sparks, but they're starved for memory bandwidth.
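For a sense of scale, a back-of-the-envelope ceiling using the bandwidth figures quoted above. The 20 GB of weights is a made-up model size, and the formula assumes a dense model where every weight is read once per generated token, so real numbers will come in lower.

```python
def decode_ceiling_tok_per_s(bandwidth_gb_s, weights_gb):
    # Upper bound only: each generated token streams all weights from memory once.
    return bandwidth_gb_s / weights_gb

WEIGHTS_GB = 20.0  # hypothetical quantized dense model

# Bandwidth figures as quoted in this thread, in GB/s.
for name, bandwidth in {"Spark": 300.0, "M4 Max": 550.0}.items():
    ceiling = decode_ceiling_tok_per_s(bandwidth, WEIGHTS_GB)
    print(f"{name}: at most ~{ceiling:.1f} tok/s")
```

The ratio between the two ceilings is just the bandwidth ratio, no matter how many CUDA cores sit behind it.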

I am still waiting for NVIDIA to release a system that legit beats 3090 maxxing for the home gamer...

  •   Spark:
      OS: Windows/Ubuntu
      Memory bandwidth: 300GB/s
      CUDA cores: 6000
      GPU-accelerated containers: yes

      M5 Max:
      OS: macOS
      Memory bandwidth: 600GB/s
      CUDA cores: 0
      GPU-accelerated containers: no

    • I feel like the shape of the market right now for "home lab" inference is:

      The Sparks are good if your ultimate plan is to spend even more on NVIDIA hardware in the future to run your dev setups at usable speeds, or if you're developing for a work cluster.

      If you mainly want to run local models at acceptable speeds portably, buy a Mac with lots of RAM. If you’re happy with non-portable / racked, buy 3090s (dense) or Mac Studios (MoEs). Buy newer cards if you are restricted on power or slots. If you are rich, buy A6000 Blackwells.
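One way to read the "3090s (dense) or Mac Studios (MoEs)" split, sketched below: a dense model reads all of its weights for every token, while an MoE has to keep all experts in memory but only reads the active ones per token. The poster doesn't spell out this reasoning, so this is a guess at it, and both models and their sizes are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    total_params_b: float         # billions of parameters that must fit in memory
    active_params_b: float        # billions of parameters read per generated token
    bytes_per_param: float = 0.5  # assumed 4-bit quantization

    def memory_gb(self) -> float:
        return self.total_params_b * self.bytes_per_param

    def gb_read_per_token(self) -> float:
        return self.active_params_b * self.bytes_per_param

# Hypothetical models, for illustration only.
dense = Model("dense 70B", total_params_b=70, active_params_b=70)
moe = Model("MoE 200B with 20B active", total_params_b=200, active_params_b=20)

for m in (dense, moe):
    print(f"{m.name}: needs {m.memory_gb():.0f} GB, "
          f"reads {m.gb_read_per_token():.0f} GB per token")
```

With these made-up sizes the MoE needs roughly three times the memory but moves less than a third of the bytes per token, which favors big unified memory over raw bandwidth, and the dense model is the other way around.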

The only question is: is it worth suffering HIP and x86? I suspect a lot of folks might like a machine that mimics their GB300 but costs less than a DGX.

Also, I heard the tensor core instructions on the DGX are gimped and you’re better off with an RTX PRO x000. Is that the same with these machines?

Is CUDA really a lead for long? Aren’t all the latest competitive approaches avoiding all the standard software stacks and writing deeply customized software that is very directly tied to whatever hardware they use?

And is it really a way to lock in people? With AI coding tools, isn’t it trivial to write software on top of CUDA and rewrite it to target some other hardware?