Comment by andrewstuart

1 year ago

AMD Strix Halo APU is a CPU with very powerful integrated GPU.

It’s faster at AI than an Nvidia RTX4090, because 96GB of the 128GB can be allocated to the GPU memory space. This means it doesn’t suffer the same swapping/memory thrashing that a discrete GPU experiences when processing large models.

16 CPU cores and 40 GPU compute units sounds pretty parallel to me.

Doesn’t that fit the bill?

> It’s faster at AI than an Nvidia RTX4090, because 96GB of the 128GB can be allocated to the GPU memory space

Definitely not. The RTX4090 uses fast graphics RAM (usually a previous-generation memory type, but overclocked and on a very wide bus). AMD Strix Halo uses standard DDR5, which is nowhere near as fast.

And yes, the Strix Halo GPU has a "3D cache", but as AMD officials said, the CPU doesn't get access to the GPU cache, because they "have not seen any app significantly benefit from such access".

So the internal SoC bus probably has lower latency than a discrete GPU over PCIe, but the difference shouldn't be large.
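The bandwidth argument above can be sketched with a back-of-envelope calculation. For single-stream LLM decoding, every token requires reading roughly the whole model from memory, so memory bandwidth divided by model size gives an upper bound on tokens per second. The bandwidth figures below are approximate published specs and the helper function is hypothetical, not a benchmark:

```python
# Back-of-envelope: dense-LLM decode speed is roughly bounded by
# memory bandwidth / bytes read per token (~ the model's size).
def tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on tokens/s for bandwidth-bound decoding."""
    return bandwidth_gb_s / model_size_gb

# Approximate published figures -- treat as assumptions:
RTX_4090_BW = 1008.0   # GB/s, GDDR6X on a 384-bit bus
STRIX_HALO_BW = 256.0  # GB/s, LPDDR5X-8000 on a 256-bit bus

llama_70b_q4 = 40.0    # GB, ~4-bit quantized 70B model

# Strix Halo: the whole model fits in its 96 GB GPU allocation,
# so the bandwidth bound is actually achievable.
print(tokens_per_second(STRIX_HALO_BW, llama_70b_q4))

# The 4090's raw bound is ~4x higher, but a 40 GB model does not fit
# in 24 GB of VRAM, so real throughput collapses to PCIe streaming speed.
print(tokens_per_second(RTX_4090_BW, llama_70b_q4))
```

This is why both sides of the thread can be right: the 4090 wins wherever the model fits in VRAM, and loses badly the moment it doesn't.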

It looks like it will be available in the Framework Desktop! I would love to see it in a more budget mini PC at some point from another company. (Framework is great but not in my price range.)

> It’s faster at AI than an Nvidia RTX4090, because 96GB of the 128GB can be allocated to the GPU memory space

I love AMD's Ryzen chips and will recommend their laptops over an Nvidia model all day. However, this is a pretty facetious comparison that falls apart when you normalize the memory. Any chip can be memory bottlenecked, and if we take away that arbitrary precondition the Strix Halo gets trounced in terms of compute capacity. You can look at the TDP of either chip and surmise this pretty easily.

  • > However, this is a pretty facetious comparison that falls apart when you normalize the memory

    Why would you normalize though? You can't buy a 96 GB RTX4090. So it's fair to compare the whole deal, slowish APU with large RAM versus very fast GPU with limited RAM.

  • “ AMD also claims its Strix Halo APUs can deliver 2.2x more tokens per second than the RTX 4090 when running the Llama 70B LLM (Large Language Model) at 1/6th the TDP (75W).”

    https://www.tomshardware.com/pc-components/cpus/amd-slides-c...

    You could argue it’s an invalid claim because it comes from AMD, not an independent source.

    • This is still a memory-constrained benchmark. The smallest Llama 70B model (gguf-q2) doesn't fit in the 4090's VRAM, so it's bottlenecked by the PCIe link. It's a valid benchmark, but it's still guilty of being stacked in the exact way I described before.

      A comparison of 7B/13B/32B model performance would actually test the compute performance of either card. AMD is appealing to the consumers that don't feel served by Nvidia's gaming lineup, which is fine but also doomed if Nvidia brings their DGX Spark lineup to the mobile form factor.
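The 7B/13B/32B cutoff above follows directly from quantized model sizes versus available memory. A rough sketch, assuming ~4.5 bits per weight (a typical q4_K_M-style GGUF; the exact ratio varies by quantization):

```python
# Rough quantized model footprint: params * bits-per-weight / 8 bits-per-byte.
BITS_PER_WEIGHT = 4.5  # assumption: typical ~4-bit GGUF quantization

def model_gb(params_billion: float) -> float:
    """Approximate on-disk/in-memory size of a quantized model, in GB."""
    return params_billion * BITS_PER_WEIGHT / 8

for params, vram in [(7, 24), (13, 24), (32, 24), (70, 24), (70, 96)]:
    size = model_gb(params)
    verdict = "fits" if size <= vram else "does NOT fit"
    print(f"{params}B @ ~{BITS_PER_WEIGHT} bpw ≈ {size:.1f} GB -> {verdict} in {vram} GB")
```

Everything up through 32B fits in the 4090's 24 GB, so those sizes would compare raw compute; only the 70B case forces the 4090 off-chip, which is exactly the regime AMD's slide picked.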