Comment by bigyabai

1 year ago

> It’s faster at AI than an Nvidia RTX4090, because 96GB of the 128GB can be allocated to the GPU memory space

I love AMD's Ryzen chips and will recommend their laptops over an Nvidia model all day. However, this is a pretty facetious comparison that falls apart when you normalize the memory. Any chip can be memory bottlenecked, and if we take away that arbitrary precondition the Strix Halo gets trounced in terms of compute capacity. You can look at the TDP of either chip and surmise this pretty easily.

> However, this is a pretty facetious comparison that falls apart when you normalize the memory

Why would you normalize, though? You can't buy a 96 GB RTX4090. So it's fair to compare the whole package: a slowish APU with large RAM versus a very fast GPU with limited RAM.

> AMD also claims its Strix Halo APUs can deliver 2.2x more tokens per second than the RTX 4090 when running the Llama 70B LLM (Large Language Model) at 1/6th the TDP (75W).

https://www.tomshardware.com/pc-components/cpus/amd-slides-c...

You could argue the claim is invalid because it comes from AMD rather than an independent benchmark.
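Taking AMD's numbers at face value (the 75 W figure in the slide is 1/6th of the 4090's 450 W board power), the implied efficiency gap works out like this:

```python
# Implied perf-per-watt from AMD's claim, taken at face value (assumption):
# 2.2x the tokens/sec of an RTX 4090 at 1/6th of its TDP (75 W vs 450 W).
tps_ratio = 2.2        # claimed tokens/sec advantage
power_ratio = 1 / 6    # claimed power draw relative to the 4090

efficiency_gain = tps_ratio / power_ratio
print(f"Implied tokens-per-joule advantage: {efficiency_gain:.1f}x")  # 13.2x
```

A ~13x efficiency gap is far too large to come from architecture alone, which is consistent with the benchmark being memory-bound rather than compute-bound.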

  • This is still a memory-constrained benchmark. The smallest Llama 70B model (gguf-q2) doesn't fit in the 4090's 24 GB of VRAM, so inference is bottlenecked by the PCIe link. It's a valid benchmark, but it's still stacked in exactly the way I described before.

    A comparison of 7B/13B/32B model performance would actually test the compute capability of either chip. AMD is appealing to consumers who don't feel served by Nvidia's gaming lineup, which is fine, but also doomed if Nvidia brings its DGX Spark lineup to the mobile form factor.
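The "doesn't fit in memory" point can be sketched with back-of-envelope numbers. All figures here are assumptions, not measurements: llama.cpp's Q2_K format nominally averages ~2.6 bits per weight, but some tensors are kept at higher precision, so published 70B Q2_K files land closer to an effective ~3.3 bits per weight overall.

```python
# Rough sizing (assumed figures): does a Q2-quantized 70B model fit in an
# RTX 4090's 24 GB of VRAM, or in 96 GB of GPU-addressable unified memory?

def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of the quantized weights alone."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# ~3.3 bpw effective for 70B Q2_K (assumption based on typical file sizes).
q2_70b = model_size_gb(70, 3.3)
print(f"70B @ ~3.3 bpw effective: {q2_70b:.1f} GB")  # ~28.9 GB

# The weights alone overflow the 4090's 24 GB before counting KV cache and
# runtime overhead, so layers spill to system RAM over PCIe; the same model
# fits comfortably inside a 96 GB GPU memory allocation.
for name, budget_gb in [("RTX 4090 VRAM", 24), ("Strix Halo GPU share", 96)]:
    verdict = "fits" if q2_70b <= budget_gb else "does not fit"
    print(f"{name} ({budget_gb} GB): weights alone {verdict}")
```

Running the same arithmetic for a 32B model at the same quantization gives roughly 13 GB, which is why 7B/13B/32B runs would keep both chips out of the PCIe-bound regime and actually compare compute.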