Comment by hedora
4 hours ago
So, they're selling this as an AI accelerator, with drop-in compatibility with existing boards and no boost to RAM bandwidth.
As I understand things, it would be extremely unusual to ship a chip that was bound by floating-point throughput rather than by uncached memory access, especially in the desktop/laptop space.
I haven't been following the Intel server space too carefully, so this is an honest question: was the old part compute-limited rather than bandwidth-limited, or will this run inference at the same throughput (though perhaps with lower power consumption)?
No, they're not selling this as an "AI accelerator". Here is the quote:
"The company says operators deploying 5G Advanced and future 6G networks increasingly rely on server CPUs for virtualized RAN and edge AI inference, as they do not want to re-architect their data centers in a bid to accommodate AI accelerators."
Edge AI usually means very small models that run fine on CPUs.
A very small model is going to be, what, 8GB? That will easily blow through the caches, so you end up bottlenecked on DRAM either way.
So, I wonder if this is going to be any faster than the previous generation for edge AI.
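The DRAM-bound argument above can be sketched with back-of-envelope arithmetic: once the weights spill out of cache, each generated token has to stream the full model from memory, so decode throughput is capped at roughly bandwidth divided by model size. The bandwidth figures below are illustrative assumptions, not specs of any particular Intel part:

```python
# Back-of-envelope roofline for memory-bound LLM decode.
# Assumption: each generated token reads every weight once from DRAM,
# so tokens/sec <= memory bandwidth / model size.

model_bytes = 8e9   # the hypothetical "very small" 8 GB edge model above
cpu_bw = 90e9       # assumed server-CPU DRAM bandwidth, ~90 GB/s
gpu_bw = 900e9      # assumed GPU HBM/GDDR bandwidth, ~900 GB/s

def tokens_per_sec(bandwidth_bytes: float, model_bytes: float) -> float:
    """Upper bound on decode throughput when each token streams all weights."""
    return bandwidth_bytes / model_bytes

print(f"CPU-bound ceiling: ~{tokens_per_sec(cpu_bw, model_bytes):.1f} tok/s")
print(f"GPU-bound ceiling: ~{tokens_per_sec(gpu_bw, model_bytes):.1f} tok/s")
```

Under these assumed numbers the CPU tops out around 11 tok/s regardless of how much FP throughput the cores have, which is why a faster ALU with the same memory system would not speed up this workload.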
Perhaps instead of posting erroneous assertions to HN you could wander over to your LLM of choice and ask it something along the lines of: What are some examples of edge AI applications that achieve good performance on a CPU where memory bandwidth is severely limited compared to a GPU? Please link to publicly available models where possible.