Intel's Clearwater Forest could be shipping even sooner, 288 cores. https://chipsandcheese.com/p/intels-clearwater-forest-e-core...
It's a smaller, denser core, but still incredibly promising and very neat.
Someone needs to try running Crysis on that bad boy using the D3D WARP software rasterizer. No GPU, just an army of CPU cores trying their best. For science.
This has already been tried :)
IIRC, back in 2016 a quad-core Intel CPU ran the original Crysis at ~15 fps
I wonder what Ampere (mentioned in that article) is going to do. At this rate they’ll need to release a 1,000-core chip just to be noticeably “different.”
At some point won't the bandwidth requirements exceed the number of pins you can fit within the available package area? Presumably you'll end up back at a GPU-style design: low maximum memory, high bandwidth.
I wonder how many of these you could cram into 1U? And what the maximum next gen kW/U figure looks like.
Unfortunately Ampere has fallen pretty far behind AMD. I don't see much point to their recent CPUs.
"E-cores" are not the same
The 32 core / die AMD products are almost certainly Zen 6c, which is the same "idea" as Intel E-Cores albeit way less crappy.
https://www.techpowerup.com/forums/threads/amd-zen-6-epyc-ve...
EDIT: actually, now that I think about it some more, my characterization of Zen-C cores as the same "idea" as Intel E-cores was pretty unfair too; they serve the same market but the implementation is so much less silly that it's a bit daft to compare them. Intel E-cores have different IPC, different tuning characteristics, and different feature support (i.e., they are usually a different uarch), which makes them really annoying to deal with. Zen C cores are usually the same cores with less cache and sometimes fewer or narrower ports, depending on the specific configuration.
6 replies →
By what logic?
9 replies →
Ah, I omitted to mention that with 256 cores, you get 512 threads.
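For anyone sanity-checking the thread math, a throwaway Python sketch (the 256-core figure is from the comment above; SMT-2 is an assumption about the part):

```python
# Assumed topology: a 256-core SMT-2 part, as in the comment above.
physical_cores = 256
threads_per_core = 2   # SMT-2: two hardware threads per core
logical_cpus = physical_cores * threads_per_core
print(logical_cpus)    # -> 512
# Caveat: tools like os.cpu_count() / nproc report *logical* CPUs,
# so they'd say 512 on such a box, not 256.
```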
32 cores on a die, 256 on a package. Still stunning though
How do people use these things? Map MPI ranks to dies, instead of compute nodes?
Yeah, there's an option to configure one NUMA node per CCD that can speed up some apps.
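To make the rank-to-die mapping concrete, here's a hedged Python sketch. The layout is hypothetical, not from the article: a 256-core package of 8 CCDs with 32 cores each, with the BIOS option above giving one NUMA node per CCD.

```python
# Hypothetical layout: 256 cores on a package, 8 CCDs x 32 cores,
# BIOS configured for one NUMA node per CCD (L3-as-NUMA).
CORES_PER_CCD = 32
NUM_CCDS = 8

def cores_for_rank(rank: int) -> list[int]:
    """One MPI rank per die: return the core IDs that rank should bind to."""
    assert 0 <= rank < NUM_CCDS, "one rank per CCD in this scheme"
    first = rank * CORES_PER_CCD
    return list(range(first, first + CORES_PER_CCD))

# Rank 0 owns cores 0-31, rank 1 owns cores 32-63, and so on;
# each rank then spawns its threads (e.g. OpenMP) only within its own die,
# so every memory access stays on the local NUMA node's memory channels.
print(cores_for_rank(1)[0], cores_for_rank(1)[-1])  # -> 32 63
```

In practice you'd get the same effect without custom code via the MPI launcher's binding options (e.g. Open MPI's `--map-by`/`--bind-to`); the sketch just shows the arithmetic.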
MPI is fine, but have you heard of threads?
7 replies →
640 cores should be enough for anyone
Tell that to Nvidia, Blackwell is already up to 752 cores (each with 32-lane SIMD).
640K cores should be enough for everyone.
B200 is 148 SMs, so no
3 replies →
That's going to run Cities Skylines 2 ~~really really well~~ as well as it can be run.
Does it actually scale well to that many cores? If so, that's quite impressive; most video game simulations of that kind benefit more from a few fast cores, since parallelizing simulations well is difficult
No, see https://m.youtube.com/watch?v=44KP0vp2Wvg . You're right, it didn't scale that well
1 reply →
these big high-core systems do scale really well on the workloads they're intended for. not games, desktops, web/db servers, or lightweight stuff like that. but scientific and engineering work - simulations and the like - they fly! enough that the HPC world still tends to use dual-socket servers. maybe less so for AI, where at least in the past you'd only need a few cores per hefty GPU - possibly KV-cache work is giving CPUs more to do...
6 replies →
Nope, see https://m.youtube.com/watch?v=44KP0vp2Wvg . Just didn't scale enough
I’m gonna get one of these and I’m just gonna play DOOM on it.