Comment by jeffbee

18 days ago

"Intel CPUs were downclocking their frequency when using AVX-512 instructions due to excessive energy usage (and thus heat generation) which led to performance worse than when not using AVX-512 acceleration."

This is an overstatement so gross that it can be considered false. On Skylake-X, for mixed workloads that only had a few AVX-512 instructions, a net performance loss could have happened. On Ice Lake and later this statement was not true in any way. For code like ChaCha20 it was not true even on Skylake-X.

This was written in the past tense, and was true in the last decade. Only recently Intel came up with proper AVX-512

  • "Recently" is 6 years ago, so not so recent.

    Intel's real mistake was segregating its desktop/laptop CPUs from its server CPUs by ISA: it removed AVX-512 from the former soon after it had started shipping decent AVX-512 implementations. This doomed AVX-512 until AMD provided it again in Zen 4, which eventually forced Intel to reintroduce it in Nova Lake, expected by the end of this year.

    Even the problems of Skylake Server and its derivatives were not really caused by their AVX-512 implementation, which still had much better energy efficiency than their AVX2 implementation. They were caused by an obsolete mechanism for varying the CPU's supply voltage and clock frequency. That mechanism was far too slow, so it had to use an inappropriate algorithm to guarantee that the CPUs would not be damaged.

    The bad algorithm for frequency/voltage control was what caused the performance problems of AVX-512. Just a few AVX-512 instructions could preemptively lower the clock frequency for times comparable to a second, because the CPU feared that if more AVX-512 instructions arrived later, it would be unable to lower the voltage and frequency fast enough to prevent overheating.

    The contemporaneous Zen 1 had a much more agile mechanism for varying supply voltage and clock frequency, which was matched by Intel only recently, many years later.

  • It wasn't. My comment covers the entire history of the ISA extension on Intel Xeon CPUs.
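
To make the "mixed workload" argument above concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function name `net_speedup` and every number passed to it are made-up illustrations, not measurements of any real CPU. It assumes the whole AVX-512 portion runs at a reduced clock and that some share of the surrounding scalar code also runs at that reduced clock because the frequency stays low for a while after the vector instructions.

```python
def net_speedup(vector_fraction, vector_speedup, clock_ratio, throttled_share):
    """Toy model of AVX-512 speedup relative to a baseline runtime of 1.0.

    All parameters are hypothetical inputs, not measured values:
      vector_fraction  - share of baseline runtime that can use AVX-512
      vector_speedup   - per-clock speedup of that share when vectorized
      clock_ratio      - reduced clock divided by normal clock (e.g. 0.8)
      throttled_share  - share of scalar code that also runs at the
                         reduced clock because the frequency stays low
    """
    scalar_fraction = 1.0 - vector_fraction
    # Vector code gets faster per clock but runs at the reduced clock.
    vector_time = vector_fraction / (vector_speedup * clock_ratio)
    # Scalar code caught in the low-frequency window is slowed down.
    scalar_time = scalar_fraction * (
        throttled_share / clock_ratio + (1.0 - throttled_share)
    )
    return 1.0 / (vector_time + scalar_time)


# Hypothetical case 1: a few AVX-512 instructions sprinkled through scalar code,
# keeping the core at the reduced clock the whole time.
sparse = net_speedup(vector_fraction=0.02, vector_speedup=2.0,
                     clock_ratio=0.8, throttled_share=1.0)

# Hypothetical case 2: a kernel that is almost entirely vectorized.
dense = net_speedup(vector_fraction=0.9, vector_speedup=2.0,
                    clock_ratio=0.8, throttled_share=1.0)

print(f"sparse AVX-512 use: {sparse:.2f}x, dense AVX-512 use: {dense:.2f}x")
```

With these invented inputs the sparse case comes out below 1x (a net loss) and the dense case comes out well above 1x, which matches the shape of both claims in the thread: the penalty bites when a handful of vector instructions drags down a lot of scalar code, not when the code is mostly vectorized.
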

I netted huge performance wins out of AVX-512 on my Skylake-X chips all the time. I'm excited about less downclocking and smarter throttling algorithms, but AVX-512 was great even without them -- mostly just hampered by poor hardware availability, poor adoption in software, and some FUD.