Comment by ben-schaaf

1 day ago

> x86-64 is typically limited to about 5 instructions

Intel's Lion Cove decodes 8 instructions per cycle and can retire 12. Intel Skymont's triple decoder can even do 9 instructions per cycle, and that's without a micro-op cache.

AMD's Zen 5, on the other hand, has a 6K-entry op cache that allows for 8 instructions per cycle, but still only a 4-wide decoder for each SMT thread.

And yet AMD is still ahead of Intel in both performance and performance-per-watt. So maybe this whole instruction decode thing is not as important as people are saying.