saagarjha 20 days ago
Isn’t k-means memory bandwidth bound? What was the arithmetic intensity of the final code?

shihab 20 days ago
No. Assuming `k` is small enough, which in practice it often is, the arithmetic intensity of this kernel is 25-90 Flops/Byte, way above the roofline knee of any modern CPU.

NohatCoder 20 days ago
I assume that the image would at least fit in L3.
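For readers unfamiliar with the roofline argument in this exchange, here is a rough back-of-the-envelope sketch. It is not the calculation behind the 25-90 Flops/Byte figure quoted in the thread; it assumes a simplified model in which each point is read from memory once, the `k` centroids stay in cache, and each squared-distance term costs 3 FLOPs. The function names and the hardware numbers in the usage lines are hypothetical.

```python
def kmeans_assignment_intensity(k: int, d: int, bytes_per_value: int = 4) -> float:
    """Estimate FLOPs per byte of memory traffic for the k-means assignment step.

    Assumptions (illustrative, not taken from the thread):
    - each d-dimensional point is loaded from memory exactly once,
    - the k centroids stay resident in cache, so they add no traffic,
    - each squared-distance term costs 3 FLOPs (subtract, multiply, accumulate).
    """
    flops_per_point = 3 * d * k
    bytes_per_point = d * bytes_per_value
    return flops_per_point / bytes_per_point


def is_compute_bound(intensity: float, peak_flops: float, peak_bandwidth: float) -> bool:
    """Compare an arithmetic intensity against the roofline knee.

    The knee is peak FLOP/s divided by peak bytes/s; a kernel whose
    intensity exceeds it is limited by compute rather than bandwidth.
    """
    knee = peak_flops / peak_bandwidth
    return intensity > knee


# Hypothetical example: 64 clusters, 3 float32 channels per pixel.
intensity = kmeans_assignment_intensity(k=64, d=3)

# Hypothetical machine: 1 TFLOP/s peak compute, 100 GB/s memory bandwidth.
compute_bound = is_compute_bound(intensity, peak_flops=1e12, peak_bandwidth=1e11)
```

Under this model the dimension `d` cancels out, so the intensity depends only on `k` and the element size. Real kernels will differ depending on how centroids are cached and how the distance is computed.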