Comment by david-gpu
17 hours ago
Look at my user profile. Divergence in modern NVIDIA GPUs does not work the way you think it does. A separate program counter per thread does not mean that each thread issues a different instruction on every clock. See section 3.2.2.1 of https://docs.nvidia.com/cuda/cuda-programming-guide/03-advan...
Of course divergence is sometimes unavoidable. That is why GPUs support it. But substantially divergent code comes at a significant cost.
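To put a rough number on that cost, here is a toy Python model. It is a deliberate simplification, not a description of how the hardware actually schedules: it assumes a 32-lane warp issues one instruction at a time for the lanes on the same path, with the other lanes masked off, so distinct paths get serialized. The function name `warp_issue_slots` and the path lengths are made up for illustration.

```python
# Toy model of SIMT issue cost. Hypothetical names and numbers throughout;
# real hardware scheduling is more involved than this.
WARP_SIZE = 32


def warp_issue_slots(branch_of_lane, path_lengths):
    """Count instruction-issue slots for one warp.

    branch_of_lane: the branch id each lane takes (one entry per lane).
    path_lengths:   number of instructions on each branch path.

    Assumption: lanes on the same path execute together, and each distinct
    path taken by at least one lane is issued separately, one after another.
    """
    assert len(branch_of_lane) == WARP_SIZE
    paths_taken = set(branch_of_lane)
    return sum(path_lengths[branch] for branch in paths_taken)


# Two branch paths of 100 instructions each (illustrative values).
path_lengths = {0: 100, 1: 100}

uniform = [0] * WARP_SIZE                             # every lane takes branch 0
divergent = [lane % 2 for lane in range(WARP_SIZE)]   # lanes alternate branches

uniform_slots = warp_issue_slots(uniform, path_lengths)
divergent_slots = warp_issue_slots(divergent, path_lengths)

print(uniform_slots, divergent_slots)
```

Under these assumptions the divergent warp needs twice the issue slots of the uniform one for the same per-thread work, and the gap grows with the number of distinct paths taken within a warp.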