Comment by dragontamer
2 days ago
I assume most people learn microarchitecture for performance reasons.
In that case, the question you are really asking is which aspects of assembly matter for performance.
Answer: there are multiple GPU matrix multiplication examples covering memory channels (especially channel conflicts), load/store alignment, memory movement, and more. That should cover the issue I talked about earlier.
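To make "channel conflict" concrete, here is a toy model in Python. It is only a sketch: the channel count, the interleave size, and the address-to-channel mapping are assumptions picked for illustration, and real hardware uses its own (often more complicated) mapping. The helper names `channel_of` and `channel_histogram` are made up for this example.

```python
from collections import Counter

# Assumed, illustrative parameters. Real GPUs differ; check the vendor guide.
NUM_CHANNELS = 8        # number of memory channels
INTERLEAVE_BYTES = 256  # consecutive bytes served by one channel before switching


def channel_of(byte_address):
    """Map a byte address to a channel under the simple interleaving assumed above."""
    return (byte_address // INTERLEAVE_BYTES) % NUM_CHANNELS


def channel_histogram(stride_bytes, num_accesses=64):
    """Count how many of num_accesses strided accesses land on each channel."""
    return Counter(channel_of(i * stride_bytes) for i in range(num_accesses))


# Walking down a column of a matrix whose row pitch is 2048 bytes
# (8 channels * 256 bytes): every access maps to the same channel.
conflicting = channel_histogram(2048)

# Padding each row by one interleave unit (2304 bytes) spreads the
# same column walk across all channels.
padded = channel_histogram(2304)

print("pitch 2048:", dict(conflicting))
print("pitch 2304:", dict(padded))
```

Under these assumptions the unpadded column walk serializes on one channel while the padded one uses all eight evenly, which is the kind of effect the matrix multiplication examples work around.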
Optimization guides help. I know it's 10+ years old, but I think AMD's OpenCL optimization guide was easy to read and follow, and it is still modern enough to cover most of today's architectures.
Beyond that, you'll have to look at conference talks about the new DirectX 12 instructions (wave instructions, ballot/voting, etc.) and their performance implications.
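For anyone who hasn't met ballot/voting before, here is a rough CPU-side simulation in Python of what those operations compute. This is a sketch of the semantics only, not any real shader API: the wave size of 32 is an assumption (it varies by hardware), the function names are hypothetical, and nothing here says anything about performance.

```python
WAVE_SIZE = 32  # assumed wave width; real hardware may use 32 or 64 lanes


def ballot(predicates):
    """Return a bitmask with bit i set when lane i's predicate is true."""
    mask = 0
    for lane, predicate in enumerate(predicates):
        if predicate:
            mask |= 1 << lane
    return mask


def vote_any(predicates):
    """True when at least one lane's predicate is true."""
    return ballot(predicates) != 0


def vote_all(predicates):
    """True when every lane's predicate is true."""
    return ballot(predicates) == (1 << len(predicates)) - 1


# Each lane holds its own value; here lane i simply holds i.
lane_values = list(range(WAVE_SIZE))
predicates = [value % 4 == 0 for value in lane_values]

mask = ballot(predicates)
active_count = bin(mask).count("1")  # number of lanes that voted true

print(hex(mask), active_count, vote_any(predicates), vote_all(predicates))
```

On a GPU the whole wave gets the mask back in one operation, so lanes can agree on a branch or count matching elements without going through memory.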
It's a mixed bag: everyone knows one or two optimization techniques, but learning all of them takes a lot of study.