Comment by immibis
4 days ago
Indeed, scheduling instructions into aligned blocks that can issue in parallel is menial work that's usually best done by a machine. Each CPU has different preferences, though, so it only works well if the machine knows which kind of CPU the code will actually run on.
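To make the scheduling point concrete, here is a minimal sketch, assuming GCC or Clang. The function name `saxpy` is just an illustrative choice, not something from Eigen. The source is the same for every target; what changes is what the compiler is told about the CPU.

```cpp
#include <cstddef>
#include <vector>

// Illustrative helper (hypothetical name): y[i] += a * x[i].
// The loop is written plainly on purpose; choosing vector width and
// instruction order is left to the compiler.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    // Only touch the elements both vectors have.
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Built with something like `g++ -O3 -march=native`, the compiler may use the build machine's instruction set and tune its scheduling for that CPU. Built with plain `g++ -O3`, it has to target a generic baseline. The catch is that a `-march=native` binary is only a good fit (and may only run at all) on CPUs like the one it was built on.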
Eigen certainly uses a bunch of optimizations, including SIMD, but also things like FFTs and matrix decompositions.