StilesCrisis, 36 minutes ago:
It's just multiplication. Floating multiply is extraordinarily fast.

    lacedeconstruct, 29 minutes ago:
    The difference between 20 cycles and 1 clock cycle in a hot loop is very noticeable

    Sesse__, 7 minutes ago:
    Useful, then, that you can start several vectorized floating-point muls each cycle. (E.g., most modern x86 are 3/0.5 cycles for vmulps. No 20 cycles in sight.)
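
To put rough numbers on the "3/0.5" figure in the last reply, here is a back-of-the-envelope sketch. It assumes the notation means 3 cycles of latency and 0.5 cycles of reciprocal throughput (the usual reading, but an assumption here), and it uses an idealized pipeline with no memory stalls or other bottlenecks. The function names are made up for illustration and are not from any library.

```python
def dependent_chain_cycles(n, latency):
    """Each multiply consumes the previous result, so latencies add up."""
    return n * latency


def independent_cycles(n, latency, recip_throughput):
    """Independent multiplies overlap in the pipeline: a new one can issue
    every `recip_throughput` cycles, and the last one issued still needs
    `latency` cycles to complete."""
    if n == 0:
        return 0.0
    return (n - 1) * recip_throughput + latency


n = 1000  # number of multiply instructions in the hot loop
print(dependent_chain_cycles(n, 3))    # 3000
print(independent_cycles(n, 3, 0.5))   # 502.5
```

Under this model, even the worst case (a fully dependent chain) costs 3 cycles per multiply, not 20, and independent multiplies approach 0.5 cycles each. With 256-bit vmulps, each of those instructions multiplies 8 single-precision floats at once.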