Comment by johngossman
3 hours ago
Oh boy... more memories. About a decade later at work I identified a bottleneck in our line-drawing code. The final step was to cast two floats (a point) to integers, which the compiler turned into _ftol() calls. Unfortunately, _ftol saved, changed, and restored the floating-point control word on every call in order to force truncation (the x87 default of round-to-nearest does not match the truncation toward zero that C requires for float-to-int casts). Even more unfortunately, changing the control word stalled the Pentium's instruction pipeline. Replacing the casts with a simple fld/fist pair, which converts using whatever rounding mode is already set, was another 100x speedup. A few years later I noticed that compilers started adding optimization flags to control this behavior.