Comment by FarmerPotato

1 year ago

You have to consider that modern CPUs don't execute code strictly in order. They execute it out of order and speculatively, across multiple instruction pipelines.
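To make that concrete, here is a rough C sketch (my own illustration, with made-up function names) of two loops that compute the same sum. The first has a data-dependent branch, so on a speculating CPU its running time can depend on how predictable the data makes that branch, even though the instruction count stays the same. The second avoids the branch with a mask. Take it as a sketch only: an optimizing compiler may well turn the first loop into a conditional move or vectorize both.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy version: the `if` is a conditional branch the CPU has to predict.
   Hypothetical helper name, for illustration only. */
int64_t sum_above_branchy(const int32_t *data, size_t n, int32_t threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= threshold)
            sum += data[i];
    }
    return sum;
}

/* Branchless version: the comparison yields 0 or 1, which is negated
   into an all-zeros or all-ones mask and ANDed with the element. */
int64_t sum_above_branchless(const int32_t *data, size_t n, int32_t threshold)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int64_t mask = -(int64_t)(data[i] >= threshold); /* 0 or -1 */
        sum += (int64_t)data[i] & mask;
    }
    return sum;
}
```

Both functions return the same value for the same input. Which one is faster depends on the data and the machine, which is the kind of thing you only find out by measuring.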

I've used Intel's icc compiler and profiler tools in an iterative fashion. A compiler like Intel's might be made to profile cache misses, pipeline utilization, branches, and stalls, and supposedly use the results to improve the next compilation.

The assembly programmer has to consider those factors. Sure would be nice to have a computer check those things!

In the old days, we only worried about cycle counts, wait states, and number of instructions.
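
For anyone who never did it, that old-style bookkeeping amounted to something like the C sketch below: add up each instruction's listed cycles, plus the wait states paid on each memory access. The struct, the function name, and any numbers fed to it are hypothetical and not taken from a real data book. It also assumes a simple in-order machine, which is exactly the assumption that no longer holds.

```c
#include <stddef.h>

/* One instruction on a hypothetical in-order machine. */
struct instr {
    unsigned base_cycles;   /* cycles listed for the instruction */
    unsigned mem_accesses;  /* memory accesses, each paying the wait states */
};

/* Static estimate: sum of base cycles plus wait states per memory access. */
unsigned long estimate_cycles(const struct instr *prog, size_t n,
                              unsigned wait_states)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += prog[i].base_cycles
               + (unsigned long)prog[i].mem_accesses * wait_states;
    }
    return total;
}
```

With three made-up instructions costing 14, 10, and 8 cycles and making 4, 3, and 1 memory accesses, the estimate is 32 cycles with zero wait states and 48 cycles with two wait states per access.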