← Back to context

Comment by p_l

14 hours ago

Even many of the more complex instructions often can translate into surprisingly short sequences - all sorts of loop structures have now various kinds of optimizations including instruction fusion that probably would not be necessary if we didn't stop using higher level LOOP constructs ;-)

But for example REP MOVS now is fused into equivalent of using SSE load-stores (16 bytes) or even AVX-512 load stores (64 bytes).

And of course equivalent of LEA by using ModRM/SIB prefixes is pretty much free with it being AFAIK handled as pipeline step