Comment by mgaunard

2 days ago

Running more code per unit of data does not make the code hotter or reduce the register pressure, quite the opposite...

You’re misunderstanding: you just convert to 32 bits once and reuse that same register all the time.

You’re running the exact same code, but more efficiently, in the sense of “I use the data for comparison immediately after converting it”, which means it’s likely already in a register or in L1 cache.
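To make the distinction concrete, here is a rough sketch of the two shapes being argued about. The thread doesn't say what is being widened to 32 bits, so the 16-bit input, the "count matches" comparison, and the names `count_two_pass` and `count_fused` are all assumptions made up for illustration. Whether the fused version really keeps the value in a register is up to the compiler and target, so treat this as a sketch to benchmark, not a guaranteed win.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical task: count the 16-bit units that equal a 32-bit key.

// Two-pass shape: widen the whole input into a temporary 32-bit buffer,
// then walk that buffer again to compare. For large inputs the widened
// data may have left L1 by the time the second loop reads it.
std::size_t count_two_pass(const std::vector<std::uint16_t>& in,
                           std::uint32_t key) {
    std::vector<std::uint32_t> wide(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        wide[i] = in[i];  // convert to 32 bits, store to memory
    }
    std::size_t matches = 0;
    for (std::size_t i = 0; i < wide.size(); ++i) {
        matches += (wide[i] == key);  // reload, then compare
    }
    return matches;
}

// Fused shape: convert once and compare right away, so the widened
// value only ever needs to live in a local (likely a register).
std::size_t count_fused(const std::vector<std::uint16_t>& in,
                        std::uint32_t key) {
    std::size_t matches = 0;
    for (std::uint16_t unit : in) {
        const std::uint32_t v = unit;  // convert to 32 bits once
        matches += (v == key);         // use it immediately
    }
    return matches;
}
```

Both functions do the same conversion and the same comparison per element and return the same count; the only difference is whether the converted value takes a round trip through a temporary buffer before it is used.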