
Comment by pjc50

2 years ago

> Int to string is especially egregious because it requires a lot of division which is basically the slowest (common) thing you can do on a CPU.

Yeah. There's some great benchmarking and tips: https://www.zverovich.net/2013/09/07/integer-to-string-conve...

Godbolt example: https://godbolt.org/z/M4b353PKv of which the meat is

        imul    rcx, rcx, 1374389535
        shr     rcx, 37
        imul    edi, ecx, 100
        mov     r8d, edx
        sub     r8d, edi
        movzx   edi, WORD PTR .LC2[r8+r8]
        mov     WORD PTR [rsi], di

So if our example input is 12345, the first two instructions compute "123" with a multiply rather than a divide (1 clock, latency 4). This is a fixed-point reciprocal rather than a true field inverse: 1374389535 is 2^37/100 rounded up, so multiplying by it and shifting right by 37 gives the same result as dividing by 100. The next imul multiplies back up to get "12300", and the sub takes that away from the original value to leave "45". That can then be looked up at position 45+45 in the string "00010203040506070809101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899" to give you the two digits.
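For anyone who'd rather read it as C than as assembly, here's a rough sketch of the same steps. The names (DIGIT_PAIRS, div100, emit_low_pair) are made up for illustration and are not taken from the Godbolt link; it assumes a 32-bit unsigned input, which is the range where this particular constant and shift are exact.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "00" "01" ... "99" packed back to back; the pair for v starts at index 2*v.
   (Hypothetical name; plays the role of .LC2 in the assembly.) */
static const char DIGIT_PAIRS[201] =
    "0001020304050607080910111213141516171819"
    "2021222324252627282930313233343536373839"
    "4041424344454647484950515253545556575859"
    "6061626364656667686970717273747576777879"
    "8081828384858687888990919293949596979899";

/* n / 100 without a divide: 1374389535 == ceil(2^37 / 100),
   so multiply in 64 bits and shift right by 37 (the imul + shr pair). */
static uint32_t div100(uint32_t n) {
    return (uint32_t)(((uint64_t)n * 1374389535u) >> 37);
}

/* Writes the last two decimal digits of n to out[0..1] and returns n / 100. */
static uint32_t emit_low_pair(uint32_t n, char *out) {
    uint32_t q = div100(n);   /* 12345 -> 123 */
    uint32_t r = n - q * 100; /* 12345 - 12300 -> 45 (the imul + sub pair) */
    assert(r < 100);
    memcpy(out, &DIGIT_PAIRS[2 * r], 2); /* one 2-byte copy, like the WORD load/store */
    return q;
}
```

Calling emit_low_pair in a loop, writing from the end of the buffer backwards until the quotient drops below 100, would produce the full decimal string two digits at a time.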