Comment by xlii

10 hours ago

I wonder if this can be categorized as galactic algorithm. I can't imagine systems where bulk of processing goes into integer to decimal string conversion but maybe there are such.

https://en.wikipedia.org/wiki/Galactic_algorithm

My understanding of a Galactic Algorithm is that it has better performance scaling based on input size/complexity, but its overhead is such that it will not actually be faster unless you use it for impracticality large inputs.

I don’t think it has much to do with the case of an algorithm that offers a faster solution to a problem that is rarely a bottleneck (not sure if that’s true in this case anyway).

It takes a substantial amount of time when emitting lots of numbers in JSON, happens very commonly.

And this algorithm has low constant costs, and does not take dramatically more icache than the simple versions. There is no reason not to use this if your compile target can handle avx-512.

I don't know about this specifically, but I've seen a lot of big data jobs where 99% of the CPU was spent in JSON ser/deser. This might be a reasonable chunk of it.

export to large files that represent numbers in textual format for instance ? this can be the difference between "waiting 10 seconds when hitting ctrl-s" and "the software saving automatically on each change because it's unnoticeable"

> An example of a galactic algorithm is the fastest known way to multiply two numbers,[4] which is based on a 1729-dimensional Fourier transform.[5]

That number looked familiar, and yep it's the taxicab number. Coincidence? Neither of the two references seems to mention it

We used it for payment processing. We got huge CSVs from various APIs and used string decimals for computing to avoid overflows/underflows and rounding errors.

I always use binary interchange formats between programs so I am not familiar with the overhead caused by format conversions. Even when displaying numbers for reading them, in the case of floating-point numbers that are displayed in the "scientific" format, i.e. with exponents, I prefer to have only the exponent as a decimal number, but the significand as a hexadecimal number. So I do not need fast algorithms for number conversions.

Nonetheless, there are plenty of people who advocate the use of JSON, XML and similar formats, in which case I assume that number conversions can take a non-negligible time, which might be decreased by such fast algorithms.

  • You know, if can change code without overhead to ends of the pipeline, using the language & library of my choice, I’d do this too. For many of us this isn’t always the case.

It’s faster for 3 digits and more. 3 digits is not galactic scale. Otoh, if over half of your numbers are single digits, it will lose to other implementations. I think that is more often the case that we’d like it to be.