
Comment by groby_b

4 years ago

The difficulty here is "define n". And I don't mean that facetiously. Say you have a string-parsing lib. It is, for reasons, quadratic over the number of strings parsed, and linear in the length of each string.

This looks like n^3 overall, but that figure is meaningless, because there isn't just one n: it's really m^2 * n, for m strings of length n. That means you can't reduce it to a single term, because you want to keep both components. (Because, say, you know it will only ever be called with a single string.)
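For concreteness, here's a minimal sketch of one shape such a lib could take (everything here is hypothetical, names included): each new string gets checked character-by-character against every previously seen string.

```python
def shares_prefix(a: str, b: str) -> bool:
    """Character-by-character comparison: O(n) for strings of length n."""
    k = min(len(a), len(b))
    return k > 0 and a[:k] == b[:k]

def parse_all(strings: list[str]) -> list[list[str]]:
    """Hypothetical parser: each new string is compared against every
    previously seen one, so for m strings of length n the total cost
    is O(m^2 * n) -- quadratic in m, linear in n."""
    seen: list[str] = []
    parsed: list[list[str]] = []
    for s in strings:
        for prev in seen:            # up to m earlier strings
            shares_prefix(prev, s)   # each check is O(n)
        seen.append(s)
        parsed.append(s.split())     # the per-string parse, O(n)
    return parsed
```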

But then, in the next app, this gets called and reinitialized once per file. And the routine handling the files, for reasons beyond our ken, is O(k log k) over k files. We're now at k log k * m^2 * n.
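One made-up shape that genuinely produces that multiplicative bound, reusing `parse_all` from the sketch above: a file routine whose O(k log k) sort re-parses the files inside every comparison, so the parse cost rides along with each of the k log k steps.

```python
from functools import cmp_to_key

def handle_files(paths: list[str]) -> list[str]:
    """Hypothetical file handler: sorting performs O(k log k)
    comparisons, and this (badly designed) comparator re-initializes
    the parser on both files every time, so each comparison costs
    O(m^2 * n) and the whole routine is O(k log k * m^2 * n)."""
    def compare(a: str, b: str) -> int:
        with open(a) as fa, open(b) as fb:
            pa = parse_all(fa.read().splitlines())  # O(m^2 * n)
            pb = parse_all(fb.read().splitlines())  # O(m^2 * n)
        return (len(pa) > len(pb)) - (len(pa) < len(pb))
    return sorted(paths, key=cmp_to_key(compare))
```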

And so, over any sufficiently long call chain, "what is n" is the overriding question - string length, number of strings, number of files? Not "how complex is the algorithm", because you want to optimize for what's relevant to your use case.

It would be a huge step to simply follow the call tree and report the depth of nested loops for each branch. You could then check what N is at each level.
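As a rough sketch of what that could look like for Python source, using the standard ast module (note it only sees loops inside a single function body and does not follow calls, which is exactly the catch raised below):

```python
import ast

def max_loop_depth(node: ast.AST) -> int:
    """Deepest for/while nesting reachable below `node`. A real tool
    would also follow calls into other functions and record what each
    loop level actually iterates over (files, strings, characters...)."""
    best = 0
    for child in ast.iter_child_nodes(node):
        d = max_loop_depth(child)
        if isinstance(child, (ast.For, ast.AsyncFor, ast.While)):
            d += 1
        best = max(best, d)
    return best

src = """
def f(files):
    for fh in files:          # files
        for line in fh:       # strings
            for ch in line:   # characters
                pass
"""
print(max_loop_depth(ast.parse(src)))  # -> 3
```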

The trick is knowing where the nested loops are, since they can be spread across functions.

I had a function that scaled as N^2, but it was also creating a list of that size. Then it called a function to remove duplicates from that list. That deduplication was itself quadratic, which over N^2 items made the whole thing N^4. And now that I think of it, those loops were not even nested... I rewrote the first part to not create duplicates in the first place and deleted the quadratic deduplication. Now it's N^2, but it has to be: the output itself is N^2 items.
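A reconstruction of that shape (hypothetical code, but the accounting is the same): the generator does N^2 work and emits an N^2-sized list, and a quadratic dedupe over N^2 items is (N^2)^2 = N^4, even though no loop is lexically nested four deep. The fix skips duplicates at creation time.

```python
def dedupe(xs: list[int]) -> list[int]:
    """Quadratic duplicate removal: the linear `in` scan per element
    makes this O(len(xs)^2) -- O(N^4) when fed an N^2-sized list."""
    out: list[int] = []
    for x in xs:
        if x not in out:
            out.append(x)
    return out

def pair_sums_slow(items: list[int]) -> list[int]:
    out = [a + b for a in items for b in items]  # N^2 work, N^2 items
    return dedupe(out)                           # ...then N^4 on top

def pair_sums_fast(items: list[int]) -> list[int]:
    """Never create duplicates, so no dedupe pass is needed.
    O(N^2), which is unavoidable: there are N^2 pairs to visit."""
    seen: set[int] = set()
    out: list[int] = []
    for a in items:
        for b in items:
            s = a + b
            if s not in seen:    # O(1) average set membership
                seen.add(s)
                out.append(s)
    return out
```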

I guess you're right. Keeping track of all of it is required for the information to be meaningful. It still seems doable to me, assuming the functions are pure.

Here's another crazy idea: keeping track of all this while taking aggressive compiler optimizations into account.