Comment by TeMPOraL

4 years ago

I agree. And I'd personally probably trip over it too, if I used C functions much (I use C++ standard library equivalents, and I'm going to recheck string parsing code in our project now, because we may have that same problem, just with different API!). strlen() in a loop is something that can sneak up on you by virtue of having too many layers of abstraction - scanf() family being one of them.

But what I'm saying is, documenting big O complexity is useful, even if imperfect (and if the function has "tricky" complexity, the conditions where it gets bad should be documented too!).

> Sscanf could be O(N^10) (does the rule about ignoring the coefficient also apply to the exponent?) but if it only starts to hurt your application at the point of 10tb string reads then it's still unimportant.

Sure, but then applications grow, functions get reused. Iff it's trivial to spot and replace O(N^10) Sscanf with a O(n) alternative, I'd want to know the complexity and do the replacement immediately - otherwise, you may discover two years later that the company is investing employee-years into horizontally scaling your application as it's slow under workload, where fixing that one thing would let it run that same workload on a laptop, on a single core.

(This is not a joke, there are "big data" stories like this.)

1 comment

TeMPOraL

beaconstudios 4 years ago

> But what I'm saying is, documenting big O complexity is useful, even if imperfect (and if the function has "tricky" complexity, the conditions where it gets bad should be documented too!).

Ah yes I totally agree on this point, it should be documented for those who are more cautious/mathematical in their engineering efforts. I'm just not one of those people! I prefer to focus on the structural behaviour of a team (if it's my responsibility) to ensure that cases like:

> two years later that the company is investing employee-years into horizontally scaling your application as it's slow under workload

Resolve correctly - if a team is set up to respond healthily to production slowdowns (of course the equation is different for software with a slow/nonexistent update loop) then IMO you don't need to invest as heavily into up-front optimisation and can instead invest into features, allowing (sample-based for busy paths) profiling in production to notify you when optimisation is warranted.

At the end of the day, up front optimisation is an exchange of time with your future self/colleagues. It's worth it in some cases, not in others - but knowing where the balance in a tradeoff lies in your circumstance is a good chunk of all engineering decisions!