
Comment by gridspy

5 years ago

quadratic is a fancy way of saying "this code is super fast with no data, super slow once you have a decent amount"

The problem is that when you double the amount of stuff in the JSON document, you quadruple (or more) the scanning penalty in both the string and the list.

Why quadruple? Because you end up scanning a list which is twice as long. You have to scan that list twice as many times. 2x2 = 4. The larger list no longer fits in the fast (cache) memory, among other issues. The cache issue alone can add another 10x (or more!) penalty.
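A minimal Python sketch of the rescanning pattern described above (hypothetical model, we don't have the real code): it counts character-scan steps for a comma-separated document when every token triggers a rescan to the end of the string, versus a single linear pass. Doubling the item count roughly quadruples the first number and only doubles the second.

```python
def scan_work(n_items: int) -> tuple[int, int]:
    """Return (quadratic_steps, linear_steps) for tokenizing a
    comma-separated document of n_items three-character items."""
    doc = ",".join("x" * 3 for _ in range(n_items))

    # Quadratic: on every token, rescan from the current position to
    # the end of the document (like calling strlen once per token).
    quad = 0
    pos = 0
    while pos < len(doc):
        quad += len(doc) - pos          # cost of the redundant rescan
        nxt = doc.find(",", pos)
        pos = len(doc) if nxt == -1 else nxt + 1

    # Linear: touch each character exactly once.
    lin = len(doc)
    return quad, lin
```

For example, `scan_work(200)` does almost exactly four times the quadratic work of `scan_work(100)`, while the linear cost merely doubles.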

> quadratic is a fancy way of saying "this code is super fast with no data, super slow once you have a decent amount"

Well, that is an abuse of the term by people who sometimes don't actually know what it really means. Up to a point, quadratic IS faster than linear, after all. Too many developers love to abuse the word blindly.

If it is badly tested with no data, it is badly tested with no data. Period. Not "quadratic".

> The problem is that when you double the amount of stuff in the JSON document, you quadruple (or more) the scanning penalty in both the string and the list.

My point was precisely that it depends on the data, and that initial assumptions are to be routinely revised. I was making a general point.

Maybe the guy was pinky-sworn that the JSON would hardly change and that the items were supposed to be ordered, sequential, and no more than 101. For all you know it is even documented, and nobody cared/remembered/checked when updating the JSON. But we don't know; obfuscated code doesn't come with comments and context ...

Or, it is actually a real rookie mistake. It probably was, but we don't have all the facts.

  • > Well, that is an abuse of the term by people who sometimes don't actually know what it really means. Up to a point, quadratic IS faster than linear, after all. Too many developers love to abuse the word blindly.

    There is absolutely no guarantee that a quadratic algorithm has to be faster than a linear algorithm for small N. It can be, in some situations for some algorithms, but the complexity class of the algorithm has nothing to do with that. A quadratic algorithm may well be slower than a linear algorithm for any N.

    The only thing the complexity class tells us is that starting from some N the linear algorithm is faster than the quadratic algorithm. That N could be 0, it could be 100, or it could be 1 billion.

    In my experience the crossover is usually somewhere between 0 and 1000, but again, that depends. The complexity class makes no such guarantees. The complexity class tells us the general shape of the performance graph, but not exactly where the graphs will intersect: that depends on the constants.

    > If it is badly tested with no data, it is badly tested with no data. Period. Not "quadratic".

    It is both. The problem is that the algorithm has quadratic complexity. The fact that it was badly tested caused this fact to remain hidden while writing the code, and turn into a real problem later on.
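    A toy Python illustration of the constants point made above (the cost functions are made up, purely to show the shape): a linear algorithm with a large constant factor loses to a quadratic algorithm with a small one until the crossover N, here 500.

```python
def linear_cost(n: int) -> int:
    return 1000 * n     # linear, but with a big constant factor

def quadratic_cost(n: int) -> int:
    return 2 * n * n    # quadratic, with a small constant factor

# First n where the quadratic algorithm stops being the faster one:
# 2*n*n >= 1000*n  =>  n >= 500.
crossover = next(n for n in range(1, 10_000)
                 if quadratic_cost(n) >= linear_cost(n))
```

    Below n = 500 the quadratic version wins; well above it, the linear one wins by an ever-growing margin. The complexity class fixes the shape of each curve, the constants fix where they cross.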

  • > Well, that is an abuse of the term by people who sometimes don't actually know what it really means. Up to a point, quadratic IS faster than linear, after all. Too many developers love to abuse the word blindly.

    The problem with this argument is that if the data size and constants are sufficiently small, people don't care whether the linear algorithm is the slower one. In the case of JSON parsing the constants are essentially the same no matter which string-scanning algorithm you use. So when n is small you don't care, since the overall loading time is short anyway; when n is big, you benefit from faster loading times.

    I honestly don't understand what goal you are trying to accomplish. By your logic it is more important to keep short loading times short and long loading times long, rather than follow the conventional engineering wisdom of lowering the average or median loading time, which sometimes shortens long loading screens at the expense of slightly lengthening short ones.
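    To make the list half of the scanning penalty concrete, here is a hedged Python sketch (names are mine, not from the thread): deduplicating items by rescanning a growing list is quadratic, while a hash set performs the same membership checks in roughly linear time, with per-item overhead small enough that tiny inputs never notice the difference.

```python
def dedup_list(items):
    """Quadratic: each membership test rescans the growing list."""
    seen, out = [], []
    for it in items:
        if it not in seen:      # linear scan of seen, O(n^2) overall
            seen.append(it)
            out.append(it)
    return out

def dedup_set(items):
    """Roughly linear: each membership test is an average O(1) hash lookup."""
    seen, out = set(), []
    for it in items:
        if it not in seen:      # hash lookup, O(n) overall on average
            seen.add(it)
            out.append(it)
    return out
```

    Both return identical results; for a handful of items the difference is invisible, which is exactly why the quadratic version can survive testing on small data.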