Comment by thaumasiotes

4 years ago

> Everything is O(1) if N is constant, including log(N), N^2, 2^N, N!, etc.

Not even close. 2^k does not become O(1) by virtue of N being constant; only 2^N does.

This has been covered above. It is more common to consider the complexity of hash table operations in terms of the number of operations, or the size of the table; the size of the key is very often constant. These are different variables; the constant size of the key does not trivialize the complexity of inserting N items each with a constant key size.

Here, though, the relevant key is the output of the hash function -- that is what you need to widen to make sure every bucket is reachable. And that width (k) must increase with the table size, so it is not constant; it depends on n (the table size).

Earlier discussion: https://news.ycombinator.com/item?id=9807739
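
To make the point about hash output width concrete, here is a minimal sketch in Python (the helper name min_hash_bits and the sample sizes are mine, not from the thread): addressing n buckets requires at least ceil(log2(n)) bits of hash output, so the required width k grows with the table rather than staying constant.

    import math

    def min_hash_bits(buckets: int) -> int:
        # To reach every one of `buckets` slots, the hash output must take at
        # least `buckets` distinct values, i.e. have ceil(log2(buckets)) bits.
        return max(1, math.ceil(math.log2(buckets)))

    for n in (16, 1_000, 1_000_000, 2**40):
        print(n, min_hash_bits(n))
    # 16 -> 4, 1_000 -> 10, 1_000_000 -> 20, 2**40 -> 40:
    # the required width k is a function of n, not a constant.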

  • I remember a proof in CLRS which first developed a function that was bounded above by 5 for all conceivable input ("a very quickly-growing function and its very slowly-growing inverse"), and then substituted the constant 4 or 5 into a complexity calculation in place of that function, giving a result which was "only" correct for all conceivable input.

    The same approach applies to key length requirements for hash tables with arbitrarily large backing stores. They do not grow as slowly as the CLRS log* function, but they grow so slowly that there are easily identifiable sharp limits on how large they can be -- an easy example is that a hash table cannot use more memory than the hardware offers no matter how the software is written. A backing store with 1TB of addressable bytes cannot need the key to be more than 40 bits long. (A short worked sketch of both of these numbers appears after this thread.)

    On a different note, by "table size" in my earlier comment I meant to refer to the number of entries in the table, not the capacity of the backing store. It seems like you might be using the same word for a different concept?

    • >The same approach applies to key length requirements for hash tables with arbitrarily large backing stores. They do not grow as slowly as the CLRS log* function, but they grow so slowly that there are easily identifiable sharp limits on how large they can be -- an easy example is that a hash table cannot use more memory than the hardware offers no matter how the software is written. A backing store with 1TB of addressable bytes cannot need the key to be more than 40 bits long.

      So? That's still putting a bound on the table size, which makes it constant in practice, but it doesn't make the algorithm O(1): you can never get such a result by bounding n, for the reasons the GGP gave -- that's cheating.

      Your complexity bound has to be written on the assumption that n (the number of elements to store in the hash table) increases without bound. Assuming you will never use more than Y bytes of data is not valid.

      >On a different note, by "table size" in my earlier comment I meant to refer to the number of entries in the table, not the capacity of the backing store. It seems like you might be using the same word for a different concept?

      No, I was using table size exactly as you were, to mean the number of elements stored. Is there a reason my comments only made sense under a different definition? If not, be charitable. (And avoid using obscure terms.)

    • By that logic, any O(log N) factor is simply O(1). That O(N log N) sort algorithm should just be considered O(N) for all practical purposes.

  • Incidentally, the person replying to you in that thread incorrectly stated that comparison is O(log N) in the number of bits. The most common comparison function, lexicographic comparison, is actually O(1) average case given random inputs of arbitrary length.
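
A quick sanity check of that last claim, as a sketch under simplifying assumptions (uniformly random lowercase strings; compare_count and rand_str are my own helpers, not anything from the thread): the chance that the first k characters all match is (1/26)^k, so the expected number of character comparisons is a geometric series summing to 26/25 ≈ 1.04, independent of string length.

    import random
    import string

    def compare_count(a: str, b: str) -> int:
        # Characters examined by a lexicographic comparison of a and b.
        examined = 0
        for x, y in zip(a, b):
            examined += 1
            if x != y:
                break
        return examined

    def rand_str(length: int) -> str:
        return ''.join(random.choices(string.ascii_lowercase, k=length))

    trials = 10_000
    avg = sum(compare_count(rand_str(100), rand_str(100)) for _ in range(trials)) / trials
    print(avg)  # ~1.04, regardless of the length 100: O(1) comparisons on average

For inputs with long shared prefixes the cost does grow with the length, of course; the constant expectation is specific to the random-input assumption stated above.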
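And here is the sketch promised in the earlier CLRS comment (log_star and the sample values are my own illustration, not the book's code): the iterated logarithm collapses any physically realizable input to 1 within about five steps, and the same arithmetic gives the 40-bit key bound for a 1 TB backing store.

    import math

    def log_star(n) -> int:
        # Iterated logarithm: how many times log2 must be applied
        # before the value drops to 1 or below.
        steps = 0
        while n > 1:
            n = math.log2(n)
            steps += 1
        return steps

    print(log_star(2**65536))  # 5 -- and 2**65536 already dwarfs any real input
    print(log_star(10**80))    # 5 -- roughly the number of atoms in the observable universe

    # Key bits needed to address every byte of a 1 TB (2**40-byte) backing store:
    print(math.ceil(math.log2(2**40)))  # 40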