Comment by dekhn

8 years ago

can any hash table whose hash function operates over the entire string ever show O(1) time?

The O(1) amortized access of a hash table is for the number of entries, not key hashing or comparison. Doubling the number of entries does not double the retrieval time.

As a hash table gets larger, the cost of key hashing (for the same domain of keys) does not increase. Of course hashing and comparison performance can still be important, it's just not what is generally being analyzed with basic complexity theory.

  • I guess I get the idea that the O(1) is for the number of entries, but in my experience 99% of prod time is spent hashing keys on amortized O(1) systems, so focusing on the basic complexity theory would seem to be leaving a lot of performance on the table and also contributes confusion.

> can any hash table whose hash function operates over the entire string ever show O(1) time?

If we limit the length of the has key, the time taken by the hash function is also limited and turns into a constant. Big-O is all about asymptotic behavior.

OTOH noticing how the hash function's run time depends on the size of the key can be separately useful. I suspect all such functions are linear in practice, though.

  • I'm talking about functions where the entire string is used to that's necessary on the internet when dealing with billions of strings that are similar to each other