Comment by voidmain
8 years ago
The increasing latency [1] of access to larger amounts of storage isn't a constant factor, though, even asymptotically. There was a fun post on HN a while ago arguing that random access latency to N bytes is asymptotically O(N^(1/2)), both empirically from data about real systems and theoretically from the Bekenstein bound and the speed of light. And even if you don't buy that, it clearly can't be less than O(N^(1/3)). Don't go preaching this in job interviews, though :-)
The truth is that complexity theory is always done relative to some abstract machine model, and as with all models it's up to you to determine whether the model is a good fit for your practical reality. "To the extent that mathematics reflects reality it is not certain, and to the extent that it is certain it does not reflect reality." There exist abstract models, like the cache-oblivious model, that do a somewhat better job of grappling with this issue but don't require you to just throw up your hands and rely solely on benchmarks of real hardware. There is probably room for new theory in this area.
[1] not throughput, which isn't as hard to scale up
The article in question: https://news.ycombinator.com/item?id=12383012
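To make the claim concrete, here is a toy cost model of my own (an illustration in the spirit of the post, not taken from it): charge a random access into a working set of n items roughly sqrt(n) units instead of the usual 1, and see what that does to a textbook comparison like linear scan versus binary search.

    import math

    # Toy cost model (my own illustration): a random access into a working set
    # of n items costs sqrt(n) units, per the O(N^(1/2)) claim; a sequential
    # access costs 1 unit. These are model numbers, not measurements.

    def scan_cost(n):
        # n sequential touches at unit cost each
        return n

    def binary_search_cost(n):
        # Each probe jumps into a range that halves every step, so charge
        # sqrt(n) + sqrt(n/2) + sqrt(n/4) + ..., which sums to about 3.4*sqrt(n).
        return sum(math.sqrt(n / 2**k) for k in range(max(1, int(math.log2(n)))))

    for n in (10**3, 10**6, 10**9):
        print(f"n={n:>13,}  scan ~{scan_cost(n):>14,.0f}  binary search ~{binary_search_cost(n):>12,.0f}")

Under this model binary search costs on the order of sqrt(n) rather than log n - still far cheaper than a scan, but a much smaller win than the uniform-cost analysis suggests.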
It's fun to think about, but the presentation played fast and loose with a few concepts, which sparked some lively discussion. I really like the idea of counting random memory accesses as costing more than sequential accesses when analyzing algorithms in a big-O-type notation.
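If you want to poke at the empirical side yourself, here is a minimal pointer-chasing sketch (a pure-Python loop over a NumPy array, so a large constant interpreter overhead sits on every access; the interesting part is how the per-access time grows once the working set outgrows the caches):

    import time
    import numpy as np

    def random_cycle(n, rng):
        # Build perm so that following i -> perm[i] walks one big random cycle
        # through all n slots (a dependent-load pointer chase).
        order = rng.permutation(n)
        perm = np.empty(n, dtype=np.int64)
        perm[order[:-1]] = order[1:]
        perm[order[-1]] = order[0]
        return perm

    def ns_per_access(n, steps=200_000):
        perm = random_cycle(n, np.random.default_rng(0))
        i = 0
        start = time.perf_counter()
        for _ in range(steps):
            i = perm[i]  # the next address depends on this load, so latencies add up
        return (time.perf_counter() - start) / steps * 1e9

    # Working sets from roughly 128 KiB to 512 MiB of int64s.
    for n in (2**14, 2**17, 2**20, 2**23, 2**26):
        print(f"{n * 8 / 2**20:8.1f} MiB  {ns_per_access(n):7.1f} ns/access")

On typical hardware the per-access time climbs in steps as the working set falls out of L1/L2/L3 into DRAM; whether the tail really follows an N^(1/2) curve is exactly what the linked thread argues about.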
This kind of accounting was very much the norm back when HDDs were a thing and sequential accesses were much, much faster than random accesses.
So people developed algorithms that maximised locality of reference, like B-trees. An interesting line of work in this vein is that of cache-oblivious algorithms :)
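To see why B-trees in particular: in the external-memory (disk-access) model you count transfers of size-B blocks from slow storage instead of individual comparisons. A rough sketch with made-up sizes (the formulas are the standard approximate bounds, not measurements):

    import math

    def scan_blocks(n, b):
        return math.ceil(n / b)                  # sequential scan: ~n/B block reads

    def binary_search_blocks(n, b):
        return math.ceil(math.log2(n / b)) + 1   # ~log2(n/B) scattered block reads

    def btree_search_blocks(n, b):
        return math.ceil(math.log(n, b)) + 1     # ~log_B(n): B-way fanout per node

    n = 10**9   # a billion keys (made-up example size)
    b = 512     # keys per block, e.g. 4 KiB blocks of 8-byte keys (assumed)
    for name, f in (("scan", scan_blocks),
                    ("binary search", binary_search_blocks),
                    ("B-tree search", btree_search_blocks)):
        print(f"{name:14s} ~{f(n, b):>10,} block transfers")

The cache-oblivious versions get the same log_B(n) bound without knowing B, which is what makes them interesting when there are several levels of cache with different block sizes.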
> even if you don't buy that it clearly can't be less than O(N^1/3)
Maybe I'm missing something, but why's that?
To the rest of your comment: all good points.
Well, if your best technology can pack B bits of information into 1 m^3 of space and you need to store N bits, you are going to need (N/B) m^3 of space to store them, so the farthest bits sit roughly (N/B)^(1/3) metres away. The speed of light then limits you to around (N/B)^(1/3)/c seconds of latency to reach them.
(Packing at constant density ultimately violates the Bekenstein bound - as your server cube grows it will eventually collapse into a black hole - but this is not really a concern with foreseeable technology! More practically, though, Earth's gravity makes it hard to build very high into the air, which limits you to O(N^(1/2)), and things like cooling and maintenance access probably impose a limit somewhere in between.)
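Spelled out as formulas (just restating the packing argument above, with r the distance to the farthest bit and t the one-way latency):

    % Density B bits per m^3, N bits total, c the speed of light.
    \[
      V \;\ge\; \frac{N}{B}\,\mathrm{m^3}
      \quad\Longrightarrow\quad
      r \;\gtrsim\; \left(\frac{N}{B}\right)^{1/3}\,\mathrm{m}
      \quad\Longrightarrow\quad
      t \;\gtrsim\; \frac{r}{c} \;=\; \Theta\!\left(N^{1/3}\right).
    \]
    % Constrained to a roughly flat layout of fixed height (the Earth's-surface
    % case), volume becomes area and the same steps give t = \Omega(N^{1/2}).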
Makes sense, thanks.
Imagine your bits stored in a sphere around your CPU: to hold N bits at any fixed density its radius has to grow like N^(1/3), and light needs time proportional to that radius to reach the bit you want to touch.