
Comment by geokon

5 years ago

I've never had a problem scale to the point where it required a database/SQL, but I don't quite get the advantage of your solution. Having all your interactions with data go to disk through a cache muddles things b/c it makes it much harder to reason about performance (when do you get a cache miss? how do you configure the cache properly?). You introduce a lot more black-magic variables to reason about.

If you're editing images, I'd think it'd just make more sense to keep everything in RAM and do the saving to disk on a separate thread. I don't quite get why the users would stop saving in this example.

I'm not saying you're wrong - just asking for more details, b/c I've never imagined using a DB on data that can fit in RAM.

It's primarily a problem of inflexibility handicapping performance, not of "cache misses" and clever algorithms.

For example, imagine a word processing program opening a document and showing you the first page: you could load 50 MB of kitchen-sink XML and 250 embedded images from a zip file and then start doing something with the resulting canonical representation, or you could load the bare minimum of metadata (e.g. page size) from the appropriate tables and only the content that goes on the first page from carefully indexed tables of objects. Which variant is likely to load faster? Which one is guaranteed to load useless data? Which one can save the document more quickly and efficiently (one paragraph instead of a whole document or a messy update log) when you edit text?
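To make the indexed-tables idea concrete, here's a minimal sketch using SQLite from Python. The table and column names (`doc_meta`, `paragraphs`, `page`) are hypothetical, not from any real word processor; the point is just that opening the document touches one metadata row and the page-1 rows, and editing a paragraph rewrites one row instead of re-serializing the whole file.

```python
import sqlite3

# Hypothetical schema: metadata, paragraphs, and images live in separate
# indexed tables instead of one monolithic XML blob inside a zip.
conn = sqlite3.connect("document.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS doc_meta (key TEXT PRIMARY KEY, value TEXT);
CREATE TABLE IF NOT EXISTS paragraphs (
    id   INTEGER PRIMARY KEY,
    page INTEGER,
    body TEXT
);
CREATE INDEX IF NOT EXISTS idx_paragraphs_page ON paragraphs(page);
""")

# Opening the document: read only the metadata needed to lay out a page.
page_size = conn.execute(
    "SELECT value FROM doc_meta WHERE key = 'page_size'").fetchone()

# Load just the content that appears on page 1; the other 249 images and
# the rest of the text stay on disk until they're actually needed.
first_page = conn.execute(
    "SELECT id, body FROM paragraphs WHERE page = 1 ORDER BY id").fetchall()

# Editing one paragraph writes one row, not the whole document.
conn.execute("UPDATE paragraphs SET body = ? WHERE id = ?", ("new text", 42))
conn.commit()
```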

  • ah okay, incremental loading seems essential and I hadn't considered it. Thanks for explaining :)