Comment by didgetmaster

1 month ago

Several of your assumptions are correct.

My project Didgets (short for Data Widgets), started out as a file system replacement. I wanted to create an object store that would store traditional file data, but also make file searches much faster and more powerful than other file systems allow; especially on systems with hundreds of millions of files on them. To enhance this, I wanted to be able to attach contextual tags to each Didget that would make searches much more meaningful without needing to analyze file content during the search.

To facilitate the file operations, I needed data structures to support them. I decided that these data structures (used/free bitmaps, file records, tags, etc.) should be stored and managed within other Didgets that had special handling. Each tag was basically a key-value pair that mapped the Didget ID (key) to a string, number, or other data type (value).

Rather than rely on some external process like Redis to handle tags, I decided to build my own. Each defined tag has a data type and all values for that tag are stored together (like column values in a columnar store). I split the tag handling into two distinct pieces. All the values are deduplicated and reference counted and stored within a 'Values Didget'. The keys (along with pointers to the values) are stored within a "Links Didget'.

This makes analytic functions fast (each unique value is stored only once) and allows for various mapping strategies (one-to-one, one-to-many, many-to-one, or many-to-many). The values and the links are stored within individual blocks that are arranged using hashes and other meta-data constraints. For any given query, usually only a small number of blocks need to be inspected.

I expected analytic operations to be very fast, like with other OLAP systems; but I was pleasantly surprised at how fast I could make traditional OLTP operations run on it.

I have some short demo videos that show not only what it can do, but also benchmark many operations against other databases. Links to the videos are in my user profile.