Comment by hyperpallium

10 years ago

This also comes up in relational databases. There might be a nice, canonical way to represent what the data really is (i.e. what it represents), but the access patterns for how it is used mean that a different representation is better (usually somewhat denormalized). Fortunately, relational algebra enables this (one of Codd's main motivations).
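
To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (customer, purchase, purchase_wide) are made up for illustration. The canonical form keeps each fact in one place; the denormalized form is derived from it with a join, so reads that always want the customer name next to the amount don't have to join again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Canonical (normalized) form: each fact is stored exactly once.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        amount INTEGER
    );
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO purchase VALUES (10, 1, 5), (11, 1, 7), (12, 2, 3);

    -- Denormalized form, derived by a join: the name is repeated per row,
    -- which suits a read-heavy access pattern.
    CREATE TABLE purchase_wide AS
        SELECT p.id AS purchase_id, c.name AS customer_name, p.amount
        FROM purchase p JOIN customer c ON c.id = p.customer_id;
""")

rows = conn.execute(
    "SELECT purchase_id, customer_name, amount "
    "FROM purchase_wide ORDER BY purchase_id"
).fetchall()
print(rows)
```

The point is only that the second table is computable from the first, so you can pick whichever shape fits how the data is read.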

A programming language is even more about data processing than a database is. But it still seems that data structures/objects represent something. I recently came up with a good way to resolve what that is:

  In a data processing (i.e. programming) language what you are
  modelling/representing is not entities in the world, but computation.
  Therefore, choose data structures that model your data processing. 

This definition allows for the messy, denormalized-like data structures you get when you optimize for performance. It also accounts for elegant algebra-like operators that can be easily composed to model different computations (like the +, . and * of regular expressions).
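
As a rough sketch of the regular expression case: the three operators can be written as small functions that compose into a matcher. The names lit, alt, cat, star and matches are my own, and this is not how a real regex engine is built. Each matcher takes a string and a start position and returns the set of positions where a match could end.

```python
def lit(ch):
    # Match a single literal character.
    return lambda s, i: {i + 1} if i < len(s) and s[i] == ch else set()

def alt(a, b):
    # The "+" operator: either a or b.
    return lambda s, i: a(s, i) | b(s, i)

def cat(a, b):
    # The "." operator: a followed by b.
    return lambda s, i: {k for j in a(s, i) for k in b(s, j)}

def star(a):
    # The "*" operator: zero or more repetitions of a.
    def match(s, i):
        seen = {i}
        frontier = {i}
        while frontier:
            # Only keep new end positions, so the loop always terminates.
            frontier = {k for j in frontier for k in a(s, j)} - seen
            seen |= frontier
        return seen
    return match

def matches(r, s):
    # True if r can consume the whole string.
    return len(s) in r(s, 0)

# (a + b)* . c
pattern = cat(star(alt(lit("a"), lit("b"))), lit("c"))
print(matches(pattern, "abbac"), matches(pattern, "abca"))
```

The data structure here is just the composition of functions, which models the computation (matching) and not any entity in the world.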