Comment by llm_nerd
4 hours ago
>but columns aren't the end-all-be-all normalization format. I think pandas uses "frames".
Pandas is column-oriented, as are basically all high-performance data libraries. Each column is a separate array of data. To get a "row" you take the nth item from each of the arrays.
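To make that concrete, here is a rough stdlib-only sketch of the idea. It is not how pandas is actually implemented; the table, the column names, and the helpers `get_row` and `column_sum` are all made up for illustration.

```python
from array import array

# Hypothetical three-column table: each column is its own contiguous array.
columns = {
    "id": array("q", [1, 2, 3]),
    "price": array("d", [9.5, 12.0, 7.25]),
    "qty": array("q", [4, 1, 10]),
}


def get_row(cols, n):
    """Rebuild row n by taking the nth item from every column."""
    return {name: col[n] for name, col in cols.items()}


def column_sum(cols, name):
    """Aggregate one column without touching any of the others."""
    return sum(cols[name])
```

Note that `get_row` has to visit every column, while `column_sum` reads exactly one array. That asymmetry is the whole trade-off.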
And FWIW, column-oriented isn't considered normalization. It's a physical optimization that can yield enormous performance advantages for some classes of problems, but can cause a performance nightmare for other problems.
Data analytics loves column-oriented layouts. CRUD-type workloads do not. And in the programming realm there are several ways to get a Structure of Arrays (SoA) instead of the classic Array of Structures (AoS).
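A minimal sketch of the two layouts, assuming a made-up `Particle` record (none of these names come from a real library). Plain Python lists won't show the cache effects you'd see in C or NumPy, so treat this as a picture of the shape of the data, not a benchmark.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    """AoS element: all fields of one record live together."""
    x: float
    y: float
    mass: float


@dataclass
class Particles:
    """SoA container: one list per field, index i is record i."""
    x: list
    y: list
    mass: list


def to_soa(items):
    """Convert an array of structures into a structure of arrays."""
    return Particles(
        x=[p.x for p in items],
        y=[p.y for p in items],
        mass=[p.mass for p in items],
    )


def total_mass_aos(items):
    # Walks every record even though only one field is needed.
    return sum(p.mass for p in items)


def total_mass_soa(ps):
    # Reads a single field array: the analytics-friendly case.
    return sum(ps.mass)


def replace_record_soa(ps, i, new):
    # CRUD-style update: one write per field array instead of one write total.
    ps.x[i], ps.y[i], ps.mass[i] = new.x, new.y, new.mass


aos = [Particle(0.0, 1.0, 2.0), Particle(3.0, 4.0, 5.0)]
soa = to_soa(aos)
```

In the AoS form, replacing a record is just `aos[i] = new`, which is why row-at-a-time workloads tend to prefer it.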