Comment by ModernMech

4 days ago

It makes sense from a historical perspective. Tables are a thing in many languages, just not the ones that mainstream devs use. In fact, if you rank programming languages by usage among people who aren't professional developers, the top languages all have a table-ish metaphor (SQL, Excel, R, Matlab).

The languages devs use are largely Algol-derived. Algol was a language for expressing algorithms, which were largely abstractions over Turing machines, which are based around an infinite 1D tape of memory. This model of 1D memory was built into early computers, early operating systems, and early languages. We call it "mechanical sympathy".

Meanwhile, other languages invented at the same time weren't tied so closely to the machine; they were more for the purpose of doing science and math, and didn't care as much about this 1D view of the world. Early languages like Fortran and Matlab had notions of 2D data matrices because math and science had notions of 2D data matrices. Languages like C were happy to support these things by using an array of pointers, because that mapped nicely to their data model.
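For concreteness, here's a minimal sketch of that array-of-pointers style in C; the helper name make_matrix is made up for this example, and error handling is omitted:

    #include <stdlib.h>

    /* Sketch of the array-of-pointers "matrix": each row is its own
       1D allocation, and the matrix itself is a 1D array of row
       pointers. The name make_matrix is just for this illustration. */
    double **make_matrix(size_t rows, size_t cols)
    {
        double **m = malloc(rows * sizeof *m);
        for (size_t r = 0; r < rows; r++)
            m[r] = calloc(cols, sizeof **m);  /* one row at a time */
        return m;
    }

    int main(void)
    {
        double **m = make_matrix(3, 4);
        m[1][2] = 42.0;  /* two pointer hops, no index arithmetic by hand */
        return 0;
    }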

The same thing can be said for 1-based and 0-based indexing -- languages like Matlab, R, and Excel are 1-based because that's how people index tables; whereas languages like C and Java are 0-based because that's how people index memory.
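And the 0-based side really is just pointer arithmetic; a tiny C check of the a[i] == *(a + i) identity:

    #include <assert.h>

    int main(void)
    {
        int a[5] = {10, 20, 30, 40, 50};

        /* C defines a[i] as *(a + i): the index is an offset from the
           base address, so the first element naturally lives at offset 0. */
        assert(a[0] == *(a + 0));
        assert(a[3] == *(a + 3));
        return 0;
    }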

As a slight refinement of your point, C does have storage-map-based N-D arrays/tensors like Fortran, just with the old column-major/row-major difference and a clunky "multiple [][]" syntax. There was just a restriction early on that the array dimensions be known at compile time (all but the leftmost dimension, anyway), because it was a somewhat half-done/half-supported feature - and because that also fit the linear data model well. So it is also common to see arrays of pointers like char *argv[], or, in numerics, libraries that do their own storage-map arithmetic from dimensions passed at runtime.
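To make both styles concrete, here's a rough sketch contrasting a storage-mapped 2D array (all but the leftmost dimension fixed at compile time, pre-C99) with the hand-rolled row-major storage map that numeric libraries use when dimensions are only known at runtime; the helper name at is invented for this example:

    #include <stdio.h>
    #include <stdlib.h>

    #define COLS 4  /* pre-C99, every dimension but the leftmost must be a compile-time constant */

    /* Storage-mapped 2D array: one contiguous block, the compiler
       emits the row-major index math for m[r][c]. */
    static void fill_static(double m[][COLS], size_t rows)
    {
        for (size_t r = 0; r < rows; r++)
            for (size_t c = 0; c < COLS; c++)
                m[r][c] = (double)(r * COLS + c);
    }

    /* Hand-rolled storage map for runtime dimensions, as numeric
       libraries often do. The name "at" is only for this sketch. */
    static double *at(double *buf, size_t cols, size_t r, size_t c)
    {
        return &buf[r * cols + c];  /* row-major: offset = r*cols + c */
    }

    int main(void)
    {
        double a[3][COLS];
        fill_static(a, 3);

        size_t rows = 3, cols = 5;  /* dimensions known only at runtime */
        double *b = malloc(rows * cols * sizeof *b);
        *at(b, cols, 1, 2) = 42.0;

        printf("%g %g\n", a[2][3], *at(b, cols, 1, 2));
        free(b);
        return 0;
    }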

Also, the linear memory model itself is not really only because of Algol/Turing machines/theoretical CS/"early" hardware and mechanical sympathy. DRAM has rows & columns internally, but byte addressability hides that from the systems on top of the hardware (unless someone is doing a rowhammer attack or something). Random access rather than tape rewind/fast-forward is indeed a huge deal, but I think the actual popularity of linearity comes from its simplicity as an interface more than anything else. For example, segmented x86 memory with near/far pointers was considered ugly relative to a big flat 32-bit address space, and disk files and other allocation arenas internally present large linear address/seek spaces. People just want to defer using >1 number until they really need to. People learn univariate-X before they learn multivariate-X, where X could be calculus, statistics, etc.