Comment by kragen
1 year ago
CPython's hash table (dict) evolved to have an array of values indexed by a hash table of indices for performance reasons, but later decided to guarantee the insertion ordering that this permitted for iteration.
"Associative array" is just the term used for hash tables (dicts) in Awk, bash, Tcl, and, most relevantly for PHP, Perl4. It doesn't specifically mean that the language lacks a separate container type for small-integer-indexed arrays. It just means that that particular container type can be indexed by something other than small integers, usually strings.
> CPython's hash table (dict) evolved to have an array of values indexed by a hash table of indices for performance reasons
In case anyone is curious about it:
https://mail.python.org/pipermail/python-dev/2012-December/1...
> In addition to space savings, the new memory layout makes iteration faster.
> Currently, keys(), values, anditems() loop over the sparse table, skipping-over free slots in the hash table.
> Now, keys/values/items can loop directly over the dense table, using fewer memory accesses.
The "dense table" is the array of values. The "sparse table" is the actual hash table, only the values index into the values array.
I thought the insertion ordering was implemented at around the same time as a result of it. The data structure ensures it pretty much for free, seemed like the obvious next step. It appears I was mistaken about that.
They did implement insertion ordering at that time, but they didn't want to commit to adding it to the official language semantics until having enough experience to know whether they wanted to back out the change to the implementation which provided that guarantee for free.
Yes, this name is basically a reflection of the fact that a hash table is an array, indexes of which get calculated by `hashof(key) % arr.length`.
It doesn’t mean HTs are the arrays which should be used to store seqint-indexed values.
No, that's an implementation detail which isn't actually true of the associative arrays in all of these languages. "Associative array" is the name for the interface, not the implementation. In theory it could be a hash trie or something, although I'm not aware of any implementations of any of these languages that uses anything but a hash table (which is, yes, fundamentally dependent on using an array).