Comment by Levitating

1 year ago

I can't say I appreciate Lua any more after reading this. It only reinforced my annoyances with it.

It's been a while since I wrote any Lua but I didn't think it was intuitive at all. Why are arrays tables, why are they nil terminated and why do they start at 1.

I don't think it's underrated either. It's very popular for writing scripts in (older) game engines.

> Why are arrays tables

For the same reason why (standard) arrays are associative in PHP and JS, as well - because, from a certain perspective, a regular array is just an associated array with consecutive integers as keys, so if you're trying to keep things simple, it seems like a straightforward unification.

> why are they nil terminated

It follows from them being associated arrays, and Lua's pervasive use of "nil" meaning "this thing is missing".

> why do they start at 1

Because humans generally start counting from 1.

  • > Because humans generally start counting from 1.

    I’m not parent, but for me the question is not literally why did they choose 1-based indexing. That’s easy to answer. The question is really: why haven’t they realized by now that the disadvantages of 1-based indexing far outweigh the benefits when it comes to programming, and changed it? Backward compatibility is a valid reason, but lots of other people have tried 1-based arrays and reverted to 0-based once it becomes clear that 1-based arrays causes more problems than it solves. Even humans who count starting from one will soon want to use 0-based the moment they need to add two array indexes together; we don’t want to have to start from 2 in that case, right? Early versions of the famous Numerical Recipes in C started with 1-based arrays, but removed them, and got slightly better. :P

  • It is a misconception that a JS array is "just an associated [sic] array with consecutive integers as keys". Even the early (most readable) editions of the ECMA spec have a whole set of carve-outs that belies that claim.

> Why are arrays tables

I might be able to provide at least some insight into that.

One of the things I found while implementing my language was that the ordering of elements in a hash table is essentially random. Literally random if you mix in some randomness into the hash function's initial state in order to harden it against collision attacks.

Many languages guarantee insertion ordering. It's a useful property to have, makes testing easy and deterministic and it's intuitive for users. A simple way to implement that is to put a perfectly normal array of values into the hash table, and then have the buckets which the keys hash into contain indexes into that array. That way the indexes of values remain stable no matter what the keys hash to.

So a sufficiently advanced hash table will have an array of values inside it. I suppose some language designers took one look at that situation and decided to just merge both types already by using hash tables for everything. For example, PHP also did that. Associative arrays, they call it.

  • CPython's hash table (dict) evolved to have an array of values indexed by a hash table of indices for performance reasons, but later decided to guarantee the insertion ordering that this permitted for iteration.

    "Associative array" is just the term used for hash tables (dicts) in Awk, bash, Tcl, and, most relevantly for PHP, Perl4. It doesn't specifically mean that the language lacks a separate container type for small-integer-indexed arrays. It just means that that particular container type can be indexed by something other than small integers, usually strings.

    • > CPython's hash table (dict) evolved to have an array of values indexed by a hash table of indices for performance reasons

      In case anyone is curious about it:

      https://mail.python.org/pipermail/python-dev/2012-December/1...

      > In addition to space savings, the new memory layout makes iteration faster.

      > Currently, keys(), values, anditems() loop over the sparse table, skipping-over free slots in the hash table.

      > Now, keys/values/items can loop directly over the dense table, using fewer memory accesses.

      The "dense table" is the array of values. The "sparse table" is the actual hash table, only the values index into the values array.

      I thought the insertion ordering was implemented at around the same time as a result of it. The data structure ensures it pretty much for free, seemed like the obvious next step. It appears I was mistaken about that.

      1 reply →

    • Yes, this name is basically a reflection of the fact that a hash table is an array, indexes of which get calculated by `hashof(key) % arr.length`.

      It doesn’t mean HTs are the arrays which should be used to store seqint-indexed values.

      1 reply →

  • You described how every language without hashtable-as-an-object works. The only difference is when you address fields.

    But I fail to see the logic behind seeing this and merging array/table ideas. Feels like one of these irrational decisions you make in a dream.

    And Lua doesn’t use hash table to store arrays. Its tables have two different parts (hash, indexed array) with foggy distribution logic which reflects into “# vs ipairs” issue. This merge is purely synthetic and doesn’t project any underlying implementation idea.

    • Merging the concepts of arrays and maps was very popular in the 1980s and 1990s. I can count at least Awk, Tcl, Lua, JavaScript and PHP as languages that went with this approach.

      Looking from today's perspective, I also fail to see the logic behind it: separating arrays and dictionaries does not make your implementation more complicated than having a single backing data structure for both and taking care of all the special cases that arise from such a monstrosity.

      The best explanation I can think of is that this was just a fad nobody questioned too much. Back in the 1980s and 1990s, there was a strong distinction between "scripting languages" and general-purpose dynamic languages (like Lisp and Smalltalk). Scripting languages were supposed to be used to write small scripts to automate stuff. Type-safety was not something that any mainstream language considered back then (it was a thing in niche languages like Ada, SML and Eiffel), and speed was not a concern for scripting languages.

      A common philosophy (though I never seen it clearly stated) for scripting languages seems to have been to "be liberal and try to work with whatever the programmer does".

      So you shouldn't require to declare variables in advance (that's a lot of hassle!) or throw an error if a function is called with the wrong number of arguments.

      And what if the programmer tries to add an integer and a number written as string? Just coerce them. Or better yet, make everything a string like Tcl does. Or at the very least, don't be a spoilsport and have separate types for integers and floating point numbers!

      Ok, so what if the programmer tries to use a string index? We cannot coerce non-numeric strings! Well, why don't we just unify arrays and dictionaries like we've just unified all our number types? Now you don't have to worry about invalid indexes since all indexes are converted to strings! And if the programmers want to use 0-based, 1-based, 2-based or 42-based indexes they can do whatever they want! They can even have non-contiguous arrays or mix arrays with associative dictionaries. Total freedom.

      This thought process is completely imaginary. I suspect not a lot of thought was given to this type of flexibility at all back in the day. It was just a widely accepted assumption that scripting languages should be flexible, free and relaxed. And merging arrays and dictionaries (or "all arrays are associative arrays" as it was often called back then) was just one more idea that naturally fit into the zeitgeist, like type coercion and "assignment to an unknown variable sets a global variable by default".

      So tldr: it was a fad.

      2 replies →

> It's very popular for writing scripts in (older) game engines.

Exactly, it seems popular in places where the alternative is even less usable (e.g. write your own C++ plugin or thing). For that it's fine.

For example, there is a lua plugin for x-plane and it's great. I've done some minor things with it even. But that didn't make me a fan of the language. It's fine for simple things. But a bit limited.

The x-plane API is pretty low level and written in C++. So something that sits on top and is a bit more accessible is great. There are a lot of lua scripts available that you can run in there that are pretty nice. But the more complicated ones get pretty hairy quickly. Makes you long for something more structured.

I don't have the time, but I did consider working on adding a websocket API at some point to make it easier to integrate with from e.g. a web page or some kind of standalone process. I think they are now planning to add that feature themselves so there's no need to work on this anymore. And they have a low level UDP API currently that exposes all sorts of things. But a websocket API with some good client libraries for different languages would make a lot of sense.

The system software for newer TI graphing calculators is written in Lua and often emulated to Z80.