Comment by saurik

1 year ago

https://www.cs.utexas.edu/~EWD/transcriptions/EWD08xx/EWD831...

Why numbering should start at zero. -- Dijkstra

5 comments

saurik

Argument by authority.

To me, 1-based indexing is natural if you stop pretending that arrays are pointers plus index arithmetic, especially with slicing syntax.

It's one of the things that irked me when switching to Julia from Python but which became just obviously better after I made the switch.

E.g. in Julia `1:3` represents the numbers 1 to 3. `A[1]` is the first element of the array, `A[1:3]` is a slice containing the first to third elements. `A[1:3]` and `A[4:end]` partition the array. (As an aside: `for i in 1:3` gives the numbers 1, 2, 3.)

The same sentences in Python:

`1:3` doesn't make sense on its own. `A[0]` is the first element of the array. `A[0:3]` gives the elements `A[0]`, `A[1]` and `A[2]`. `A[0:3]` and `A[3:]` partition the array.
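
For anyone who wants to check the Python side, here is a minimal runnable sketch. The list contents are made up for illustration; only the slicing behavior matters.

```python
# Illustrative data; the name A mirrors the comment above.
A = ["a", "b", "c", "d", "e"]

first = A[0]    # first element of the list
head = A[0:3]   # A[0], A[1], A[2]; the stop index 3 is excluded
tail = A[3:]    # everything from A[3] to the end

# The same boundary value (3) ends one slice and starts the next,
# so the two slices partition the list.
assert head + tail == A
print(first, head, tail)
```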

For Python, which follows Dijkstra for its slice delimiters, I need to draw a picture for beginners (I feel like the NumPy documentation used to have this picture, but not anymore). The Julia option sometimes requires you to type an extra +1, but it's a meaningful +1 ("start at the next element") and even beginners never get this wrong.

That said, it seems to me that for Lua, with its focus on embedding in the C world, 0-based indexing makes more sense.

MattJ100 1 year ago

I admire Dijkstra for many things, but this has always been a weak argument to me. To quote:

"when starting with subscript 1, the subscript range 1 ≤ i < N+1; starting with 0, however, gives the nicer range 0 ≤ i < N"

So it's "nicer", ok! Lua has a numeric `for` loop, which doesn't require this kind of range syntax. The loop takes x, y, step, where x and y are both inclusive in the range, i.e. Dijkstra's option (c). Dijkstra doesn't like this because iterating the empty set is awkward. But it's far more natural (if you aren't already used to languages from the 0-indexed lineage) to simply specify the lower and upper bounds of your search.
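
To make the two conventions concrete, here is a rough Python sketch. `inclusive_range` is a hypothetical helper (not a built-in) that imitates a loop with inclusive bounds for a positive step; Python's own `range` is half-open.

```python
def inclusive_range(lo, hi, step=1):
    """Hypothetical helper: both bounds inclusive, positive step only."""
    return range(lo, hi + 1, step)

N = 5
inclusive = list(inclusive_range(1, N))  # [1, 2, 3, 4, 5]
half_open = list(range(0, N))            # [0, 1, 2, 3, 4]

# The empty iteration is where the conventions differ:
# inclusive bounds need hi == lo - 1, half-open bounds need hi == lo.
empty_inclusive = list(inclusive_range(1, 0))  # []
empty_half_open = list(range(0, 0))            # []

print(inclusive, half_open, empty_inclusive, empty_half_open)
```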

I work with Lua all the time, alongside 0-indexed languages such as C and JS. I believe 0 makes sense in C, where arrays are pointers and the subscript is actually an offset. That still doesn't make the 1st item the 0th item.

Between this, and the fact that, regardless of language, I find myself having to add or subtract 1 frequently in different scenarios, I think it's less of a deal than people make it out to be.

saurik 1 year ago

In any language, arrays are inherently regions of memory and indexes are -- whether they start at 0 or 1 -- offsets into that region. When you implement more complicated algorithms in any language, regardless of whether it has pointers or how arrays are syntactically manipulated, you start having to do mathematical operations on both indexes and ranges of indexes, and it feels really important to make these situations easier.

If you then even consider the simple case of nested arrays, I think it becomes really difficult to defend 1-based indexing as being cognitively easier to manipulate, as the unit of "index" doesn't naturally map to a counting number like that... if you use 0-based indexes, all of the math is simple, whereas with 1-based you have to rebalance your 1s depending on "how many" indexes your compound unit now represents.
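
One way to read the nested-array point is a 2-D grid stored row-major in a flat array. That reading, the width `NCOLS`, and the function names below are assumptions for illustration only, sketched in Python:

```python
NCOLS = 4  # assumed row width for this illustration

def flat_index_0(row, col):
    """0-based (row, col) -> 0-based flat index."""
    return row * NCOLS + col

def flat_index_1(row, col):
    """1-based (row, col) -> 1-based flat index."""
    # Shift both coordinates down by one, combine, then shift back up.
    return (row - 1) * NCOLS + (col - 1) + 1

def unflatten_0(i):
    """0-based flat index -> 0-based (row, col)."""
    return divmod(i, NCOLS)

def unflatten_1(i):
    """1-based flat index -> 1-based (row, col)."""
    row, col = divmod(i - 1, NCOLS)
    return row + 1, col + 1

# The same cell (third row, fourth column) in both conventions.
print(flat_index_0(2, 3), flat_index_1(3, 4))
```

The 1-based versions are the same formulas with the extra 1s moved in and out around the multiplication.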

Certhas 1 year ago

And the reason to dismiss c) and d) is so that the difference between the delimiters is the length. That's not exactly profound either.

If the same argument, word for word, were made by an anonymous blogger, no one would even consider citing it as a definitive argument that ends the discussion.
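
For concreteness, here is the length arithmetic for the four conventions, sketched in Python. The function names are made up, and the sequence 2, 3, ..., 12 (eleven integers) is used as the worked case.

```python
def length_a(lo, hi):
    """a) lo <= i < hi"""
    return hi - lo

def length_b(lo, hi):
    """b) lo < i <= hi"""
    return hi - lo

def length_c(lo, hi):
    """c) lo <= i <= hi"""
    return hi - lo + 1

def length_d(lo, hi):
    """d) lo < i < hi"""
    return hi - lo - 1

# The integers 2 through 12, written with each convention's bounds.
print(length_a(2, 13), length_b(1, 12), length_c(2, 12), length_d(1, 13))
```

Only a) and b) give the length as the plain difference of the bounds; c) and d) are off by one in opposite directions.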

nmz 1 year ago

Associative arrays are also "nicer" with 1-based indexing:

  t = {}
  t[#t+1] = "x"                  -- 1-based append: lands in t[1], DONE!
  t[#t==0 and 0 or #t+1] = "x"   -- what a 0-based append would need instead

Now, to fix this (or get the same behavior as `#t+1`), the length operator would have to return -1 for an empty table, which would be ridiculous. It's empty, and we have a perfect representation of emptiness in math: "0".

The same is true in awk; nobody ever whines about awk "arrays" not starting at 0.