Comment by dmz73
1 year ago
I have a really hard time understanding why people like 0 based indexes. They are a relic of C style arrays that are based on and interchangeable with pointers which use offsets that are naturally 0 based. Use in later languages gives us endless off-by-1 issues and gives rise to "for 0 to count/len/num - 1" or even better range syntax that is start inclusive BUT end exclusive. It is a horrible kludge just to support a 1970s language performance optimization. Arrays should start and end at whatever start index is required, not at offset 0 of a pointer to the first element of the array.
Hang on. Off by one issues are the argument frequently given in favour of zero-based indices, not the other way around. For example, let's iterate through items placing them in 3 different groups:
JS:
Lua:
Don't get me wrong. I like Lua, I've made my own IDE for it, https://plugins.jetbrains.com/plugin/14698-luanalysis, but this is definitely not an argument in favour of 1-based indices.
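A minimal sketch of the index arithmetic at issue, written in Python rather than taken from the commenter's own snippets. The helper names `group_index_zero_based` and `group_index_one_based` are hypothetical, and the sketch assumes every three consecutive items share a group:

```python
GROUP_SIZE = 3

def group_index_zero_based(i: int) -> int:
    """Group number (0-based) for the item at 0-based position i."""
    return i // GROUP_SIZE

def group_index_one_based(i: int) -> int:
    """Group number (1-based) for the item at 1-based position i."""
    # Shift to 0-based, divide, then shift back to 1-based.
    return (i - 1) // GROUP_SIZE + 1

# Seven items under each convention.
zero_based = [group_index_zero_based(i) for i in range(0, 7)]
one_based = [group_index_one_based(i) for i in range(1, 8)]
```

The 1-based version needs one subtraction and one addition that the 0-based version does not, which is the kind of adjustment the off-by-one argument is about.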
Off by one issues are also an argument given in favour of no indexing.
Array languages typically have a reshaping operator so that you can just do something like:
Does that seem so strange? 0N is just null. numpy has ...reshape([3,-1]) which wouldn't be so bad in a hypothetical numjs or numlu; I think null is better, so surely this would be nice:
Such a function could hide an ugly iteration if it were performant to do so. No reason for the programmer to see it every day. Working at rank is better.
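As a rough illustration of such a function, here is a pure-Python sketch. The name `reshape` and its signature are assumptions made for this example (it is not numpy's API); `None` marks the dimension to infer, in the spirit of the "null is better" suggestion:

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")

def reshape(items: Sequence[T], rows: Optional[int], cols: Optional[int]) -> List[List[T]]:
    """Split items into rows; one of rows/cols may be None and is inferred."""
    n = len(items)
    if rows is None and cols is None:
        raise ValueError("only one dimension may be left as None")
    if rows is None:
        if cols <= 0 or n % cols != 0:
            raise ValueError("length is not a multiple of cols")
        rows = n // cols
    elif cols is None:
        if rows <= 0 or n % rows != 0:
            raise ValueError("length is not a multiple of rows")
        cols = n // rows
    if rows * cols != n:
        raise ValueError("shape does not match number of items")
    # The only index arithmetic lives here, hidden from the caller.
    return [list(items[r * cols:(r + 1) * cols]) for r in range(rows)]

groups_of_three = reshape(list(range(6)), None, 3)
three_groups = reshape(list(range(6)), 3, None)
```

The caller never writes an index, so the choice of base does not show up at the call site.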
On the other hand, Erlang is also 1-based, and there's no numerl I know of, so I might write:
I don't think that's too bad either, and it seems straightforward to translate to lua. Working backwards maybe makes the 1-based indexing a little more natural.
Does that seem right? I don't program in lua very much these days, but the ugly thing to me is the for-loop and how much typing it is (a complaint I also have about Erlang), not the one-based nature of the index I have in exactly one place in the program.
The cool thing about one-based indexes is that 0 meaningfully represents the position before the first element or not-an-element. If you use zero-based indexes, you're forced to either use -1 which precludes its use for referring to the end of the list, or null which isn't great for complicated reasons. There are other mathematical reasons for preferring 1-based indexes, but I don't think they're as cool as that.
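A small sketch of the sentinel problem in Python, where -1 is both a common "not found" value and a valid index. The functions `find_zero_based` and `find_one_based` are hypothetical helpers written for this illustration:

```python
def find_zero_based(items, target):
    """0-based position of target, or -1 if absent."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

def find_one_based(items, target):
    """1-based position of target, or 0 if absent."""
    for i, item in enumerate(items, start=1):
        if item == target:
            return i
    return 0

letters = ["a", "b", "c"]
missing = find_zero_based(letters, "z")
# -1 is a valid index in Python, so the failed lookup silently yields "c".
silently_wrong = letters[missing]
# With 1-based positions, 0 is falsy and doubles as "not found".
found = bool(find_one_based(letters, "z"))
```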
Yes, that is what is so frustrating about this argument every single time it comes up, because both sides in the debate can be equally false, or equally true, and it's really only a convention and awareness issue, not a language fault.
It’s such a binary, polarizing issue too, because .. here we are again as always, discussing reasons to love/hate Lua for its [0-,non-0] based capabilities/braindeadednesses..
In any case, I for one will be kicking some Luon tires soon, as this looks to be a delightful way to write code .. and if I can get something like TurboLua going on in TurboLuon, I’ll also be quite chuffed ..
Your second example subtracts and adds 1 nearly arbitrarily, which wouldn't be needed if the convention of the 0-index wasn't so widespread.
You need the first three elements to go into the first group, the next three to go into the second group, and so on. How would you write it?
why not just iterate in steps of three over items for each next group? seems a bit contrived.
Because it's a simplified example to demonstrate the problem. If you do as you've described you need three separate assignments. What happens when the number of groups is dynamic? Nested loop? This is suddenly getting a lot more complicated.
Having done fairly extensive parsing work in Lua and Julia on the one hand (one-based), and Python, JavaScript, and Zig on the other (zero-based), the zero-based semiopen standard makes intervals dramatically easier to calculate and work with. It's really the semiopen intervals which make this the case, but as the Word of Dijkstra makes clear, zero-basis comes along for the ride; to combine semiopen intervals with a one-basis is perverse.
Naturally it's true that for collections and naïve indexing, 1-based is more natural. But those are rare places for bugs to occur, while interval calculations are a frequent place for them to occur.
Clearly I'm far from allergic to the other standard, but I come down on the side of the zero basis for that reason.
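To make the interval point concrete, here is a hedged sketch of how token spans tend to look in a 0-based, semiopen style. The `Span` type is hypothetical and not taken from any particular parser:

```python
from typing import NamedTuple

class Span(NamedTuple):
    """Half-open, 0-based span [start, end) into a source string."""
    start: int
    end: int

    def length(self) -> int:
        # No +1 or -1 correction is needed.
        return self.end - self.start

    def text(self, source: str) -> str:
        return source[self.start:self.end]

    def adjacent_to(self, other: "Span") -> bool:
        # Neighbouring spans share a boundary value.
        return self.end == other.start

source = "let x"
keyword = Span(0, 3)
space = Span(3, 4)
name = Span(4, 5)
empty = Span(5, 5)  # zero-width span at end of input, no special case
```

With closed 1-based intervals the same operations would need `end - start + 1` for length and `end + 1 == other.start` for adjacency, and an empty span would have an end smaller than its start.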
I have a really hard time understanding why people like 1-based indexes! 0 is the smallest unsigned integer in every programming language I know of that supports the concept of unsigned integer. Why shouldn’t an array at the smallest possible index correspond to the beginning of the array?
It’s also very natural to think of arr[i] as “i steps past the beginning of arr”. With one-based indexing arr[i] has no natural interpretation that I know of. It’s “i-1 (for some reason) steps past the beginning of arr”. The only reason I can think of to prefer that extra -1 in your formula is just because human languages (at least the ones I know of) work this way — the 42nd element of a sequence, in normal colloquial English, means the one 41 steps past the beginning. But I’m not sure if there is any logical justification for that.
I also, despite being American, find the convention used in many countries of numbering building floors starting with zero to be more logical. I’m on the third floor, how many stories up did I travel to get here? Three.
> Why shouldn’t an array at the smallest possible index correspond to the beginning of the array?
Because then there is no good way to refer to the index before that point: You are stuck using -1 (which means you can't use it to refer to the end of the array), or null (which isn't great either).
> every programming language I know of that supports the concept of unsigned integer
Surely you know Python which uses a signed integer as an index into their arrays: list[-1] is the last element of a list. If they only used one-based indexing then list[1] would be the first and that would be nicely symmetrical. It would also mean that list[i-1] would NEVER refer to a value after ‹i› eliminating a whole class of bugs.
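The class of bug being described can be reproduced in today's 0-based Python. This is a small sketch; `previous_item` is a hypothetical helper written naively on purpose:

```python
def previous_item(items, i):
    """Return the element before 0-based position i (naive version)."""
    return items[i - 1]

values = [10, 20, 30]
ok = previous_item(values, 2)
# For i == 0 the index becomes -1, which wraps to the last element
# instead of raising an error.
wrapped = previous_item(values, 0)
```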
> It’s also very natural to think of arr[i] as “i steps past the beginning of arr.”
I think it's more natural to think of arr[i] as “the ‹i›th element of arr” because it doesn't require explaining what a step is or what the beginning is.
The exact value of ‹i› matters very little until you try to manipulate it: Starting array indexes at one and using signed indexes instead of unsigned means less manipulation overall.
> find the convention used in many countries of numbering building floors starting with zero to be more logical
In Europe, we typically mark the ground-floor as floor-zero, but there are often floors below it just as there are often floors above it, so the floors might be numbered "from" -2 for example in a building with two below-ground floors. None of this has anything to do with arrays, it's just that things like "LG" or "B" for "lower ground" or "basement" don't translate very well to the many different languages used in Europe.
The software in the elevator absolutely doesn't "start" its array of sense-switches in the middle (at zero).
> In Europe, we typically mark the ground-floor as floor-zero,
_Western_ Europe. Eastern Europe prefers 1-based numbering. The reason, typically assumed, is that thermal insulation, required due to colder winters, causes at least one stair segment between entrance and the sequentially first floor.
Python might have used array[~0] instead, where ~ is required, to indicate end-of-list 0-based indexing.
But I guess they wanted to iterate from the end back [-1] to the start [0], making it easy to implement a rotating buffer.
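As it happens, Python already accepts this spelling, because `~i` equals `-i - 1` for integers, so `~0` is just another way to write `-1`. A short sketch:

```python
values = ["a", "b", "c", "d"]

first = values[0]
last = values[~0]         # ~0 == -1
second_last = values[~1]  # ~1 == -2

# Pair each element with its mirror image counted from the other end.
mirrored = [(values[i], values[~i]) for i in range(len(values) // 2)]
```

Read this way, `values[i]` counts 0-based from the front and `values[~i]` counts 0-based from the back.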
> I think it's more natural to think of arr[i] as “the ‹i›th element of arr” because it doesn't require explaining what a step is or what the beginning is.
Yes, but if you will eventually need to do steps on your array, you had better opt for the framework that handles them better. I agree that if your only task is to name them, then 1-based indexing makes more sense: you have done that since you were in diapers, and you do it with fewer errors.
In India too, the floor at the ground level is called the ground floor (probably that is where the name came from), the one above it is called the first floor, and so on. The convention is probably from British colonial times.
Also LED floor numbers in lifts (elevators) in India start from 0 for the ground floor, as do the buttons that you press to go to specific floors.
Also, Ground Zero.
https://en.m.wikipedia.org/wiki/World_Trade_Center_site
https://en.m.wikipedia.org/wiki/Hypocenter#
> I also, despite being American, find the convention used in many countries of numbering building floors starting with zero to be more logical. I’m on the third floor, how many stories up did I travel to get here? Three.
Alternatively the ground floor is the first floor because it’s the first floor you arrived at when you entered the building.
The same point of view applies to 1-based indexing.
That said I prefer 0-based in programming and 1-based in buildings.
I never understood why they didn't picture the building, with the buttons and the room/apartment numbers at each floor... That would make all conventions clear. Going negative would be obvious, and they could just indicate which floor the elevator is at with LEDs or backlighting.
They never heard of making a UI, and just slapped buttons.
> find the convention used in many countries of numbering building floors starting with zero to be more logical.
Ukrainian here. Multi-floor buildings always have at least one stair section to the first floor due to the need for thermal insulation of the basement. (I guess this does not apply to Western Europe due to more clement winters.) And, yep, it is called the "first" floor. Using the number zero is rare but possible (in this case it is called the "tsokolny" floor) if a real "basement floor" is present, but even in this case 1-based numbering is preferred.
I'd argue that 1-based indexing is the "natural interpretation". Mathematics is inherently 1-based, and it isn't surprising that languages designed to do mathematics like R, Matlab, Mathematica, Julia all do 1-based arrays because that makes modeling paper mathematics in programs easier.
Sequences in math start with 1 by convention, not for any fundamental logical reason. It’s a reach to say that math is “inherently 1-based”.
> [0-based indexes] are a relic of C style arrays
I don't think this is true. They exist in other disciplines (maths for instance) that have no relationship with C or other programming languages from the 1970s.
> for 0 to count/len/num - 1
I will counter saying that such a for...to syntax is a relic of BASIC.
> or even better range syntax that is start inclusive BUT end exclusive
I know that your "better" is sarcastic, but I actually find left-inclusive+right-exclusive ranges fantastic. They allow perfect partitioning, easy calculation of length, etc.
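A small sketch of the partitioning property, assuming a hypothetical helper `chunk_bounds` that splits the range [0, n) into k contiguous pieces (k must be positive):

```python
def chunk_bounds(n: int, k: int) -> list:
    """Split the half-open range [0, n) into k contiguous half-open chunks."""
    cuts = [i * n // k for i in range(k + 1)]
    # Each cut is the exclusive end of one chunk and the inclusive
    # start of the next, so nothing overlaps and nothing is skipped.
    return list(zip(cuts, cuts[1:]))

bounds = chunk_bounds(10, 3)
lengths = [end - start for start, end in bounds]
```

Each length is simply `end - start`, and the lengths sum to `n` without any correction terms.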
> Arrays should start and end at whatever start index is required
I agree. An accommodating language would let you define both lower and upper bounds of an array, instead of its size.
IIRC some BASIC(s) I've used in the past had a statement called:
OPTION BASE 1
or something like that, to change the starting index to 1.
APL has ⎕IO←0 or ⎕IO←1 to change the starting index (only between 0 or 1, not arbitrarily). It doesn't apply system-wide so different code blocks/files/modules(?) can set or reset it, and portable code has to either set it or adjust for it.
APLCast podcast has an episode mentioning it where they all seem to agree that this is the worst of all worlds, makes sharing code and integrating codebases needlessly bug-prone, and the language picking a single indexing and sticking to it would have been better, even if the choice hadn't gone the way they would have personally chosen.
There are 360 degrees in a circle, and the first entry is 0 degrees. The first time element of a day is 0:00:00(and enough 0s to satisfy whatever resolution you require). These were not established in the 1970s, and somehow pretty much everyone understands and works quite well with these systems.
> There are 360 degrees in a circle, and the first entry is 0 degrees.
To be pedantic, "first" is associated with 1. And a circle does not have a "first" entry, whatever you mean by entry. I think what you're trying to say is that a circle is a continuous arc going from 0 to 360 degrees, but you should recognize that the "starting point" is arbitrary, any point will do, so there isn't really a "first", and that this is not the same as counting because counting is done with natural numbers, which are non-continuous. The problem of 0 VS 1 makes sense only in counting exactly because it's subjective whether you prefer to count from 0 or from 1. Because zero is the absence of anything, I find it hard to start counting from 0 (when you do, your "first" item is actually your zeroth item, and the next item would be the "first"??!), to be honest, despite being completely familiar with doing so since I've used 0-index programming languages my whole life.
If you cut up a circle into n slices (maybe you're drawing a diagram on screen), it's vastly more helpful to think of one of the segments as segment 0 because then the start angle of every segment is index*360/n and the two segments whose border is at your initial angle are the first and last. If you start counting segments at 1, your “first” segment would be some way into the circle, and the two segments whose border is at your initial angle would be the last and the second-last.
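The formula can be sketched directly. Both function names here are hypothetical; the second shows the extra `- 1` that appears once segments are numbered from 1:

```python
def segment_start_angles(n: int) -> list:
    """Start angle in degrees of each of n equal segments, numbered from 0."""
    return [index * 360 / n for index in range(n)]

def segment_start_angle_one_based(index: int, n: int) -> float:
    """Same start angle when segments are numbered from 1."""
    return (index - 1) * 360 / n

angles = segment_start_angles(4)
```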
No, "first" implies a sequence and is associated with the beginning of that sequence. In the case of a relative heading, the existing heading is 0 degrees. Any change is relative to that origin point. Zero is also not the absence of anything, that would more properly be considered a NULL or NaN.
Also, how many years old are you when you're born? Zero. (at least in mainstream Western culture).
Some countries consider the 1st floor to be the ground floor, others consider the 1st floor to be the floor above the ground floor, which the formerly mentioned countries consider the 2nd floor… I think 0/1-based indexing is more subjective than simply being a “relic of C” or a “horrible kludge” :P
I've been in the US for over a decade and it still occasionally makes me double-take when a room numbered 1xx is on the ground floor
Here’s the ultimate authority on why computer languages should count from zero:
<https://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF>
I find that argument to be written in a terse "mathy" style that makes it a bit hard to follow. So let me try to restate it in more concrete "programmy" terms.
To iterate over an array with "len" elements, it’s most elegant if “len” appears as a loop bound, rather than "len+1" or "len-1". Thus, in 0-based languages we use half-open ranges, whereas in 1-based languages we use closed ranges:
But the second is inelegant when len is zero, because 0 isn’t a valid index at all, so it’s weird for it to appear as a bound.
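A sketch of the two loop shapes in Python, offered as an illustration rather than the commenter's original snippets. The function names are hypothetical, and the closed range is written with an explicit `<=` so both bounds are visible:

```python
def indices_half_open(length: int) -> list:
    """0-based indices: the half-open range [0, length)."""
    out = []
    i = 0
    while i < length:
        out.append(i)
        i += 1
    return out

def indices_closed(length: int) -> list:
    """1-based indices: the closed range [1, length]."""
    out = []
    i = 1
    while i <= length:
        out.append(i)
        i += 1
    return out

# With length == 0 the closed range runs "from 1 to 0", and 0 is not
# a valid 1-based index; the half-open range runs "from 0 up to 0".
empty_half_open = indices_half_open(0)
empty_closed = indices_closed(0)
```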
Yeah, I disagree with Dijkstra on this. And many other things.
Dijkstra, being one of a handful of luminaries in the field of computer science – indeed, he can be said to have created the field itself – can be (provisionally) taken at his word when he claims something. You, on the other hand, being an anonymous user on a discussion forum, will have to present some pretty strong arguments for the rest of us to take you seriously. Your mere disagreement counts for approximately nothing.
> They are a relic of C style arrays
Doesn't it predate that by a good amount? I would think it is a relic of the EEs who built the digital world, those early languages show a great deal more relation to the bare metal than modern languages. Creating an array whose index starts at 1 just doesn't make sense from the discrete logic point of view, you are either wasting an element or adding in an extra step.
But in this day and age how can a language not have ⎕IO ← 0?
I honestly think that most of the problem arises from the fact that we just culturally start counting at 1 when talking about everyday things. As it stands, we're all used to it that way, and then computers come along and show us that counting from 0 is often more useful. So we adjust, but only for computer programming purposes.
If our species had established counting from 0 as the norm right away (element #n is the one that has n elements before it; you think of the number as the number of steps you have to move away from the starting point), then I suspect the reverse would not be true: I don't think anyone would find a situation in which counting from 1 is so much more convenient that it's worth going against the grain of the established norm.
So in summary, I think we only think of counting from 1 as natural because it's in our culture. And it's in our culture because ancient superstitious humans had an irrational problem with the number 0.
I absolutely agree with you. People want to start with 1 because English (and presumably a lot of other languages) happen to use the word "first" to refer to the first element of a sequence, and not for any logical reason independent of arbitrary human language.
Slam! Now this guy really knows how to hate on a zero based index!
It's funny that nearly half of all comments are below your comment. The topic seems to unsettle people much more than a new programming language. This is also another example of how a downvote button is primarily misused in practice.
> Arrays should start and end at whatever start index is required
That's what you were indeed able to do with Pascal and also Modula-2, but with Oberon, Wirth came to the conclusion that index ranges other than 0..n-1 were not needed. In his 1988 paper "From Modula to Oberon" he considers it "inessential" and providing "hardly any additional expressive power", but causing "a hidden computational effort that is incommensurate with the supposed gain in convenience". I think, in the end, it is in the eye of the beholder.
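For a sense of what an arbitrary lower bound involves, here is a rough Python sketch of a Pascal-style array. `BoundedArray` is a hypothetical class, and treating the per-access subtraction as the "hidden computational effort" is one plausible reading, not something the quoted paper is confirmed to say in these terms:

```python
class BoundedArray:
    """Fixed-size array indexed from `low` to `high` inclusive."""

    def __init__(self, low: int, high: int, fill=0):
        if high < low - 1:
            raise ValueError("high must be at least low - 1")
        self.low = low
        self.high = high
        self._data = [fill] * (high - low + 1)

    def _offset(self, index: int) -> int:
        if not self.low <= index <= self.high:
            raise IndexError(f"index {index} outside {self.low}..{self.high}")
        # The subtraction every access pays for a non-zero lower bound.
        return index - self.low

    def __getitem__(self, index: int):
        return self._data[self._offset(index)]

    def __setitem__(self, index: int, value) -> None:
        self._data[self._offset(index)] = value

    def __len__(self) -> int:
        return len(self._data)

# Floors numbered from -2 (two basement levels) up to 5.
floors = BoundedArray(-2, 5, fill="")
floors[-2] = "parking"
floors[0] = "lobby"
```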
Dijkstra said that 0 was better for reasons.