Comment by pron

2 months ago

> But I would say if you know a value will logically always be >= 0, better to have a type that reflects that.

Except that's not quite what unsigned types do. They are not (just) numbers that will always be >= 0, but numbers where the value of `1 - 2` is > 1 and depends on the type. This is not an accident but how these types are intended to behave because what they express is that you want modular arithmetic, not non-negative integers.
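
To make the "depends on the type" part concrete, here is a minimal C sketch. The function names are made up for illustration, and the 16-bit case assumes a platform where `int` is wider than 16 bits, so the operands are promoted before the subtraction.

```c
#include <stdint.h>

/* Hypothetical helpers: the same expression, 1 - 2, at different unsigned widths. */

/* 32-bit: the result is reduced modulo 2^32, giving 2^32 - 1. */
uint32_t one_minus_two_u32(void) {
    uint32_t a = 1, b = 2;
    return a - b;
}

/* 64-bit: the result is reduced modulo 2^64, giving 2^64 - 1. */
uint64_t one_minus_two_u64(void) {
    uint64_t a = 1, b = 2;
    return a - b;
}

/* 16-bit: assuming int is wider than 16 bits, both operands are promoted
 * to int, the subtraction yields -1, and the conversion back to uint16_t
 * reduces it modulo 2^16. */
uint16_t one_minus_two_u16(void) {
    uint16_t a = 1, b = 2;
    return (uint16_t)(a - b);
}
```

In all three cases the result is the largest value of the type, so `1 - 2` is a different number for each width.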

> e.g. "x--" won't compile without explicitly supplying a case for x==0

If you want non-negative types (which, again, is not what unsigned types are for) you also run into difficulties with `x - y`. It's not so simple.

There are many useful constraints that you might think it's "better to have a type that reflects that" - what about variables that can only ever be even? - but it's often easier said than done.


norir 2 months ago

This is true, which means that a language has to be designed from the ground up to deal with these problems, or there will always be inscrutable bugs due to misuse of arithmetic results. A simple example in a C-like language would be that the following function would not compile:

    unsigned foo(unsigned a, unsigned b) { return a - b; }

but this would:

    unsigned foo(unsigned a, unsigned b) {
      auto c = a - b;
      return c >= 0 ? c : 0;
    }

Assuming 32-bit unsigned and int, the type of c should be computed as the range [-0xffffffff, 0xffffffff], which is different from int [-0x80000000, 0x7fffffff]. Subtle things like this are why I think it is generally a mistake to type annotate the result of a numerical calculation when the compiler can compute it precisely for you.
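
For comparison, in C as it exists today `a - b` on two `unsigned` operands is itself `unsigned`, so a test like `c >= 0` is always true and the clamp never fires. A rough sketch of two ways to get the clamped result for 32-bit operands follows. The function names are hypothetical, and the widening version only works because a wider signed type happens to exist.

```c
#include <stdint.h>

/* Hypothetical name: subtract, clamping at zero instead of wrapping.
 * Widens to a signed type that can hold every value in
 * [-0xffffffff, 0xffffffff] before comparing against zero. */
uint32_t sub_clamped_wide(uint32_t a, uint32_t b) {
    int64_t c = (int64_t)a - (int64_t)b;
    return c >= 0 ? (uint32_t)c : 0;
}

/* Same result without a wider type: compare first, and only
 * subtract when the difference cannot go below zero. */
uint32_t sub_clamped_cmp(uint32_t a, uint32_t b) {
    return a >= b ? a - b : 0;
}
```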

pron 2 months ago

First, your code is about having unsigned types represent the notion of non-negative values, but this is not the intent of unsigned types in C/C++. They represent modular arithmetic types.
Second, it's not as simple as you present. What is the type of c? Obviously it needs to be signed so that you could compare it to zero, but how many bits does it have? What if a and b are 64 bit? What if they're 128 bit?
You could do it without storing the value and by carrying a proof that a >= b, but that is not so simple, either (I mean, the compiler can add runtime checks, but languages like C don't like invisible operations).

Groxx 2 months ago

That's true for signed numbers too though? `int_min - 2 > int_min`

I agree they're a bit more error-prone in practice, but I suspect a huge part of that is because people are so used to signed numbers because they're usually the default (and thus most examples assume signed, if they handle extreme values correctly at all (much example code does not)). And, legitimately, zero is a more commonly-encountered value... but that can push errors to occur sooner, which is generally a desirable thing.

pron 2 months ago

> That's true for signed numbers too though? `int_min - 2 > int_min`

As someone else already pointed out, that's undefined behaviour in C and C++ (in Java they wrap), but the more important point is that the vast majority of integers used in programs are much closer to zero than to int_min/max. Sizes of buffers etc. tend to be particularly small. There are, of course, overflow problems with signed integers, but they're not as common.

oasisaimlessly 2 months ago

> That's true for signed numbers too though? `int_min - 2 > int_min`

No, that's undefined behavior in C, and if you care about correctness, you run at least your testsuite in CI with -ftrapv so it turns into an abort().
- Groxx 2 months ago
  
  Which makes them even less safe than unsigned, where it is defined, yes? The optimizations that can lead to are incredibly hard to predict.
  Besides, for safety there are much clearer options, like wrapping_add / saturating_add. Aborting is great as a safety tool though, agreed - it'd be nice if more code used it.
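
For anyone who wants those explicit operations in C, here is a rough sketch for `int`. It assumes GCC or Clang, since `__builtin_add_overflow` is a compiler extension rather than standard C, and the function names are simply borrowed from the Rust-style names mentioned above.

```c
#include <limits.h>
#include <stdbool.h>

/* Wrapping add: the builtin stores the wrapped result even when the
 * mathematical sum does not fit, so the return value is ignored here. */
int wrapping_add_int(int a, int b) {
    int r;
    (void)__builtin_add_overflow(a, b, &r);
    return r;
}

/* Saturating add: on overflow both operands have the same sign,
 * so the sign of b tells us which limit to clamp to. */
int saturating_add_int(int a, int b) {
    int r;
    if (__builtin_add_overflow(a, b, &r))
        return b > 0 ? INT_MAX : INT_MIN;
    return r;
}

/* Checked add: returns true and writes the sum to *out on success,
 * returns false if the sum would overflow. */
bool checked_add_int(int a, int b, int *out) {
    return !__builtin_add_overflow(a, b, out);
}
```

None of these rely on signed overflow itself, so the undefined behaviour never comes into play.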
  

ks2048 2 months ago

> you also run into difficulties with `x - y`.

If you have "uint x" and "uint y", then for "x - y", the programmer should explicitly write two cases (a) no underflow, i.e. x >= y, and (b) underflow, x < y. The syntax for that... that is an open question.
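
Without new syntax, one way to approximate this in plain C is to return a result that the caller has to inspect before using the difference. This is only a sketch: `sub_result` and `checked_sub` are hypothetical names, and nothing in C actually forces the caller to check `ok`.

```c
#include <stdbool.h>

/* Hypothetical result type: value is only meaningful when ok is true. */
typedef struct {
    bool ok;
    unsigned value;
} sub_result;

/* Case (a), x >= y: no underflow, return the difference.
 * Case (b), x < y: underflow, report failure instead of wrapping. */
sub_result checked_sub(unsigned x, unsigned y) {
    if (x >= y)
        return (sub_result){ true, x - y };
    return (sub_result){ false, 0 };
}
```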

> what about variables that can only ever be even

Yes, maybe you should have an "EvenInt" type, if that is important. Maybe you should be able to declare a variable to be 7...13, just like a "uint8" can declare something 0...255. Of course, the type-checker can get complicated, and perhaps simply fail to type-check some things. But, having compile-time constraints to what you know your variables will be is good, IMHO.

renox 2 months ago

Note that in Zig, unsigned integers have the same semantics as signed integers on overflow (trap, wrap, or UB). You also have operators providing wrapping. That is the correct solution.