Comment by ncruces

9 hours ago

It's not really signed vs unsigned that's the issue, IMO. It's (mostly, in C) undefined behavior and implicit conversions?

I'm not sure Go is saner just because len is an int. Well, maybe, depending on how you look at it. Defining len to be a signed int means the largest valid len is half your address space, which also means half of all possible indexes are always invalid, which makes some things easier.

But it's really that integer arithmetic is not undefined behavior regardless of signedness, that bounds are checked, and that even indexing your slice with an int64 on a 32-bit CPU does the full correct bounds check. In fact, you can use any integer type as an index.

Given all of the above, indexing with a uint or an int makes no difference. Either way, the bounds check is a single unsigned <len compare (despite the fact that len is signed).
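
A minimal Go sketch of the two points above: any integer type works as an index, and the two-sided signed check agrees with a single unsigned compare. The helper names inBoundsSigned and inBoundsUnsigned are made up for illustration; this mirrors the idea, not the exact code the compiler emits.

```go
package main

import "fmt"

// inBoundsSigned is the textbook two-sided check on a signed index.
func inBoundsSigned(i, n int) bool {
	return 0 <= i && i < n
}

// inBoundsUnsigned folds both comparisons into one unsigned compare:
// a negative i converts to a huge uint, which can never be below n,
// because a valid length n is never negative.
func inBoundsUnsigned(i, n int) bool {
	return uint(i) < uint(n)
}

func main() {
	s := []string{"a", "b", "c"}

	// Any integer type is accepted as an index.
	var i8 uint8 = 1
	var i64 int64 = 2
	fmt.Println(s[i8], s[i64]) // b c

	// Both checks agree for negative, valid, and too-large indexes.
	for _, i := range []int{-1, 0, 2, 3} {
		fmt.Println(i, inBoundsSigned(i, len(s)), inBoundsUnsigned(i, len(s)))
	}
}
```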

What's really painful is trying to handle a full 32-bit address space with 32-bit addresses and sizes, like in Wasm; you need 33-bit math. So in a sense, limiting sizes to 31 bits (signed) does help. But at the language level, IMO, the rest matters more.
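
To make the 33-bit point concrete, here is a rough Go sketch of a Wasm-style range check. The function names and the 64 KiB memory size are assumptions for the example, not anything from a real runtime.

```go
package main

import "fmt"

// naiveInBounds does the check in 32 bits. addr+size can wrap around,
// so an access that runs past the end may look valid.
func naiveInBounds(addr, size, memSize uint32) bool {
	return addr+size <= memSize
}

// wideInBounds widens to 64 bits first, so the sum cannot wrap.
// Taking memSize as uint64 also lets a full 4 GiB memory (1<<32) be expressed.
func wideInBounds(addr, size uint32, memSize uint64) bool {
	return uint64(addr)+uint64(size) <= memSize
}

func main() {
	const memSize = 1 << 16 // hypothetical 64 KiB memory
	addr, size := uint32(0xFFFFFFF0), uint32(0x20)

	fmt.Println(naiveInBounds(addr, size, memSize)) // true: the sum wrapped to 0x10
	fmt.Println(wideInBounds(addr, size, memSize))  // false
}
```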

For signed overflow we have sanitizers, and for conversions C compilers have warnings. Bounds checking can also be done with sanitizers (but is a bit more tricky). So no, I do not think the undefined behavior is really a big problem. In fact, it helps us find the problem because every overflow can be considered a programming error.

Errors due to unsigned wraparound are a much bigger issue, because they lead to subtle bugs where neither automatic warnings nor sanitizers help, exactly because the behavior is well-defined and no automatic tool can tell whether it is intended or wrong.
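
A small sketch of the kind of silent wraparound meant here, written in Go, where unsigned arithmetic is defined to wrap as well. The remaining helper is hypothetical.

```go
package main

import "fmt"

// remaining is a hypothetical helper: how much of a budget is left.
// If used > budget, the subtraction wraps instead of failing, and
// nothing can flag it, because wrapping is the defined behavior.
func remaining(budget, used uint32) uint32 {
	return budget - used
}

func main() {
	fmt.Println(remaining(100, 40))  // 60
	fmt.Println(remaining(100, 101)) // 4294967295
}
```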

  • Do you always run with those sanitizers in place?

    Just this week I had a C compiler silently delete an entire function call on me because of UB (infinite loop without side effects). Took me a day to figure out. So that's a problem for me.

    I don't think I've ever had a hard-to-debug issue in Go because of signed/unsigned wraparound. Particularly a memory issue.

    If anything, and there I guess I agree with the article, I wish Go had implicit conversions to wider types: to make the problematic ones stand out.

    I guess the reason it doesn't is that they're different named types, which would be a problem when you create a named type for the purpose of forcing explicit type conversions. But maybe the default ones could implicitly implement a numeric tower, where exact conversions can be implicit.
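
    For reference, a sketch of how this looks in Go today: the exact widening and the lossy narrowing are both spelled as explicit conversions, so neither stands out. The function names widen and narrow are just for the example.

    ```go
    package main

    import "fmt"

    func widen(a int32, b int64) int64 {
    	// return a + b // does not compile: mismatched types int32 and int64
    	return int64(a) + b // exact: every int32 value fits in int64
    }

    func narrow(b int64) int32 {
    	return int32(b) // may truncate, yet reads just like the safe widening
    }

    func main() {
    	fmt.Println(widen(1, 1<<40))   // 1099511627777
    	fmt.Println(narrow(1<<32 + 7)) // 7
    }
    ```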

    • That depends. But some sanitizers are cheap enough that you can usually run them all the time.

      Regarding infinite loops, C++ and C differ, with C++ being more aggressive. Compilers differ too, with clang being more aggressive. https://godbolt.org/z/Moe6zYKqo

      In general, I do not recommend using clang if you worry about UB. gcc is a bit more reasonable and also has better warnings.

  • > Errors due to unsigned wraparound are a much bigger issue

    This is a type design mistake. The unsigned integers should not wrap by default. It makes absolute sense, given all the constraints and the fact that it's doing New Jersey "implementation simplicity dominates" design, that K&R C only provides a wrapping unsigned type, but that's an excuse for K&R C, which is a 1970s programming language.

    The excuse gets shakier and shakier the further you move past that. C3 even named these types differently, so they're certainly under no obligation to provide the wrapping unsigned integers as if that's just magically what you mean. In most cases it's not what you mean. The excuse given in the article is way too thin.

    Rust's Wrapping<u32> is the same thing as the wrapping 32-bit unsigned integer in C or C++ today, but most people don't use it because they do not actually want the wrapping 32-bit unsigned integer. This is a "spelling matters" ergonomics class again like the choice to name the brutally fast but unstable general comparison sort [T]::sort_unstable whereas both C and C++ leave the noob who didn't know about sort stability to find out for themselves because they name this just "sort" and you get to keep both halves when you break things...
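
    As a rough illustration of the non-wrapping alternative, here is a sketch in Go (used only because it is the other language in this thread) of an addition that reports overflow instead of wrapping. checkedAdd is a hypothetical helper built on bits.Add32 from the standard math/bits package.

    ```go
    package main

    import (
    	"fmt"
    	"math/bits"
    )

    // checkedAdd returns the 32-bit sum and whether it fit without wrapping.
    // bits.Add32 returns the carry-out, which is nonzero exactly on overflow.
    func checkedAdd(a, b uint32) (uint32, bool) {
    	sum, carry := bits.Add32(a, b, 0)
    	return sum, carry == 0
    }

    func main() {
    	fmt.Println(checkedAdd(1, 2))          // 3 true
    	fmt.Println(checkedAdd(0xFFFFFFFF, 1)) // 0 false
    }
    ```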

    • Unsigned is certainly a misnomer for a wrapping type. That does not mean it is a type design mistake. And I agree that people should not use it much.

      But what I do not believe is that there is a real need for a non-wrapping non-negative integer type.
