Comment by poly2it
8 hours ago
I still don't understand how these arguments make sense for new code. Naturally, sizes should be unsigned because they represent values which cannot be negative. If you do pointer/size arithmetic, the only way to avoid overflows is to overflow-check and range-check before computation.
You cannot even check the sign of a signed size after the fact to detect an overflow, because signed overflow is undefined!
The remaining argument, from what I can tell, is that comparisons between signed and unsigned sizes are bug-prone. There is, however, a dedicated warning (`-Wsign-compare` in GCC and Clang) that catches this instantly.
It makes sense that you should be able to assign a pointer to a size. If the size is signed, this cannot be done due to its smaller capacity.
Given this, I can't understand the justification. I'm currently using unsigned sizes. If you have anything contradicting, please comment :^)
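To make "overflow-check and range-check before computation" concrete, here is a minimal sketch for unsigned sizes. The helper names `size_add_checked` and `size_mul_checked` are made up for illustration and are not from any library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a + b in *out and returns true,
   or returns false if the sum would wrap past SIZE_MAX. */
bool size_add_checked(size_t a, size_t b, size_t *out)
{
    if (a > SIZE_MAX - b)   /* checked before adding, so nothing wraps */
        return false;
    *out = a + b;
    return true;
}

/* Hypothetical helper: byte count for `count` elements of
   `elem_size` bytes each, or false if the product would wrap. */
bool size_mul_checked(size_t count, size_t elem_size, size_t *out)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return false;
    *out = count * elem_size;
    return true;
}
```

A caller would compute an allocation size with `size_mul_checked(n, sizeof(T), &bytes)` and only call `malloc(bytes)` when it returns true.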
C offers a different solution to the problem in Annex K of the standard. It provides a type `rsize_t`, which like `size_t` is unsigned, and has the same bit width, but where `RSIZE_MAX` is recommended to be `SIZE_MAX >> 1` or smaller. You perform bounds checking as `<= RSIZE_MAX` to ensure that a value used for indexing is not in the range that would be considered negative if converted to a signed integer. A negative value provided where `rsize_t` is expected would fail the check `<= RSIZE_MAX`.
IMO, this is a better approach than using signed types for indexing, but AFAIK, it's not included in GCC/glibc or gnulib. It's an optional extension and you're supposed to define `__STDC_WANT_LIB_EXT1__` to use it.
I don't know if any compiler actually supports it. It came from Microsoft and was submitted for standardization, but ISO made some changes from Microsoft's own implementation.
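Since Annex K may not be available, the idea can be sketched without it. `MY_RSIZE_MAX`, `rsize_ok` and `get_checked` below are hypothetical stand-ins, assuming the recommended bound of `SIZE_MAX >> 1`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for Annex K's RSIZE_MAX, using the
   recommended upper bound SIZE_MAX >> 1. */
#define MY_RSIZE_MAX (SIZE_MAX >> 1)

/* True if n is outside the half of the range that would read
   as negative when converted to a signed type of the same width. */
bool rsize_ok(size_t n)
{
    return n <= MY_RSIZE_MAX;
}

/* Hypothetical bounds-checked read: rejects suspicious sizes and
   indices first, then does the ordinary index < length check. */
bool get_checked(const int *array, size_t length, size_t index, int *out)
{
    if (!rsize_ok(length) || !rsize_ok(index) || index >= length)
        return false;
    *out = array[index];
    return true;
}
```

A negative `int` converted to `size_t` lands above `MY_RSIZE_MAX`, so `rsize_ok` rejects it even though the type itself is unsigned.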
https://www.open-std.org/JTC1/SC22/WG14/www/docs/n1173.pdf#p...
https://www.open-std.org/JTC1/SC22/WG14/www/docs/n1225.pdf
This is an interesting middle ground. As ncruces pointed out in a sibling comment, the sign bit in a pointer cannot be set without contradicting the ptrdiff_t type. That makes this seem like a reasonable approach to storing sizes.
"Naturally, sizes should be unsigned because they represent values which cannot be negative."
Unsigned types in C have modular arithmetic. I think they should be used exclusively when this is needed, or maybe if you absolutely need the full range.
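A small sketch of both sides of that point: a hash where wrapping modulo 2^32 is the intended behavior, and a size computation where the same wrapping is a trap. The function names are chosen for illustration; the constants are the standard 32-bit FNV-1a offset basis and prime.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: the multiplication is meant to wrap modulo 2^32,
   which is exactly what unsigned arithmetic guarantees. */
uint32_t fnv1a_32(const unsigned char *data, size_t len)
{
    uint32_t hash = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= 16777619u;            /* FNV prime, wraps by design */
    }
    return hash;
}

/* Here the wrap is unwanted: with len == 0 the result is
   SIZE_MAX, not -1, and nothing flags it at run time. */
size_t last_index_unchecked(size_t len)
{
    return len - 1;
}
```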
Pointer arithmetic that could overflow would probably involve a heap and therefore be less likely to require a relative, negative offset. Just use the addresses and errors you get from allocation.
Yes, but there are definitely cases where this doesn't apply, for example when deriving an offset from a user pointer. As such this is not a universal solution.
I don't know either.
int somearray[10];
new_ptr = somearray + signed_value;
or
element = somearray[signed_value];
this seems almost criminal to how my brain does logic/C code.
The only thing i could think of is this:
int *p = somearray + 10; p[-1] // same element as somearray[9] ??
if i'd see my CPU execute that i'd want it to please stop. I'd want my compiler to shout at me like a little child, and be mean until i do better.
-Wall -Wextra -Wpedantic <-- that should flag i think any of these weird practices.
As you stated tho, i'd be keen to learn why i am wrong!
In the implementation of something like a deque or merge sort, you could have a variable that represents offsets from pointers but which could sensibly be negative. C developers culturally aren't as particular about theoretical correctness of types as developers in some other languages - there's a lot of implicit conversion being used - so you'll typically see an `int` used for this. If you do wish to bring some rigidity to your type system, you may argue that this value is distinct from a general integer, which could be used for any arithmetic, and is definitely not just a pointer. So it should be a signed pointer difference, i.e. `ptrdiff_t`.
Arrays aren't the best example, since they are inherently about linear, scalar offsets, but you might see a negative offset from the start of a (decayed) array in the implementation of an allocator with clobber canaries before and after the data.
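A rough sketch of that allocator case, to show where the negative offset comes from. Everything here (`guarded_alloc`, `guarded_intact`, `guarded_free`, the guard byte and guard size) is a hypothetical illustration, not any real allocator's API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTE 0xA5u
/* Assumption: one max_align_t of guard keeps the user pointer aligned. */
#define GUARD_SIZE sizeof(max_align_t)

/* Layout: [front guard][n user bytes][back guard]. Returns the user pointer. */
void *guarded_alloc(size_t n)
{
    if (n > SIZE_MAX - 2 * GUARD_SIZE)   /* range-check before adding */
        return NULL;
    unsigned char *raw = malloc(n + 2 * GUARD_SIZE);
    if (raw == NULL)
        return NULL;
    memset(raw, GUARD_BYTE, GUARD_SIZE);
    memset(raw + GUARD_SIZE + n, GUARD_BYTE, GUARD_SIZE);
    return raw + GUARD_SIZE;
}

/* Returns 1 if both guards are untouched, 0 if either was clobbered. */
int guarded_intact(const void *user, size_t n)
{
    const unsigned char *p = user;
    /* Signed offset: the front guard lives at negative indices. */
    const ptrdiff_t front = -(ptrdiff_t)GUARD_SIZE;
    for (ptrdiff_t i = front; i < 0; i++)
        if (p[i] != GUARD_BYTE)
            return 0;
    for (size_t i = 0; i < GUARD_SIZE; i++)
        if (p[n + i] != GUARD_BYTE)
            return 0;
    return 1;
}

void guarded_free(void *user)
{
    if (user != NULL)
        free((unsigned char *)user - GUARD_SIZE);
}
```

Note the cast in `-(ptrdiff_t)GUARD_SIZE`: negating the `size_t` directly would wrap to a huge positive value instead of giving a negative offset.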
Any kind of relative/offset pointer requires negative pointer arithmetic. https://www.gingerbill.org/article/2020/05/17/relative-point...
I don't think you can make such a broad statement and be correct in all cases. Negative pointer arithmetic is not by itself a reason to use signed types, except if you are:
1. Certain your added value is negative.
2. Checking for underflows after computation, which you shouldn't.
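For the case where the direction is known in advance, a backward step can be sketched with an unsigned magnitude that is range-checked before the subtraction. `step_back` is a hypothetical helper and assumes `base` and `p` point into the same array with `base <= p`.

```c
#include <stddef.h>

/* Hypothetical helper: moves p back by n elements, or returns NULL
   if that would land before base. The check happens before the
   pointer arithmetic, so no out-of-range pointer is ever formed. */
const int *step_back(const int *base, const int *p, size_t n)
{
    size_t available = (size_t)(p - base);  /* assumes base <= p */
    if (n > available)
        return NULL;
    return p - n;
}
```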
The article was interesting.
> It makes sense that you should be able to assign a pointer to a size. If the size is signed, this cannot be done due to its smaller capacity.
Why?
By the definition of ptrdiff_t, ISTM the size of any object allocated by malloc cannot exceed the range of ptrdiff_t, so I'm not sure how you can have a useful size_t that uses the sign bit?
Stroustrup believes that signed should be preferred to unsigned even for values that can’t be less than zero: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p14...
I've of course read his argument before, and I think it might be more applicable to C++. I exclusively program in C, and in that regard, the relevant aspects as far as I can tell wouldn't be clearly in favour of a signed type. I also think his discussion on iterator signedness conflates improper bounds checking with the signedness of the size type. I cannot see what remains justifying a signed type other than "just because". I'm not sure it's applicable to C.
I also prefer signed types in C for sizes and indices. You can screen for overflow bugs easily using UBSan (or use it to prevent exploitation).
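As a sketch of what that looks like in practice, assuming sizes are stored as `ptrdiff_t`: the unchecked version relies on the sanitizer (GCC and Clang accept `-fsanitize=signed-integer-overflow`) to report the overflow at run time, while the checked version tests against `PTRDIFF_MAX` first. Both function names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unchecked: an overflowing sum is undefined behavior, which a
   build with -fsanitize=signed-integer-overflow reports at run time. */
ptrdiff_t total_unchecked(ptrdiff_t a, ptrdiff_t b)
{
    return a + b;
}

/* Checked before computing, so the overflowing addition is never
   evaluated. Assumes both inputs are meant to be non-negative sizes. */
bool total_checked(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out)
{
    if (a < 0 || b < 0 || a > PTRDIFF_MAX - b)
        return false;
    *out = a + b;
    return true;
}
```

With signed sizes, a negative input is itself a detectable error (`a < 0`), which is part of the appeal.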