Comment by adrian_b
21 hours ago
Indexing does that, but the indices must vary in a certain range, whose limits are frequently determined by something like "sizeof(array)/sizeof(element)", which yields an unsigned number.
This is especially inconvenient in C, which has extremely dangerous legacy implicit conversions between signed and unsigned integers that have a high probability of producing incorrect values.
Because the index is typically a signed integer, comparing it with an unsigned limit without explicit casts is likely to cause bugs. Explicitly casting the smaller unsigned integers to bigger signed integers yields correct code, but it is cumbersome.
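For example, a minimal sketch of the trap (names are made up; the point is only the implicit conversion in the comparison and the verbose cast needed to avoid it):

  #include <stdio.h>

  int main(void) {
      int data[8] = {0};
      size_t count = sizeof data / sizeof data[0];   /* unsigned size_t, value 8 */
      int i = -1;

      /* The usual arithmetic conversions turn i into a huge unsigned value,
         so the mathematically true comparison -1 < 8 is false here. */
      if (i < count)
          printf("-1 < 8: taken, as mathematics would suggest\n");
      else
          printf("-1 < 8: NOT taken, because -1 became %zu\n", (size_t)i);

      /* The cumbersome fix: convert the unsigned count to a wider signed
         type before comparing. */
      if (i < (long long)count)
          printf("with an explicit cast, -1 < 8 is true again\n");

      return 0;
  }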
These problems are avoided, as said in TFA, by making "sizeof" and the like yield 64-bit signed integer values instead of unsigned values.
Well-chosen implicit conversions are good for a programming language, because they reduce unnecessary verbosity, but the implicit integer conversions of C are just wrong; they are by far the worst mistake of C, much worse than any other C feature.
Other C features are criticized because they may be misused by inexperienced or careless programmers, but most of the implicit integer conversions are just incorrect. There is no way of using them correctly. Only the conversions from a smaller signed integer to a bigger signed integer are correct.
Mixed-signedness conversions have always been wrong, and the conversions between unsigned integers were made wrong by the change in the C standard that decided that unsigned integers are integer residues modulo 2^N rather than non-negative integers.
For modular integers, the only correct conversions are from bigger sizes to smaller sizes, i.e. the opposite of the implicit conversions of C. C's implicit conversions of unsigned numbers would have been correct for non-negative integers, but in the current C standard there are no such numbers.
The current C standard is inconsistent: "sizeof" is meant to be a non-negative integer, and the same is true of the conversions between unsigned numbers, yet all arithmetic operations with unsigned numbers are defined as operations with integer residues, not as operations with non-negative numbers.
The hardware of most processors implements at least 3 kinds of arithmetic operations: operations with signed integers, operations with non-negative integers and operations with integer residues.
Any decent programming language should define distinct types for these kinds of numbers; otherwise the only way to fully use the processor hardware is to use assembly language. Because C does not do this, you have to use at least inline assembly, if not separate assembly source files, to implement operations with big numbers.
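As a concrete sketch of the "big numbers" case: portable C gives no access to the carry flag, so you either reconstruct the carry by hand or reach for inline assembly or a compiler extension such as GCC/Clang's __builtin_add_overflow (used below; not portable C):

  #include <stdint.h>
  #include <stdio.h>

  /* Add two 128-bit numbers stored as two 64-bit limbs each.
     Plain C unsigned arithmetic gives only the low 64 bits of each limb sum;
     the carry must be recovered separately. */
  static void add128(const uint64_t a[2], const uint64_t b[2], uint64_t r[2]) {
      /* __builtin_add_overflow is a GCC/Clang extension; it returns the carry. */
      uint64_t carry = __builtin_add_overflow(a[0], b[0], &r[0]);
      r[1] = a[1] + b[1] + carry;
  }

  int main(void) {
      uint64_t a[2] = { UINT64_MAX, 0 };   /* low limb all ones */
      uint64_t b[2] = { 1, 0 };
      uint64_t r[2];
      add128(a, b, r);
      printf("low=%llu high=%llu\n",
             (unsigned long long)r[0], (unsigned long long)r[1]);  /* low=0 high=1 */
      return 0;
  }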
Not sure what change in the C standard you mean. unsigned was always modulo. Otherwise, use -Wsign-conversion.
Nope.
It was undefined what happens on unsigned overflow and underflow. Therefore a compiler could choose to implement "unsigned" either as non-negative numbers or as integer residues.
The fact that "sizeof" is unsigned, together with the implicit conversions between "unsigned" numbers, is consistent only with non-negative numbers. Therefore the undefined behavior should have been defined correspondingly.
Instead, in some version of the standard (I am too lazy to search for it now, but it might have been C99), the behavior was changed from undefined to defined as that of integer residues.
I do not know the reason for this choice; it may have been just laziness, because it is easier to implement in compilers and it leads to maximum performance in the absence of bugs. In any case this decision has broken the standard, because the arithmetic operations have become incompatible with the implicit conversions between "unsigned" types and with the semantics of "sizeof", which must be non-negative.
For non-negative numbers, the correct conversions are from smaller sizes to bigger sizes, while for integer residues the correct conversions are only in the opposite direction, from bigger sizes to smaller sizes (e.g. a number that is 257 modulo 65536 is also 1 modulo 256, so truncating it yields a correct value, while a number that is 1 modulo 256 could be 1, 257, 513, 769, etc. modulo 65536, so you cannot extend it without additional information).
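A small illustration of this direction argument, using 8-bit and 16-bit unsigned types:

  #include <stdint.h>
  #include <stdio.h>

  int main(void) {
      /* Truncation is always consistent with modular arithmetic:
         257 mod 65536 maps to 257 mod 256 == 1. */
      uint16_t wide = 257;
      uint8_t narrow = (uint8_t)wide;          /* 1 */
      printf("257 truncated to 8 bits: %u\n", (unsigned)narrow);

      /* Widening a residue is ambiguous: a value of 1 mod 256 could have
         come from 1, 257, 513, 769, ... mod 65536.  C's implicit conversion
         simply picks 1, which is justified only if the value is read as a
         non-negative integer, not as a residue class. */
      uint8_t small = 1;
      uint16_t widened = small;                /* 1 */
      printf("1 widened to 16 bits: %u\n", (unsigned)widened);

      return 0;
  }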
Judging from the implicit conversions, it is clear that the intention of the designers of C during the seventies was that "unsigned" numbers be non-negative integers, not integer residues. The modern C standard is guilty of the current inconsistencies, which greatly increase the chances of bugs.
My copy of K&R already has unsigned modulo arithmetic: "unsigned numbers are always positive or zero, and obey the laws of arithmetic modulo 2^n, where n is the number of bits in the type." So if it changed, it was before that, but I don't think so.
I get your argument about the conversion order, but I do not buy it in terms of language design. You also do not want to go to a quotient ring implicitly, so I do not agree that this conversion direction would be more "correct" for implicit conversion either, and from a practical point of view the C design is defensible.
I think the motivation originally was merely to expose the common capabilities of the hardware, nothing more. What we miss from this perspective are polynomials over F_2, but nobody has pushed for this too hard so far.