Comment by addaon
15 hours ago
> "these two loops compute the same value"
At what sequence point? The branchless version writes to small_numbers[smlen], for any given value of smlen, potentially more than once, so there are points in time during the loop where the contents of the two arrays differ. But after the loop, both hold the result of the final write to small_numbers[i] for all 0 <= i < smlen. The transient writes don't change externally observable behavior, and are apparently cheaper than fewer but conditional writes.
I think the small_numbers array would differ after the end of the loop if, for instance, numbers contained only numbers >= 500. Am I wrong?
smlen would be 0 for both if there are no small numbers, so the end result of both is an empty array.
For the first version, small_numbers[0] will contain an arbitrary value at the end, and for the second version it happens to contain the last number read, but that slot lies outside the 0-length array being returned.
The point is that you should look only at the first smlen entries, and smlen would be 0 in this case.
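For anyone who wants to poke at this, here is a minimal sketch of the two loops as I understand them from the thread. The article's actual code isn't quoted here, so the function names (filter_branchy, filter_branchless, same_result, demo) and the exact loop bodies are my assumptions; only numbers, small_numbers, smlen and the 500 threshold come from the discussion. It compares just the first smlen entries, which is the only part both versions promise to agree on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed branching version: writes only when the element is small. */
static size_t filter_branchy(const int *numbers, size_t n, int *small_numbers) {
    size_t smlen = 0;
    for (size_t i = 0; i < n; i++) {
        if (numbers[i] < 500) {
            small_numbers[smlen++] = numbers[i];
        }
    }
    return smlen;
}

/* Assumed branchless version: always writes to small_numbers[smlen],
   but advances smlen only when the element is small. The output buffer
   needs room for n entries, since the write index can reach n - 1. */
static size_t filter_branchless(const int *numbers, size_t n, int *small_numbers) {
    size_t smlen = 0;
    for (size_t i = 0; i < n; i++) {
        small_numbers[smlen] = numbers[i];
        smlen += (size_t)(numbers[i] < 500);
    }
    return smlen;
}

/* Hypothetical helper: nonzero when both versions agree on smlen and on
   the first smlen entries. Slots at index smlen and beyond are ignored. */
static int same_result(const int *numbers, size_t n, int *a, int *b) {
    size_t la = filter_branchy(numbers, n, a);
    size_t lb = filter_branchless(numbers, n, b);
    return la == lb && memcmp(a, b, la * sizeof *a) == 0;
}

/* The "only numbers >= 500" case from the thread. */
static void demo(void) {
    int numbers[] = {500, 900, 700};
    int a[3] = {-1, -1, -1};
    int b[3] = {-1, -1, -1};

    assert(same_result(numbers, 3, a, b)); /* both report smlen == 0 */
    assert(a[0] == -1);   /* branching version never touched slot 0 */
    assert(b[0] == 700);  /* branchless version left the last number read */
}
```

So the raw buffers do differ at small_numbers[0] in that case, but since smlen is 0 that slot is not part of the result either version returns.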