
Comment by wizzwizz4

12 hours ago

No, the CPU doesn't have a special pointer value which is designated invalid (except insofar as modern address spaces are so large that you cannot possibly map memory at every address without mirroring). In many OSs, e.g. CP/M, address 0 is actually meaningful. The C idiom of cramming sum-type semantics into the nooks and crannies of a return value that ordinarily means something entirely different is an extremely poor one, and null pointers are the poster child: Tony Hoare's billion-dollar mistake.

It's absolutely fine to have a packed representation of a sum type "under the hood": this is how Rust implements Option<&T> (where T: Thin), for example. It's also fine to expose the layout of this packed representation to the programmer, as C's union does. But it's a huge footgun to have unchecked casts as the default. If not for this terrible convention, C wouldn't have any unchecked implicit casts: something like f(1 + 0.5) performs a coercion, a far more sensible behaviour.
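
For instance, here is a minimal sketch of that coercion (f is a hypothetical function): the int operand is converted to double by C's usual arithmetic conversions before the call, so no bit pattern is ever silently reinterpreted.

    #include <stdio.h>

    static void f(double x) {
        printf("%f\n", x);
    }

    int main(void) {
        /* 1 (an int) is coerced to 1.0 (a double) before the addition,
         * so f receives 1.5. The conversion is driven by the operand
         * types and preserves the value, unlike a pointer whose bit
         * pattern is made to double as a "no value" flag. */
        f(1 + 0.5);
        return 0;
    }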

The only reason we're talking about null pointers at all is because they were an influential idea, not because they were a good idea. Likewise with the essay.

While it's narrowly true that CPU instruction sets generally don't have a null-pointer concept, I'm not sure how important that is: the null pointer seems to have been (I don't know enough to be sure) a well-established idiom in assembly programming that carried across naturally to BCPL and C. (In much the same way that record types were, apparently, a common assembly idiom long before they became particularly normal to have in HLLs.) Programmers like being able to null out a pointer field, 0 is an obvious "joker" value, and jump-if-0 instructions tend to be convenient and fast. Whether or not you'd want to say it's "how the hardware works", it does seem to have a certain character of inevitability. Even if the Bell Labs guys had disapproved of the idiom, they would likely have had difficulty keeping it out of other people's C programs once C became popular. The Hoare ALGOL W thing seems more relevant to null pointers in Java and the like.

  • > Programmers like being able to null out a pointer field, 0 is an obvious "joker" value, and jump-if-0 instructions tend to be convenient and fast.

    And there's nothing wrong with that! But you should write it

      union {
        char *ptr;
        size_t scalar;
      } my_nullable_pointer;
      if (my_nullable_pointer.scalar) {
        printf("%s", my_nullable_pointer.ptr);
      }
    

    not:

      char *my_nullable_pointer;
      if (my_nullable_pointer) {
        printf("%s", my_nullable_pointer);
      }
    

    Yes, this takes up more space, but it also makes the meaning of the code clearer. typedef in a header can bring this down to four extra lines per pointer type in the entire program. Add a macro, and it's five extra lines plus one extra line per pointer type. Put this in the standard library, and the programmer has to type a few extra characters – in exchange for it becoming extremely obvious (to an experienced programmer, or a quick-and-dirty linter) when someone's introduced a null pointer dereference, and when a flawed design makes null pointer dereferences inevitable.
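
    A sketch of what that header might look like (the NULLABLE macro and nullable_string names are hypothetical, and this assumes a platform where the null pointer is all-bits-zero, which the C standard does not strictly guarantee):

      #include <stdint.h>
      #include <stdio.h>

      /* Defined once, in a header: */
      #define NULLABLE(T)   \
        union {             \
          T ptr;            \
          uintptr_t scalar; \
        }

      /* One extra line per pointer type in the program: */
      typedef NULLABLE(char *) nullable_string;

      int main(void) {
        nullable_string s = { .ptr = "hello" };
        if (s.scalar) {      /* explicit: we are testing the scalar view */
          printf("%s\n", s.ptr);
        }
        return 0;
      }

    uintptr_t is used here rather than size_t, since it's the integer type designed to hold pointer values; size_t merely happens to match on common platforms.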

    > The Hoare ALGOL W thing seems to be more relevant to null pointers in Java and the like.

    I believe you are correct; but I like blaming Tony Hoare for things. He keeps scooping me: I come up with something cool, and then Tony Hoare goes and takes credit for it 50 years in the past. Who does he think he is, Euler?

> No, the CPU doesn't have a special pointer value which is designated invalid

Sort of right, sort of wrong.

From my understanding: older, simpler architectures treat memory location zero as a normal memory address. On x86 and x64, the OS can configure the MMU to treat certain pages as invalid. Many years ago, I ran across a reference to SPARCs treating accesses to memory location zero as invalid. In other words, it depends upon which architecture you're dealing with.
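
On Linux, for example, whether anything may be mapped at address 0 is kernel policy (the vm.mmap_min_addr sysctl), not a hardware rule. A minimal probe, assuming a Linux system with default settings:

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void) {
        /* Ask the kernel for one page at address 0. Under the default
         * vm.mmap_min_addr this fails with EPERM: the low mapping is
         * forbidden by OS policy, not by the CPU. */
        void *p = mmap((void *)0, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap at address 0");
        } else {
            printf("mapped a page at %p; address 0 is usable here\n", p);
            munmap(p, 4096);
        }
        return 0;
    }

Set vm.mmap_min_addr to 0 (root required) and the same program succeeds, which is the sense in which an "invalid" address 0 is an OS choice on these machines.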

  • The 68000 series fetched its initial (boot) stack pointer from address 0, and its initial program counter from address 4. That meant that those locations had to be in ROM, which meant that they were not writable. But addresses 8 through 1K were the interrupt vector table, and those did have to be writable.

    This led to strange hardware implementations like "0 and 4 are aliased to 0x800000 and 0x800004 (or wherever the ROM is) until a latch is cleared, after which they map to the real memory at 0" - with the latch being cleared fairly early in the boot process. This let you create a different entry point for soft and hard boot, if you wanted.

    In that implementation, you could read and write to 0, once the latch was cleared.

    Or you could have an implementation where 0 and 4 always pointed to ROM, in which case you could not have a different entry point for soft boot, and you could never write to 0.

  • Skimming appendix H of https://courses.grainger.illinois.edu/cs423/sp2011/lectures/..., I can't see any special treatment of the zero page, but https://stackoverflow.com/a/22847758/5223757 contains an anecdote about SPARCs not placing a page of zeroes at that address. I suspect that's an OS restriction: they presumably considered it safer to modify the in-house software they understood than to tinker with the externally-sourced OS's memory-management routines. But the anecdote is weak evidence that it might have been a hardware distinction at one point.