
Comment by af78

2 days ago

One I can think of is simplicity. No need to worry about what the type of the string length should be (size_t?) or where it should be stored. Just pass around a pointer. Pointers fit the size of a CPU register most of the time. Though in my opinion the drawbacks (O(N) length computation, NUL forbidden in the data, etc.) outweigh this benefit, we are stuck. Many kernel interfaces like open, getdents etc. assume NUL-terminated strings, therefore any low-level language or library has to support them.

But (i32 length, byte[] data) is as complex as (byte[] data, '\0'); it's two parts either way. Of course the terminator potentially allows for very long strings at the cost of just a single byte. Besides the rarity of such a case, the "space savings" might play a role on a PDP-11, or on a Z80, but not on any of the modern architectures that need structures aligned to 32- or even 64-bit boundaries. The efficiency and security costs far outweigh any savings in space or simplicity (heh) of processing.

Null-terminated strings are the other billion-dollar mistake, along with the original NULL.

  • Arrays as glorified pointers were the mistake. Null-terminated strings are a natural result of that design choice.

    Null pointers, however, were not a mistake, despite how popular slandering them has become. A reasonable case can be made that any modern language should enforce null checks (and bounds checks, and ...) or at least provide them by default, but that is neither here nor there as far as C is concerned.

    • Tony Hoare himself called NULL a mistake. But the problem is not in the ability to set a pointer to a null value, of course. The problem is that all pointers are nullable, and there's no way to statically enforce their being non-null. I wonder how feasible data flow analysis would have been in 1969 though.