Comment by imtringued

2 years ago

> Correct string handling means that you always know how long your strings are

Well, I couldn't think of a stronger argument against NUL-terminated strings than this. After all, NUL-terminated strings make no guarantee of having a finite length. Nothing prevents you from building a memory-mapped string that is generated on demand and never ends.

Except that's a non-sequitur because you can totally keep separate string length fields.

The only NUL that C requires is the NUL following C string literals, and you can even easily define char-arrays without NUL.

    char buf[5] = "Hello";

or even

    #define DEFINE_NONZ_STRING(name, lit) char name[sizeof lit - 1] = lit "";
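
A minimal sketch of the two definitions above in use. It is valid in C only (C++ rejects a string initializer whose terminator does not fit). The helpers `nonz_equals` and `nonz_demo` are names made up for illustration, and the macro is restated without its trailing semicolon so the caller supplies it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Variant of the macro above without the trailing semicolon.
   The "" still forces lit to be a string literal. */
#define DEFINE_NONZ_STRING(name, lit) char name[sizeof(lit) - 1] = lit ""

/* Hypothetical helper: compare an unterminated buffer of known length
   against a NUL-terminated expectation. strlen() is only ever applied
   to `expected`, never to `buf`. */
static int nonz_equals(const char *buf, size_t len, const char *expected)
{
    return strlen(expected) == len && memcmp(buf, expected, len) == 0;
}

/* Returns 1 if both arrays hold exactly their text and nothing more. */
static int nonz_demo(void)
{
    char buf[5] = "Hello";                    /* 5 bytes, terminator dropped */
    DEFINE_NONZ_STRING(greeting, "Hi there"); /* 8 bytes, terminator dropped */

    assert(sizeof buf == 5);                  /* length comes from the type */
    assert(sizeof greeting == 8);
    return nonz_equals(buf, sizeof buf, "Hello")
        && nonz_equals(greeting, sizeof greeting, "Hi there");
}
```

Since neither array is terminated, passing `buf` or `greeting` to `strlen()` or to `printf("%s")` would be undefined behaviour; the length has to travel with the array, here via `sizeof`.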

Can also easily build pointer + length representations, without even a runtime strlen() cost.

    struct String { const char *buf; int len; };
    #define STRING(lit) ((struct String) { (lit ""), (int)(sizeof lit - 1) })
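
A hedged sketch of how such a pointer + length type might be used. It assumes `size_t` for the length instead of `int` (my substitution, to sidestep the signedness question), and `string_eq` and `string_prefix` are hypothetical helpers, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pointer + length view of a string; no NUL terminator is required. */
struct String { const char *buf; size_t len; };

/* Length is computed by sizeof at compile time, so there is no
   runtime strlen(). The "" makes non-literal arguments fail to compile. */
#define STRING(lit) ((struct String){ (lit ""), sizeof(lit) - 1 })

/* Hypothetical helper: equality using the stored lengths. */
static int string_eq(struct String a, struct String b)
{
    return a.len == b.len && memcmp(a.buf, b.buf, a.len) == 0;
}

/* Hypothetical helper: view of the first n bytes. Nothing is copied
   and nothing is written, which NUL termination could not offer
   without modifying or duplicating the buffer. */
static struct String string_prefix(struct String s, size_t n)
{
    if (n > s.len)
        n = s.len;
    return (struct String){ s.buf, n };
}
```

For example, `string_prefix(STRING("Hello, world"), 5)` compares equal to `STRING("Hello")` under `string_eq`, even though the byte after the prefix is a comma rather than a NUL.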

  • What do you do when the strings might have more than INT_MAX characters?

    • What will you do on your 200th birthday?

      In case you're more interested in theory than practice, I have a different answer: I use a different API.

      However, I'm aware not even that could stop you, because you could still ask "what do you do when the strings might have more than SIZE_MAX characters?", which is entirely possible (as a combination of 2 or more strings).

      And to answer that, we're coming back to my original answer: It doesn't happen. I'm not calling the API with such huge strings. (And no, I usually don't keep formal proofs that it couldn't happen -- there are also an infinite number of other properties that I don't verify).

    • INT_MAX is often far less than SIZE_MAX (the former is usually the max of a signed 32-bit integer, the latter of an unsigned 64-bit integer), so usually nothing special.

    • If that is ever a possible issue, you switch the implementation to use two pointers.

          struct String { const char *buf; const char *buf_end; };

  • What's the point of the empty string literal "" ?

    • It's a poor man's assertion that "lit" is indeed a string literal (so that we can get the string length using sizeof) and not a variable (of pointer type, where sizeof returns the size of the pointer, not of the string buffer). If you pass a variable instead of a string literal, you get a syntax error.
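
To make the `""` trick from this thread concrete, here is a small sketch. `LIT_LEN` and `lit_len_demo` are names invented for this example. Adjacent string literals are concatenated by the compiler, so `"Hello" ""` is just `"Hello"`, whereas an identifier followed by `""` does not parse.

```c
#include <assert.h>
#include <stddef.h>

/* Length of a string literal, excluding the terminating NUL.
   The "" is the guard: it only parses next to another literal. */
#define LIT_LEN(lit) (sizeof(lit "") - 1)

static size_t lit_len_demo(void)
{
    const char *p = "Hello";

    /* Without the guard, sizeof on a pointer silently yields the
       pointer size, not the string length. */
    assert(sizeof p == sizeof(char *));

    /* LIT_LEN(p) would expand to (sizeof(p "") - 1), which is a
       syntax error, so the mistake is caught at compile time. */
    return LIT_LEN("Hello");
}
```

The guard does not help with `char` arrays that are not literals; those also fail to compile with `LIT_LEN`, even though `sizeof` on them would have given a meaningful size.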