Comment by moth-fuzz

10 hours ago

I'm a huge fan of the 'parse, don't validate' idiom, but it feels like a bit of a hurdle to use it in C. To really encapsulate and avoid errors, you'd need opaque pointers to hidden types, which requires malloc (or a per-type object pool or some other scaffolding that would get quite repetitive after a while, but I digress).
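
For concreteness, here is a minimal sketch of the opaque-pointer-plus-malloc version described above. The names `name_parse`, `name_str` and `name_free` are hypothetical, and "a valid name is a non-empty string" is an assumed rule picked only to give the parser something to reject.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- name.h (public): callers only ever see an incomplete type ---- */
typedef struct Name Name;
Name *name_parse(const char *raw);   /* NULL if raw is invalid or on OOM */
const char *name_str(const Name *n);
void name_free(Name *n);

/* ---- name.c (private): the layout is defined here and nowhere else ---- */
struct Name {
    char *text;
};

Name *name_parse(const char *raw) {
    /* Assumed validity rule: non-NULL and non-empty. */
    if (raw == NULL || raw[0] == '\0') {
        return NULL;
    }
    Name *n = malloc(sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    size_t len = strlen(raw) + 1;    /* include the terminator */
    n->text = malloc(len);
    if (n->text == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->text, raw, len);
    return n;
}

const char *name_str(const Name *n) { return n->text; }

void name_free(Name *n) {
    if (n != NULL) {
        free(n->text);
        free(n);
    }
}
```

With the two halves split across a header and a source file, user code cannot write `struct Name x;` at all, so the only way to get a `Name *` is through `name_parse`. The cost is the pair of heap allocations per value.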

You basically have to trade performance for correctness, whereas in a language like C++, that's the whole purpose of the constructor, which works for all kinds of memory: auto, static, dynamic, whatever.

In C, to initialize a struct without dynamic memory, you could always do the following:

    struct Name {
        const char *name;
    };

    int parse_name(const char *name, struct Name *ret) {
        if(name) {
            ret->name = name;
            return 1;
        } else {
            return 0;
        }
    }

    //in user code, *hopefully*...
    struct Name myname;
    parse_name("mothfuzz", &myname);

But then anyone could just instantiate an invalid Name without calling the parse_name function and pass it around wherever. This is very close to 'validation' type behaviour. So to get real 'parsing' behaviour, dynamic memory is required, which is off-limits for many of the kinds of projects one would use C for in the first place.
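
To make the hole explicit, here is a small sketch that reuses the `struct Name` and `parse_name` from above. `name_is_usable` and `make_unparsed_name` are hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct Name {
    const char *name;
};

int parse_name(const char *name, struct Name *ret) {
    if (name) {
        ret->name = name;
        return 1;
    }
    return 0;
}

/* Hypothetical check standing in for "the invariant holds". */
int name_is_usable(const struct Name *n) {
    return n->name != NULL;
}

/* Nothing stops a caller from skipping parse_name entirely:
 * the struct is public, so this compiles without complaint. */
struct Name make_unparsed_name(void) {
    struct Name bogus = { NULL };
    return bogus;
}
```

The type alone therefore proves nothing: a function that receives a `struct Name` still has to re-check it, which is exactly the validation the idiom is supposed to remove.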

I'm very curious as to how the author resolves this, given that they say they don't use dynamic memory often. Maybe there's something I missed while reading.

If you don't want your types to be public, don't put them in the public interface; put them in the implementation.
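
A rough sketch of what that can look like without malloc, assuming a small fixed-size pool is acceptable. Every name and limit here (`name_parse`, `name_release`, `NAME_MAX_LEN`, `NAME_POOL_SIZE`) is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- name.h (public) ---- */
typedef struct Name Name;
Name *name_parse(const char *raw);   /* NULL on bad input or a full pool */
const char *name_str(const Name *n);
void name_release(Name *n);

/* ---- name.c (private) ---- */
enum { NAME_MAX_LEN = 31, NAME_POOL_SIZE = 8 };  /* arbitrary limits */

struct Name {
    int in_use;
    char text[NAME_MAX_LEN + 1];
};

static Name pool[NAME_POOL_SIZE];    /* static storage, zero-initialized */

Name *name_parse(const char *raw) {
    if (raw == NULL) {
        return NULL;
    }
    size_t len = strlen(raw);
    if (len == 0 || len > NAME_MAX_LEN) {
        return NULL;                 /* assumed rule: 1..NAME_MAX_LEN chars */
    }
    for (size_t i = 0; i < NAME_POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            memcpy(pool[i].text, raw, len + 1);
            return &pool[i];
        }
    }
    return NULL;                     /* pool exhausted */
}

const char *name_str(const Name *n) { return n->text; }

void name_release(Name *n) {
    if (n != NULL) {
        n->in_use = 0;
    }
}
```

This is the per-type scaffolding the original comment complains about: it keeps the type hidden and avoids the heap, but the pool size is fixed at build time and the code is not thread-safe as written.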

You can play tricks if you’re willing to compromise on the ABI:

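    #include <stddef.h>  /* size_t */
    #include <alloca.h>  /* alloca; non-standard, header varies by platform */

    typedef struct foo_ foo;
    enum { FOO_SIZE = 64 };
    foo *foo_init(void *p, size_t sz);
    void foo_destroy(foo *p);
    /* A macro, so that alloca runs in the caller's stack frame. */
    #define FOO_ALLOCA() \
      foo_init(alloca(FOO_SIZE), FOO_SIZE)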

Implementation (size checks, etc. elided):

    #include <stdint.h>  /* uint32_t */

    struct foo_ {
        uint32_t magic;
        uint32_t val;
    };

    foo *foo_init(void *p, size_t sz) {
        foo *f = (foo *)p;
        f->magic = 1234;
        f->val = 0;
        return f;
    }

Caller:

    foo *f = FOO_ALLOCA();
    // Can’t see inside
    // APIs validate magic
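
One way the elided checks might be filled in, as a sketch rather than the poster's actual code. `FOO_MAGIC` and the accessor `foo_get` are hypothetical, and the example takes a caller-supplied buffer so it does not depend on alloca.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- public side, same shape as above ---- */
typedef struct foo_ foo;
enum { FOO_SIZE = 64 };
foo *foo_init(void *p, size_t sz);
int foo_get(const foo *f, uint32_t *out);   /* hypothetical accessor */

/* ---- private side ---- */
enum { FOO_MAGIC = 1234 };

struct foo_ {
    uint32_t magic;
    uint32_t val;
};

/* Fail the build if the real struct outgrows the advertised size (C11). */
_Static_assert(sizeof(struct foo_) <= FOO_SIZE, "FOO_SIZE too small");

foo *foo_init(void *p, size_t sz) {
    /* Size check: refuse buffers that cannot hold the struct. */
    if (p == NULL || sz < sizeof(struct foo_)) {
        return NULL;
    }
    foo *f = p;
    f->magic = FOO_MAGIC;
    f->val = 0;
    return f;
}

int foo_get(const foo *f, uint32_t *out) {
    /* Magic check: reject storage that foo_init never touched. */
    if (f == NULL || f->magic != FOO_MAGIC) {
        return 0;
    }
    *out = f->val;
    return 1;
}
```

The magic check is only a best-effort guard, since uninitialized memory could contain 1234 by chance. The caller's buffer also has to be suitably aligned for `struct foo_`, and `FOO_SIZE` becomes part of the ABI, which is the compromise mentioned above.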

> But then anyone could just instantiate an invalid Name without calling the parse_name function and pass it around wherever

This is nothing new in C. This problem has always existed by virtue of all struct members being public. Generally, programmers know to search the header file / documentation for constructor functions, instead of doing raw struct instantiation. Don't underestimate how well good documentation can drive correct programming choices.

C++ is worse in this regard, as constructors don't really allow this pattern, since they can't return a None / false. The alternative is to throw an exception, which needs runtime support, much as malloc does.

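  • In C++ you can do:

        #include <iostream>
        #include <optional>

        constexpr int SENTINEL_VALUE = -1;  // placeholder; any reserved value works

        struct Foo {
        private:
          int val = 0;
          Foo(int newVal) : val(newVal) {}
        public:
          int value() const { return val; }  // val is private, so expose a getter
          static std::optional<Foo> CreateFoo(int newVal) {
            if (newVal != SENTINEL_VALUE) { return Foo(newVal); }
            return {};
          }
        };

        int main(int argc, char* argv[]) {
          if (auto f = Foo::CreateFoo(argc)) {
            std::cout << "Foo made with value " << f->value();
          } else {
            std::cout << "Foo not made";
          }
        }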

  • In C++ you would have a protected constructor and related friend utility class to do the parsing, returning any error code, and constructing the thing, populating an optional, shared_ptr, whatever… don’t make constructors fallible.