← Back to context

Comment by alexfoo

5 hours ago

Exactly. A wrapper that handles all of the edge cases properly and gives proper reporting just gets added to your own library of functions and the devs get used to using it. Much like the code for abstract data types like lists/hashmaps/etc which neither C nor the standard libraries provide.

Bonus points for having bespoke linting rules to point out the use of known “bad” functions.

In one old project we went through and replaced all instances of sprintf() with snprintf() or equivalent. Once we were happy that we’d got every occurrence we could then add lint rules to flag up any new use of sprintf() so that devs didn’t introduce new possible problems into the code.

(Obviously you can still introduce plenty of problems with snprintf() but we learned to give that more scrutiny.)

While snprintf() is better than sprintf(), I find that it's easy for people to not check if the return value is bigger than the provided size. Sure, it prevents a buffer overflow, but there could still be a string truncation problem.

Similar to how strlcpy() is not a slam dunk fix to the strcpy() problem.

  • That's partly the point.

    If someone uses sprintf() you have to go faffing around to check whether they've thought about the destination buffer size. The size of the structure may be buried far away through several layers of other APIs/etc.

    Using snprintf() doesn't solve this in any way, but checking whether the new use of snprintf() checks the return value is relatively simple. Again, there's still no guarantee that there aren't other problems with snprintf() but, in our experience, we found that once people were forced to use it over sprintf() and had things checked in PR reviews we found that the number of instances of misuse dropped dramatically.

    It wasn't the switch of functions that reduced the number of problems we saw, but the outright banning of the known footgun `sprintf()` and the careful auditing and replacement of it with `snprintf()` that served as a whole load of reference copies for how to use it. We spread the work of replacing `sprintf()` around the team so that everyone got to do some of the switches and everyone got to review the changes. And we found a whole load of possible problems (most of which were very unlikely to ever lead to a crash or corruption.)

    The same would apply if you picked any other known footgun and did similar refactoring/rewrites/auditing/etc.

    Anyway, I haven't done C commercially/professionally for about 5 years now. I do miss it though.

> like lists/hashmaps/etc which neither C nor the standard libraries provide

There is a hashmap implementation though: https://man7.org/linux/man-pages/man3/hsearch.3.html

  • “One hashmap for your entire program” is not generally what people mean when they want a hashmap.

    • > The three functions hcreate_r(), hsearch_r(), hdestroy_r() are reentrant versions that allow a program to use more than one hash search table at the same time.

  • Sure there's an implementation, but like the integer comparison functions that sparked this thread there are some severe limitations with the implementation.

    (In fact, looking at it again, I assume I'd purposely purged it from my memory given how terrible it is.)

    The non-extensible nature is the biggest one. There are plenty of times when the maximum number of elements needed to be stored will be known in advance. (See the note about hcreate().)

    Secondly the hserach() implementation requires the keys to be NUL terminated strings since "the same key" is determined using strcmp(). Good luck if you want to use a number, pointer, arbitrary structure or anything else as a key.

    Any reasonable hash table implementation would not have either of these limitations.

    Maybe I needed to say:

    > > like lists/hashmaps/etc which neither C nor the standard libraries provide

    ... reasonable implementations of.