Comment by kazinator

3 years ago

He's not wrong. The main reason to have these functions is that other implementations have them, programs use them, and those programs have to define the functions themselves when ported to glibc.

One benefit of defining strlcpy yourself is that you can define it as a macro that expands to an open-coded call to snprintf, which GCC then diagnoses: you may get static warnings about possible truncation. (I suspect GCC might not yet be analyzing strlcpy/strlcat calls, but that could change.)

The functions silently discard data in order to achieve memory safety. Historically, that has been viewed as acceptable in C coding culture. There are situations in which that is okay, like truncating some unimportant log message to "only" 1024 characters.

Truncation can also create an exploitable security hole: for example, some syntax is cut off so that its closing brace is missing, and the attacker is somehow able to complete it maliciously.
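
A minimal sketch of how that can happen, and how the snprintf return value exposes it. The helper name format_name_json and the JSON-like syntax are made up for illustration; they are not from any real program:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats {"name":"<name>"} into out.
   Returns 1 if the whole object fit, 0 if snprintf had to truncate it
   (or failed). On truncation, out holds a prefix of the object, so the
   closing quote and brace are gone. */
static int format_name_json(char *out, size_t size, const char *name)
{
  int n = snprintf(out, size, "{\"name\":\"%s\"}", name);
  return n >= 0 && (size_t) n < size;
}
```

With a 16-byte buffer, the name "bob" fits, but "bobby tables" leaves the buffer holding an object with no closing brace. A caller that ignores the return value passes that fragment along as if it were complete.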

Even when arbitrary limits are acceptable, a low-level copying function may not be the best place in the program to enforce them, least of all silently. If the truncation is caused by some excessively long input, maybe that input should be validated close to where it comes into the program, and rejected. E.g. don't let the user input some 500-character field, pretend you're saving it, and then have them find out the next day that only 255 characters of it got saved.
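
A rough sketch of rejecting at the boundary instead of truncating later. FIELD_MAX and accept_field are hypothetical names, and the 255 limit is just the number from the example above:

```c
#include <stddef.h>
#include <string.h>

enum { FIELD_MAX = 255 };  /* assumed storage limit for the field */

/* Hypothetical boundary check. Returns 0 and copies the field if it fits;
   returns -1 and leaves dst untouched if the input is too long, so the
   caller can report the problem to the user right away. */
static int accept_field(char dst[FIELD_MAX + 1], const char *input)
{
  size_t len = strlen(input);

  if (len > FIELD_MAX)
    return -1;                  /* reject instead of truncating */

  memcpy(dst, input, len + 1);  /* length already checked; copies the NUL */
  return 0;
}
```

Once the input has passed this check, the code further down never needs a truncating copy for that field.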

Even if in my program I find it useful to have a truncating copying function, I don't necessarily want it to be silent when truncation occurs. Maybe in that particular program, I want to abort the program with a diagnostic message. I can then pass large texts in the unit and integration tests, to find the places in the program that have inflexible text handling, but are being reached by unchecked large inputs.
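
Something along these lines is what I have in mind. copy_or_die is a made-up name, and whether to abort, log, or return an error is a per-program decision:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical strict copy with the same argument order as strlcpy.
   Instead of silently truncating, it treats a source string that does
   not fit as a program error: it prints a diagnostic and aborts.
   Returns the length of the copied string on success. */
static size_t copy_or_die(char *dst, const char *src, size_t size)
{
  size_t len = strlen(src);

  if (len >= size) {  /* no room for the string plus its terminating NUL */
    fprintf(stderr,
            "copy_or_die: string of length %zu does not fit in %zu bytes\n",
            len, size);
    abort();
  }

  memcpy(dst, src, len + 1);
  return len;
}
```

Feeding oversized strings through the test suite then turns every unchecked path into a crash with a message, rather than a quietly shortened string.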

kazinator 3 years ago

Example:

  #include <stdio.h>
  #include <string.h>

  #define strlcpy(dst, src, size) ((size_t) snprintf(dst, size, "%s", src))

  size_t (strlcpy)(char *dst, const char *src, size_t size)
  {
    return strlcpy(dst, src, size);
  }

  int main(void)
  {
    char littlebuf[8];
    strlcpy(littlebuf, "Supercalifragilisticexpealidocious", sizeof littlebuf);
    return 0;
  }


  strlcpy.c: In function ‘main’:
  strlcpy.c:4:63: warning: ‘%s’ directive output truncated writing 34 bytes into a region of size 8 [-Wformat-truncation=]
   #define strlcpy(dst, src, size) ((size_t) snprintf(dst, size, "%s", src))
                                                               ^
  strlcpy.c:14:22:
     strlcpy(littlebuf, "Supercalifragilisticexpealidocious", sizeof littlebuf);
                      ~
  strlcpy.c:14:3: note: in expansion of macro ‘strlcpy’
     strlcpy(littlebuf, "Supercalifragilisticexpealidocious", sizeof littlebuf);
   ^~~~~~~
  strlcpy.c:4:34: note: ‘snprintf’ output 35 bytes into a destination of size 8
   #define strlcpy(dst, src, size) ((size_t) snprintf(dst, size, "%s", src))
                                   ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  strlcpy.c:14:3: note: in expansion of macro ‘strlcpy’
     strlcpy(littlebuf, "Supercalifragilisticexpealidocious", sizeof littlebuf);
     ^~~~~~~

If glibc doesn't do something in its header file so that we get similar diagnostics for its strlcpy, one can argue that switching from a macro like this to the library function is detrimental to the program.

xenadu02 3 years ago

There is a hierarchy of bugs involved here. Memory safety is a much more serious class of problem. Obstinately refusing to improve the status quo because the improvement doesn't solve all problems is just plain bad engineering. Doubly so in this case, where the existing "n" variants of the string functions are massive foot guns.