Comment by nayuki

1 day ago

The problem is that after calling scanf(), the number of variables that actually got assigned varies at run time. For example:

    int x, y, z;
    int n = scanf("%d %d %d", &x, &y, &z);

At compile time, you can make no inferences about which of x, y, and z are defined, because that depends on the returned value n. There are several ways a language design could go from here.
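To make the dependence on n concrete, here is a minimal sketch. It uses sscanf() on a string instead of scanf() on stdin so it is easy to try, and it stores the values in an array rather than x, y, z. The helper names `read_up_to_three` and `sum_assigned` are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: parses up to three ints from `input` and returns
   how many were actually assigned. sscanf returns EOF (negative) when the
   input ends before the first conversion, so that case is mapped to 0. */
int read_up_to_three(const char *input, int out[3]) {
    int n = sscanf(input, "%d %d %d", &out[0], &out[1], &out[2]);
    return n < 0 ? 0 : n;
}

/* Reads only the elements that sscanf reported as assigned. */
int sum_assigned(const char *input) {
    int v[3];  /* deliberately left uninitialized */
    int n = read_up_to_three(input, v);
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += v[i];  /* v[i] is defined only for i < n */
    }
    return sum;
}
```

Only the run-time value n tells you that v[0] through v[n-1] are safe to read, which is exactly the information a compiler does not have.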

One is to insist on definite assignment: if the compiler cannot prove that all of them are always assigned, it treats them as "possibly undefined" and reports an error.

Another way is to avoid passing references and instead allow multiple returns, like Python (this is pseudocode):

    x, y, z = scanf("%d %d %d")

In that case, if the hypothetical `scanf()` returns a tuple with fewer or more than 3 elements, then the unpacking will fail at run time and crash exactly at that line.

Another way is like Java, which insists that the return value is a scalar, so it can't do what C and Python can do. This can be painful for the programmer, of course.

I interpret "don't allow uninitialized locals when declared" as meaning that this call:

    int n = scanf("%d %d %d", &x, &y, &z);

would be caught, because it takes references to uninitialized variables. To be allowed, the programmer would have to initialize the variables beforehand.
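A rough sketch of what "initialize beforehand" could look like, again with sscanf() so it can be tried on a string. The helper name `parse_with_defaults` and the default of 0 are assumptions for illustration, not anything from the proposal.

```c
#include <stdio.h>

/* Hypothetical helper: every out-variable gets a default before the call,
   so each one holds a defined value no matter what sscanf returns.
   sscanf only writes to an argument after a successful conversion,
   so the unassigned ones keep their default. */
int parse_with_defaults(const char *input, int *x, int *y, int *z) {
    *x = 0;
    *y = 0;
    *z = 0;
    return sscanf(input, "%d %d %d", x, y, z);
}
```

With the input "5 6", the call returns 2 and leaves z at its default of 0 instead of an indeterminate value. The return value still has to be checked to tell a parsed 0 apart from a default 0.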

  • Then people would complain about the wasteful initialisation of out-params. Foolishly, perhaps.

    • I think it would make sense to have a keyword that permits unsafe instantiation specifically for the edge cases where initialization is too expensive. But I think it makes sense for the lazy case to be a little bit safer.

The idea is that ASAN would replace scanf with a function that does additional bookkeeping when writing to whatever arbitrary memory location the inputs dictate at runtime.

It's probably what the PR resolving the issue I linked to does, though I didn't check.