Comment by thaumasiotes

5 years ago

> I assume that's where the slowdown comes from here - scanf needs to implement a ton of features, some of which need the input length, and the implementor expected it to be run on short strings.

I didn't get that impression. It sounded like the slowdown comes from the fact that someone expected sscanf to terminate when all directives were successfully matched, whereas it actually terminates when either (1) the input is exhausted; or (2) a directive fails. There is no expectation that you run sscanf on short strings; it works just as well on long ones. The expectation is that you're intentionally trying to read all of the input you have. (This expectation makes a little more sense for scanf than it does for sscanf.)

The scanf man page isn't very clear, but it looks to me like replacing `sscanf("%d", ...)` with `sscanf("%d\0", ...)` would solve the problem. "%d" will parse an integer and then dutifully read and discard the rest of the input. "%d\0" will parse an integer and immediately fail to match '\0', forcing a termination.

EDIT: on my xubuntu install, scanf("%d") does not clear STDIN when it's called, which conflicts with my interpretation here.
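A small check in the spirit of that EDIT, as a sketch only: the `%n` directive reports how many characters were consumed, which shows where parsing stopped. `parse_leading_int` is a name made up for this example, and the result says nothing about how much work the library does internally before it starts matching.

```c
#include <stdio.h>

/* Hypothetical helper: parse one leading integer and report how many
   characters of the input the directives consumed.
   Returns sscanf's return value (the number of assigned items). */
static int parse_leading_int(const char *input, int *value, int *consumed)
{
    *consumed = 0;
    /* %n stores the count of characters read so far; it is not counted
       as an assigned item in the return value. */
    return sscanf(input, "%d%n", value, consumed);
}
```

Called on `"42 trailing text"`, this should return 1 with `*value == 42` and `*consumed == 2`, i.e. matching stops after the integer rather than running to the end of the input.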

No it would not. Think about what the function would see as its format string in both cases.
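To make that concrete, here is a minimal sketch; `same_format_as_seen_by_callee` is a made-up name for illustration. A format string is an ordinary NUL-terminated C string, so the callee stops reading at the first `'\0'` and never sees anything the literal contains after it.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if two format strings are
   indistinguishable to a function that receives them as
   NUL-terminated C strings, 0 otherwise. */
static int same_format_as_seen_by_callee(const char *a, const char *b)
{
    /* Both calls stop at the first '\0', exactly as sscanf would. */
    return strlen(a) == strlen(b) && strcmp(a, b) == 0;
}
```

`same_format_as_seen_by_callee("%d", "%d\0")` returns 1. The literal `"%d\0"` occupies one more byte of storage than `"%d"`, but `strlen()` reports 2 for both.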

The root cause here isn't formatting or scanned items. It is C library implementations that implement the "s" versions of these functions by turning the input string into a nonce FILE object on every call, which requires an initial call to strlen() to set up the end-of-read-buffer point. (C libraries do not have to work this way. Neither P.J. Plauger's Standard C library nor mine implements sscanf() this way. I haven't checked Borland's or Watcom's.)

See https://news.ycombinator.com/item?id=24460852 .
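A rough model of the cost, not any real library's code: the functions below are hypothetical and only imitate the pattern described above, where each call measures the whole remaining string before parsing a single item. `scanned` accumulates the bytes that `strlen()` walks, so the hidden work becomes visible.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an sscanf-like call: measure the remaining
   input first (as setting up a temporary FILE would), then parse one
   integer. Returns the position after the integer, or NULL on failure. */
static const char *model_parse_int(const char *s, long *out, size_t *scanned)
{
    char *end;
    *scanned += strlen(s);        /* the hidden O(n) step on every call */
    *out = strtol(s, &end, 10);   /* the actual parse touches only a few bytes */
    return end == s ? NULL : end;
}

/* Parse every integer in s; returns how many were parsed and stores the
   total number of bytes walked by strlen() in *scanned. */
static size_t model_parse_all(const char *s, size_t *scanned)
{
    size_t count = 0;
    long value;
    *scanned = 0;
    while (*s) {
        const char *next = model_parse_int(s, &value, scanned);
        if (!next)
            break;                /* no integer at this position */
        s = next;
        count++;
    }
    return count;
}
```

For `"1 2 3 4"` (7 bytes) this parses 4 integers while `strlen()` walks 7 + 6 + 4 + 2 = 19 bytes in total. The total grows roughly quadratically with input length, whereas calling `strtol()` in a loop without the length measurement touches each byte only once.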

  • Yes, it looks that way. On the unix/linux side of things, glibc also implements sscanf() by converting the string to a FILE* object, as does the OpenBSD implementation.

    It looks like this approach is taken by the majority of sscanf() implementations!

    Honestly, I would not have expected sscanf() to implicitly call strlen() on every call.