Comment by eridius

8 years ago

Ragel shares part of the blame. Why did it use a strict equality check when it could have trivially done a >=?

12 comments

eridius

Even a >= check would have been suboptimal. Rather than either of

    /* generated code */
    if ( ++p == pe )
        goto _test_eof;

    /* generated code */
    if ( ++p >= pe )
        goto _test_eof;

they should have had

    /* generated code */
    if ( ++p == pe )
        goto _test_eof;
    assert(p < pe);

since having servers core dumping would have drawn attention to the bug in a way that counting one byte too many and then hitting _test_eof would not.
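To make the difference between the three checks concrete, here is a minimal sketch. It is not Ragel output: `scan_stop_offset`, `check_mode`, and the extra increment on the last byte are hypothetical stand-ins for a scanner whose hand-written code bumps `p` once more than the generated code expects. The buffer is deliberately larger than the logical input so the demo itself never leaves the allocation.

```c
#include <stddef.h>

enum check_mode { CHECK_EQ, CHECK_GE, CHECK_EQ_THEN_ASSERT };

/* Hypothetical scanner loop. buf[0..cap) is the whole allocation and
   pe = buf + len is the logical end of input (len <= cap is assumed).
   Returns the offset at which scanning stops; *tripped is set where a
   real assert(p < pe) would abort the process. */
static size_t scan_stop_offset(const char *buf, size_t len, size_t cap,
                               enum check_mode mode, int *tripped)
{
    const char *p = buf;
    const char *pe = buf + len;
    const char *limit = buf + cap;

    *tripped = 0;
    while (limit - p >= 2) {     /* room for two increments, so the demo stays in bounds */
        if (p + 1 == pe)
            ++p;                 /* the assumed bug: an extra, unchecked increment */
        ++p;
        if (mode == CHECK_GE ? (p >= pe) : (p == pe))
            break;               /* stands in for goto _test_eof */
        if (mode == CHECK_EQ_THEN_ASSERT && p > pe) {
            *tripped = 1;        /* stands in for assert(p < pe) failing */
            break;
        }
    }
    return (size_t)(p - buf);
}
```

With 8 bytes of input in a 16-byte buffer, the == variant runs on to offset 15, the >= variant stops quietly at offset 9 (one byte too far), and the third variant flags the overrun at offset 9 instead of hiding it.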

Scaevolus 8 years ago
Some assert() macros are disabled on release builds, so that's not exactly safe either.
- cperciva 8 years ago
  
  True, but I'd hope that NDEBUG is now widely recognized as being a horrible misfeature.
  
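For anyone who wants the check to survive release builds, one common workaround is a check macro that does not look at NDEBUG at all. A rough sketch, assuming hypothetical names `ALWAYS_CHECK` and `advance` (neither comes from Ragel or nginx), and assuming the caller passes a `p` that is still inside the allocation:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical always-on check: unlike assert(), it is not compiled
   out when NDEBUG is defined. */
#define ALWAYS_CHECK(cond)                                   \
    do {                                                     \
        if (!(cond)) {                                       \
            fprintf(stderr, "%s:%d: check failed: %s\n",     \
                    __FILE__, __LINE__, #cond);              \
            abort();                                         \
        }                                                    \
    } while (0)

/* Advances p by one byte. Returns NULL at end of input, and aborts
   if p has somehow moved past pe. */
static const char *advance(const char *p, const char *pe)
{
    if (++p == pe)
        return NULL;          /* the generated code would goto _test_eof here */
    ALWAYS_CHECK(p < pe);     /* still active in release builds */
    return p;
}
```

The cost is one extra comparison per byte, which is the trade-off people usually cite for disabling asserts in the first place.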

arestor 8 years ago

It's C. If you have an array, you may only compare pointers up to one element past the last. Everything else is undefined behavior. So a compiler may just "optimize" your >= to ==.

eridius 8 years ago
No it won't. It's using pointers, not array indices. The compiler has no possible way of knowing that `pe` is the one-past-the-end address.
- arestor 8 years ago
  
  It's still UB. The array could potentially be at the end of the address space...
  
vbernat 8 years ago

It's not an array. It's inside a large buffer allocated by nginx.

pepve 8 years ago

That's a great defensive technique. But even when you do that, the underlying bug should still be fixed. I don't think the equality operator is the underlying bug.

Consecutive pointer increments without a bounds check in between sounds like a bug to me. But I don't really know Ragel, and perhaps the compiler doesn't have enough information to determine this is what's happening.

zzzcpan 8 years ago

Ragel is a low-level tool; it doesn't operate on bounds-checkable abstractions. But it does allow you to specify how to get the character, and you can do bounds checking there if you need it.
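As a sketch of what such a checked fetch might look like on the C side (the `checked_get` name and the -1 sentinel are assumptions for illustration, not part of Ragel; as far as I know, the hook for plugging in an expression like this is Ragel's `getkey` statement):

```c
/* Hypothetical checked fetch: returns the byte at p as a non-negative
   value, or -1 if p is at or past pe. Assumes p and pe point into the
   same allocation, so the comparison is well defined. */
static int checked_get(const char *p, const char *pe)
{
    if (p >= pe)
        return -1;                /* out of input: caller treats this as EOF or error */
    return (unsigned char)*p;     /* avoid sign extension on platforms where char is signed */
}
```

A sentinel outside the 0..255 range means an overrun shows up as an input the state machine has no transition for, rather than as a read of whatever happens to sit after `pe`.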

andrewf 8 years ago

Speculation: people may want to use Ragel-generated code in C++, where strict equality checks are idiomatic. p may be an iterator instead of a raw pointer. >= can be absent, or slower than ==.