Comment by gumby

5 years ago

> but without the downsides of the costly C++ memory deallocation on stack unwinding.

I.e. I don’t care about restoring the program to a known state when handling an error (memory deallocation is just one case of processing unwind blocks; locks need releasing, file handles need returning to the kernel, etc.). This really only makes sense when your error “handling” is merely printing a user-friendly error message and exiting.

(I'm the person he was quoting in the article.)

When I use setjmp/longjmp error handling I almost always want abort semantics but at the library level rather than at the OS process level. [1] Where applicable it's the simplest, most robust model I know. You have a context object that owns all your resources (memory blocks, file handles, etc) which is what lets you do simple and unified clean-up rather than fine-grained scoped clean-up in the manner of RAII or defer. You can see an example in tcc here:

https://github.com/LuaDist/tcc/blob/255ba0e8e34f999ee840407c...

https://github.com/LuaDist/tcc/blob/255ba0e8e34f999ee840407c...

[1] It goes without saying that a well-written library intended for general use is never allowed to kill the process. This presents a conundrum in writing systems-level C libraries. What do you do if something like malloc fails in a deep call stack within the library? Systems-level libraries need to support user-provided allocation functions which often work out of fixed-size buffers so failure isn't a fatal error from the application's point of view. You'd also want to use this kind of thing for non-debug assert failures for your library's internal invariants.

This style of setjmp/longjmp error handling works well for such cases since you can basically write the equivalent of xmalloc but scoped to the library boundary; you don't have to add hand-written error propagation to all your library functions just because a downstream function might have such a failure. I'm not doing this as a work-around for a lack of finally blocks, RAII or defer statements. It's fundamentally about solving the problem at a different granularity by erecting a process-like boundary around a library.

  • See my response to a parallel comment from dannas.

    I can see some minor corner cases where it could be worthwhile but the mental overhead isn't worth it.

    I've written plenty of realtime code but spending a lot of time on the code running in the interrupt handlers is mentally exhausting and error prone; I do that when I have no choice. Likewise I've written a lot of assembly code but it's been decades since I wrote a whole program that way -- I don't have enough fingers to keep track of all the labels and call paths.

    E.g. just because C++ has pointers doesn't mean I use them very often. >90% of the cases can be references instead.

More context to that quote:

> Per Vognsen discusses how to do coarse-grained error handling in C using setjmp/longjmp. The use cases there were arena allocations and deeply nested recursive parsers. It’s very similar to how C++ does exception handling, but without the downsides of the costly C++ memory deallocation on stack unwinding.

I have never used setjmp/longjmp myself. And I agree with you that my first instinct would be to use it in a similar manner to many GUI programs: they have a catch statement in the message loop that shows a dialog box for the thrown exception. You just jump to a point where you print a user-friendly error message and exit.

But I still can imagine use cases where you've isolated all other side effects (locks, shared memory, open file handles) and are just dealing with a buffer that you parse. Has anyone used setjmp/longjmp for that around here?

Given your many years in the field and Cygnus background I guess you've used it a few times? Do you happen to have any horror stories related to it? :-)

  • I hate setjmp/longjmp and have never needed it in production code.

    Think about how it works: it copies the CPU state (basically the registers: program counter, stack pointer, etc). When you longjmp back the CPU is set back to the call state, but any side effects in memory etc are unchanged. You go back in time yet the consequences of prior execution are still lying around and need to be cleaned up. It's as if you woke up, drove to work, then longjmped yourself back home -- but your car was still at work, your laptop open etc.

    Sure, if you're super careful you can make sure you handle the side effects of what happened while the code was running, but if you forget one you have a problem. Why not use the language features designed to take care of those problems for you?

    This sort of works in a pool-based memory allocator.

    The failures happen three ways: one is that you forget something and so you have a leak. The second is that you haven't registered usage properly and so have a dangling pointer. Third, by going back in time you lose access to the results of prior and/or partial computation.

    If you use this for a library, and everything between the setjmp and the longjmp happens within a single invocation, you can sometimes get away with it. But in something like a memory allocator, where the user makes successive calls, unless you force the user to do extra work you can't be sure what dependencies on the memory might exist. If your library uses callbacks you can be in a world of hurt.

    Trying to keep track of all those fiddly details is hard. C++ does it automatically, at the cost of sometimes doing more work than necessary (e.g. deallocating two blocks individually rather than in one swoop -- though that language has an allocator mechanism precisely to avoid this problem). The point is the programmer doesn't have to remember anything to make it work.